Model Context Protocol
A Model Context Protocol (MCP) server for the 1000 Genomes 30x on GRCh38 dataset, sequenced and aligned by the New York Genome Center, is available for LLM integrations.
Key Features
- real-time access to 138 044 724 unique variants and about 442 billion individual genotypes
- variant, sample, and genotype selection based on coordinates, annotations, zygosity
- filtering by VEP, ClinVar, gnomAD AF and AlphaMissense annotations
- filtering by inheritance model (de novo, heterozygous dominant, homozygous recessive)
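To make the inheritance-model filters more concrete, here is a minimal Python sketch of how the three models could be checked for one variant in a parent-child trio. The function name, the 0/1/2 alternate-allele encoding, and the simplified rules are assumptions made for illustration; they are not taken from the server's implementation, which may define these models differently.

```python
# Illustrative only: simplified trio rules, NOT the server's actual logic.
# Genotypes are encoded as the number of alternate alleles:
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate.

def matching_models(child, mother, father, affected_parents=()):
    """Return the inheritance models consistent with one variant in a trio.

    The child is assumed to be affected. `affected_parents` is a subset of
    {"mother", "father"} naming parents who are also affected.
    """
    parents = {"mother": mother, "father": father}
    models = []

    # De novo: child carries the alternate allele, both unaffected parents do not.
    if child >= 1 and mother == 0 and father == 0 and not affected_parents:
        models.append("de novo")

    # Homozygous recessive: child is hom-alt, both unaffected parents are het carriers.
    if child == 2 and mother == 1 and father == 1 and not affected_parents:
        models.append("homozygous recessive")

    # Heterozygous dominant: child and every affected parent are het,
    # every unaffected parent is hom-ref.
    if child == 1 and affected_parents:
        affected_het = all(parents[p] == 1 for p in parents if p in affected_parents)
        unaffected_ref = all(parents[p] == 0 for p in parents if p not in affected_parents)
        if affected_het and unaffected_ref:
            models.append("heterozygous dominant")

    return models
```

For example, `matching_models(2, 1, 1)` reports a homozygous recessive pattern, while `matching_models(1, 1, 0, affected_parents=("mother",))` reports a heterozygous dominant one.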
Remote MCP
Remote MCP service is available online via Streamable HTTP:
- http://db.dnaerys.org:80/mcp
- https://db.dnaerys.org:443/mcp
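
As a rough sketch of what talking to the Streamable HTTP endpoint looks like, the Python snippet below builds the JSON-RPC `initialize` request that opens an MCP session, using only the standard library. The protocol version string and the client name and version are assumptions chosen for illustration, and the helper names are hypothetical; in practice an MCP client or SDK handles this handshake.

```python
import json
import urllib.request

MCP_URL = "https://db.dnaerys.org:443/mcp"  # endpoint listed above


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` request for Streamable HTTP."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed; the server may negotiate another
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return (status, content type, raw body text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.headers.get("Content-Type"), resp.read().decode("utf-8")
```

Calling `send(build_initialize_request())` performs the first step of the handshake. A full client would then send a `notifications/initialized` message, echo any `Mcp-Session-Id` header returned by the server, and only then issue calls such as `tools/list`.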
Source code
The MCP server can also be run locally over the stdio transport, started as a subprocess by MCP clients (like Claude Desktop or Goose).
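
For stdio use, MCP clients typically need to be told which command to launch. The Python sketch below generates an entry in the `mcpServers` shape used by Claude Desktop's `claude_desktop_config.json`. The server name, command path, and arguments are placeholders, not the project's real launch command; take the actual values from the installation instructions.

```python
import json


def stdio_server_entry(name: str, command: str, args: list[str]) -> dict:
    """Build an `mcpServers` entry telling a client how to spawn a stdio server."""
    return {"mcpServers": {name: {"command": command, "args": args}}}


# Placeholders only: replace with the launch command from the installation docs.
config = stdio_server_entry(
    name="dnaerys-kgp",                   # hypothetical label shown in the client
    command="/path/to/server-launcher",   # hypothetical executable
    args=[],                              # hypothetical arguments
)

print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the client's configuration file; the client then starts the server as a subprocess and exchanges MCP messages over its stdin and stdout.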
Source code + installation:
Examples
Answers below are from Sonnet 4.5: some from the multi-agent Research system, some with extended thinking mode, and some from a single-agent system in normal mode.
“Identify potential modifier variants for well-known pathogenic alleles in TTN - variants that consistently co-occur in the same haplotype block with pathogenic alleles and may alter severity or penetrance. Conduct research for pathogenic alleles documented in the literature. Use KGP dataset of healthy individuals to find potential modifier variants. Start with 100kb for "the same haplotype block" definition, then extend if required. Evaluate statistical significance for the best modifier candidates found. No initial constraints for modifier types.”
- it feels a bit unreal how easily this thing can pull not entirely nonsensical events from a dataset with p = 2.29×10⁻¹³... which makes one wonder what else is possible with a proper study design, specialised disease and control cohorts, and a bit more dedication
- same task for KCNH2, SCN5A, CACNA1C, LMNA, SPAST and BMPR2
or
“Which regions in POLR2A are most likely disease-critical, with strong purifying selection, based on available variation patterns across functional domains in KGP ? Do statistical evaluation.”
or
“In what cardiac related genes, e.g. ion channels, variants in KGP dataset near catalytic residues or ligand-binding pockets show strong depletion compared to flanking residues (±20 amino acids) ?”
- results might be some
or
“Are there patterns of variation in KGP dataset that suggest digenic or oligogenic interactions for Bardet-Biedl syndrome ? Check variety of combinations and zygosity patterns.”
or
“Which variants in the HBB gene are unexpectedly tolerated in the KGP dataset with at least several annotation sources in agreement with regard to their expected pathogenicity ?”
or
“Rank all rare KGP variants in genes associated with arrhythmia disorder by their expected clinical relevance, not by predicted pathogenicity alone. Find affected individuals with highest clinical priority variants.”
- results might be some