1000 Genomes Project Dataset
The 1000 Genomes 30x on GRCh38 dataset, sequenced and aligned by the New York Genome Center, is available for querying via gRPC, gRPC-web and the Model Context Protocol (MCP):
- gRPC
- http://db.dnaerys.org:80
- https://db.dnaerys.org:443
- gRPC-web
- http://db.dnaerys.org:80
- https://db.dnaerys.org:443
- Model Context Protocol
- http://db.dnaerys.org:80/mcp
- https://db.dnaerys.org:443/mcp
The dataset contains 3202 samples with 138 044 723 unique variants: 2504 unrelated samples from the phase three panel and an additional 698 samples related to samples in the 2504 panel (1598 males and 1604 females in total). For details and attribution, see the references.
This dataset was annotated by:
- Ensembl Variant Effect Predictor (VEP) software, developed by the Ensembl project at EMBL-EBI and the Wellcome Sanger Institute. Data was used as provided.
- AlphaMissense annotations, developed by Google DeepMind and EMBL-EBI, licensed under the Creative Commons Attribution 4.0 International License. Data was used as provided.
- gnomAD AF annotations, derived from the Genome Aggregation Database, developed by the gnomAD consortium and the Broad Institute. Data have been used as provided in VEP cache in accordance with their terms of use (CC0 Public Domain Dedication).
- ClinVar annotations, derived from the public archive of interpretations of clinically relevant variants maintained by the National Center for Biotechnology Information (NCBI). The information is publicly available for use without restriction, as provided by the NCBI's data use policy. Data was used as provided.
- Sequence Ontology variant consequence terms from the Sequence Ontology (SO), which are available under the permissive CC BY 4.0 license. The SO is developed as an open source project for the genomics community. Data was used as provided.
Annotation versions:
- VEP="v115.2"
- cache="115_GRCh38" ensembl=115.266b84d ensembl-compara=115.ae48a7a ensembl-funcgen=115.57f7061 ensembl-io=115.25061d3 ensembl-variation=115.b7c2637 1000genomes="phase3" ClinVar="202502" assembly="GRCh38.p14" gencode="GENCODE 49" genebuild="GENCODE49" gnomADe="v4.1" gnomADg="v4.1"
- AlphaMissense thresholds
- 'Likely benign' if score < 0.34, 'Likely pathogenic' if score > 0.564, 'ambiguous' otherwise;
- see doi.org/10.1126/science.adg7492
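The AlphaMissense thresholds above can be written as a small function. This is a minimal sketch: the function name is hypothetical, the labels are the ones quoted in the list, and the behaviour at the exact boundary values follows the strict inequalities as stated (0.34 and 0.564 both fall into 'ambiguous').

```python
def alphamissense_class(score: float) -> str:
    """Map an AlphaMissense score to the class labels quoted above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 0.34:
        return "Likely benign"
    if score > 0.564:
        return "Likely pathogenic"
    # 0.34 <= score <= 0.564
    return "ambiguous"


for s in (0.10, 0.34, 0.564, 0.90):
    print(s, alphamissense_class(s))
```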
Note
db.dnaerys.org is located Down Under, so consider round-trip times when choosing timeouts
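One rough way to get a feel for the round-trip cost from your location is to time a TCP connection to the service. The sketch below is an illustration only: the helper name is hypothetical, the measurement includes DNS resolution, and it says nothing about query execution time on the server.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int, timeout_s: float = 10.0) -> float:
    """Return the milliseconds taken to open (and close) one TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout_s):
        pass  # connection established; nothing is sent
    return (time.perf_counter() - start) * 1000.0


# Usage (needs network access, so it is left commented out):
# print(f"{tcp_connect_ms('db.dnaerys.org', 443):.0f} ms")
```

A TLS handshake and each request add further round trips on top of this figure, so batching work into fewer, larger requests is likely to help more than tuning individual calls.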
Model Context Protocol
OneKGPd is a Model Context Protocol server for the 1000 Genomes Project dataset. Our favorite style of asking genomes questions!
- all details and instructions: https://github.com/dnaerys/onekgpd-mcp
- remote MCP service via Streamable HTTP:
- http://db.dnaerys.org:80/mcp
- https://db.dnaerys.org:443/mcp
- source code + installation options
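Most users will point an existing MCP client at the URL above. For readers curious about what travels over Streamable HTTP, the sketch below builds (but does not send) the first JSON-RPC message of an MCP session. The protocol revision string and the client name are assumptions chosen for illustration; the server may negotiate a different revision.

```python
import json
import urllib.request

MCP_URL = "https://db.dnaerys.org:443/mcp"  # endpoint listed above


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build an MCP `initialize` request without sending it."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: a revision that defines Streamable HTTP.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_initialize_request()
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req, timeout=30)` would send it; the reply may arrive either as a JSON body or as a server-sent event stream, so a real client has to handle both.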
gRPC
- examples below use gRPCurl as a client
- any gRPC client should work in principle
- API declarations have to be provided to a client
- examples below use declarations from gRPC API v1.17.2, which are stored in dnaerys_1.17.2.proto
- homozygous & heterozygous variants in TP53 from all samples, limiting the response to 10 variants
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "hom":"true", "het":"true", "limit":"10", "assembly":"GRCh38"}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion
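For scripted use, the same call can be driven from Python by shelling out to gRPCurl. This is a sketch, not an official client: the helper names are hypothetical, and it assumes `grpcurl` is on `PATH` and `dnaerys_1.17.2.proto` is in the working directory. The payload is copied from the TP53 example above.

```python
from __future__ import annotations

import json
import shlex
import subprocess

PROTO = "dnaerys_1.17.2.proto"
HOST = "db.dnaerys.org:443"
SERVICE = "org.dnaerys.cluster.grpc.DnaerysService"


def grpcurl_argv(method: str, payload: dict) -> list[str]:
    """Build the grpcurl argument list for one call."""
    return ["grpcurl", "-proto", PROTO, "-d", json.dumps(payload),
            HOST, f"{SERVICE}/{method}"]


def call(method: str, payload: dict, timeout_s: float = 120.0) -> str:
    """Run grpcurl and return its stdout; raises if grpcurl exits non-zero."""
    result = subprocess.run(grpcurl_argv(method, payload), capture_output=True,
                            text=True, check=True, timeout=timeout_s)
    return result.stdout


tp53 = {"chr": "17", "start": "7661779", "end": "7687546",
        "hom": "true", "het": "true", "limit": "10", "assembly": "GRCh38"}

# Print the command instead of running it, so the sketch works offline.
print(shlex.join(grpcurl_argv("SelectVariantsInRegion", tp53)))
```

Passing the arguments as a list avoids shell quoting problems with the JSON payload; `call("SelectVariantsInRegion", tp53)` would execute the request.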
- pathogenic variants in TP53
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "hom":"true", "het":"true", "ann": {"clinsgn":["PATHOGENIC"]}, "assembly":"GRCh38"}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion
- high impact heterozygous variants in transcripts in TP53
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "het":"true", "ann": {"feature_type":["TRANSCRIPT"], "impact":["HIGH"]}, "assembly":"GRCh38"}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion
- pathogenic heterozygous variants in sample (NA10842) in TP53
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "het":"true", "samples":"NA10842", "ann": {"clinsgn":["PATHOGENIC"]}, "assembly":"GRCh38"}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegionInSamples
- Select samples with pathogenic heterozygous variants in transcripts in TP53 with gnomAD exomes AF < 0.0001
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "het":"true", "ann": {"feature_type":["TRANSCRIPT"], "clinsgn":["PATHOGENIC"], "gnomad_exomes_af_lt":"0.0001"}, "assembly":"GRCh38"}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectSamplesInRegion
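gRPCurl prints each response message as a JSON document, so a call that returns several messages produces several JSON objects back to back rather than one JSON array. Whether a given method streams is defined in the .proto file and is not assumed here; the sketch below handles both cases. The function name and the sample keys are hypothetical and do not reflect the real response schema.

```python
import json


def iter_json_objects(text: str):
    """Yield each top-level JSON value from concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # Skip whitespace between documents.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Placeholder output: two pretty-printed messages, one after another.
sample = '{\n  "n": 1\n}\n{\n  "n": 2\n}\n'
print([message["n"] for message in iter_json_objects(sample)])  # prints [1, 2]
```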
- De Novo
- all de novo variants in chromosome 1 in a trio classified as likely pathogenic by AlphaMissense
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"parent1":"HG00418", "parent2":"HG00419", "proband":"HG00420", "chr":"1", "start":"1", "end":"248956422", "ann": {"am_class":"AM_LIKELY_PATHOGENIC"}}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectDeNovo
- Homozygous Recessive
- all homozygous recessive variants in chromosome 1 in a trio classified as likely pathogenic by AlphaMissense
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"unaffected_parent1":"HG00418", "unaffected_parent2":"HG00419", "affected_child":"HG00420", "chr":"1", "start":"1", "end":"248956422", "ann": {"am_class":"AM_LIKELY_PATHOGENIC"}}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectHomRecessive
- Heterozygous Dominant
- all heterozygous dominant variants in chromosome 1 in a trio classified as likely pathogenic by AlphaMissense
grpcurl \
-proto dnaerys_1.17.2.proto \
-d '{"affected_parent":"HG00418", "unaffected_parent":"HG00419", "affected_child":"HG00420", "chr":"1", "start":"1", "end":"248956422", "ann": {"am_class":"AM_LIKELY_PATHOGENIC"}}' \
db.dnaerys.org:443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectHetDominant
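The three trio queries above differ only in how the family members are labelled. The sketch below builds all three payloads from one trio, using exactly the field names shown in the examples; the function name is hypothetical, and which parent counts as affected in the dominant model is the caller's decision, not something the data implies.

```python
from __future__ import annotations

import json

CHR1_END = "248956422"  # end coordinate used for chromosome 1 in the examples above


def trio_payloads(parent_a: str, parent_b: str, child: str,
                  am_class: str = "AM_LIKELY_PATHOGENIC") -> dict[str, dict]:
    """Return request payloads keyed by gRPC method name."""

    def region() -> dict:
        # A fresh dict per payload, so the payloads share no mutable state.
        return {"chr": "1", "start": "1", "end": CHR1_END,
                "ann": {"am_class": am_class}}

    return {
        "SelectDeNovo": {
            "parent1": parent_a, "parent2": parent_b, "proband": child,
            **region()},
        "SelectHomRecessive": {
            "unaffected_parent1": parent_a, "unaffected_parent2": parent_b,
            "affected_child": child, **region()},
        # Assumption for illustration: parent_a is the affected parent.
        "SelectHetDominant": {
            "affected_parent": parent_a, "unaffected_parent": parent_b,
            "affected_child": child, **region()},
    }


for method, payload in trio_payloads("HG00418", "HG00419", "HG00420").items():
    print(method, json.dumps(payload))
```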
gRPC-web
A gRPC-web proxy (Envoy) is available for web applications and exposes the full gRPC API:
- gRPC-web: http://db.dnaerys.org:80
- gRPC-web+TLS: https://db.dnaerys.org:443
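Web applications would normally use a generated gRPC-web client, but the wire format is simple enough to sketch. In gRPC-web, each message is sent as a 5-byte prefix (one flag byte, then a 4-byte big-endian length) followed by the payload, and the trailers arrive as a final frame with the high bit of the flag byte set. The helper names below are hypothetical, and the payload is a placeholder: a real request body would be the protobuf-serialized message defined in the .proto file.

```python
import struct


def frame_message(payload: bytes) -> bytes:
    """Prefix an uncompressed message with its gRPC-web frame header."""
    return struct.pack(">BI", 0, len(payload)) + payload


def split_frames(data: bytes) -> list:
    """Split a response body into (is_trailer, payload) tuples."""
    frames = []
    pos = 0
    while pos < len(data):
        flags, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        frames.append((bool(flags & 0x80), data[pos:pos + length]))
        pos += length
    return frames


body = frame_message(b"abc")
print(body)  # prints b'\x00\x00\x00\x00\x03abc'
```

Such a body would be sent with `POST` to a path of the form `/org.dnaerys.cluster.grpc.DnaerysService/<Method>` with `Content-Type: application/grpc-web+proto`; that path shape is the standard gRPC convention and is assumed, not confirmed, for this proxy.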
1000 Genomes Project references
- https://www.internationalgenome.org/data-portal/data-collection/30x-grch38
- https://www.nature.com/articles/nature15393
- https://academic.oup.com/nar/article/48/D1/D941/5580898
Terms and Conditions
Disclaimer of Warranties
The Services and data provided on dnaerys.org and db.dnaerys.org (“the Services”) are supplied on an “AS IS” and “AS AVAILABLE” basis without warranties of any kind. We do not warrant that the Services will be uninterrupted, error-free, secure, accurate, or complete. All information is provided for informational purposes only, and while reasonable efforts are made to ensure accuracy, we do not guarantee the correctness, completeness, reliability, or timeliness of any data or content made available through the Services.
Limitation of Liability
To the fullest extent permitted by applicable law, in no event shall Dnaerys Pty Ltd, its directors, employees, partners, agents, suppliers, or affiliates be liable for any direct, indirect, incidental, special, exemplary, consequential, or punitive damages, including without limitation loss of profits, business interruption, loss of data, or other losses, arising out of or in connection with:
- your access to or use of, or inability to access or use, the Services;
- any errors, omissions, inaccuracies, or delays in any data or content;
- any results obtained, decisions made, or actions taken based on the Services;
- any other matter relating to the Services.
You acknowledge that your use of the Services is at your sole risk.
No Medical or Clinical Advice
Data and information provided through the Services do not constitute clinical, medical, diagnostic, therapeutic, or patient-care guidance. Users are responsible for independently verifying scientific results and drawing their own conclusions.
No Professional or Guaranteed Results
No guarantees are made regarding accuracy, completeness, performance, or suitability of the Services for a particular purpose. Users bear full responsibility for any decisions or actions taken based on data obtained through the Services.
Service Interruptions and Maintenance
We reserve the right to modify, suspend, or discontinue the Services, in whole or in part, at any time without notice. We shall not be liable for any modification, suspension, downtime, or discontinuation of the Services.
Third-party data sources
Some data may originate from third-party public datasets. We are not responsible for the accuracy, completeness, availability, or licensing of external data sources. Users are responsible for complying with all applicable third-party terms and attribution requirements.