1000 Genomes Project Dataset
The 1000 Genomes 30x on GRCh38 dataset, sequenced and aligned by the New York Genome Center, is available for querying:
- Web UI: db.dnaerys.org:443
- gRPC: db.dnaerys.org:7443
- REST: db.dnaerys.org:8443
The dataset contains 3202 samples with 130 552 684 unique variants: 2504 unrelated samples from the phase three panel and an additional 698 samples related to samples in the 2504 panel (1598 males and 1604 females in total; see references).
The dataset is VEP & ClinVar annotated.
Note
The db.dnaerys.org server is located Down Under, so take round-trip times into account.
Web UI
Web interface for ad-hoc exploration: https://db.dnaerys.org
Note
The web client uses the REST API and should not be used for performance evaluation. A gRPC-Web based version is coming soon.
gRPC
- examples below use gRPCurl as a client
- any gRPC client should work in principle
- API declarations have to be provided to a client
- examples below use declarations from gRPC API which are stored in dnaerys.proto
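Every example below repeats the same proto file, host, and service prefix, so a small wrapper can cut the repetition. The following is a minimal sketch and not part of Dnaerys: the function name `dnaerys_call` and the `DNAERYS_PROTO`, `DNAERYS_HOST`, and `DNAERYS_DRY_RUN` variables are assumptions introduced here for illustration only.

```shell
# Hypothetical helper (not part of Dnaerys): wraps the grpcurl invocation
# used throughout the examples. DNAERYS_PROTO, DNAERYS_HOST and
# DNAERYS_DRY_RUN are illustrative variable names, not official settings.
dnaerys_call() {
  local method="$1" payload="$2"
  local proto="${DNAERYS_PROTO:-dnaerys.proto}"
  local host="${DNAERYS_HOST:-db.dnaerys.org:7443}"
  local target="org.dnaerys.cluster.grpc.DnaerysService/${method}"
  if [ -n "${DNAERYS_DRY_RUN:-}" ]; then
    # Print the command instead of contacting the server.
    printf "grpcurl -proto %s -d '%s' %s %s\n" "$proto" "$payload" "$host" "$target"
  else
    grpcurl -proto "$proto" -d "$payload" "$host" "$target"
  fi
}
```

With the wrapper in place, the kinship query would shrink to `dnaerys_call Kinship '{"samples": ["HG00418", "HG00419", "HG00420"]}'`. Setting `DNAERYS_DRY_RUN=1` prints the resulting grpcurl command without sending anything.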
- a variant in TP53
grpcurl \
-proto dnaerys.proto \
-d '{"chr":"17", "pos":"7662034", "alt":"A", "assembly":"GRCh38"}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/Beacon
- homozygous & heterozygous variants from TP53 from all samples, limiting the response to 10 variants
grpcurl \
-proto dnaerys.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "hom":"true", "het":"true", "limit":"10", "assembly":"GRCh38"}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion
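When querying several regions, the request body can be generated rather than typed by hand. The sketch below builds the same JSON as the TP53 example above; the field names are copied from that example, while the function name `region_request` and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: builds the JSON request body used by the
# SelectVariantsInRegion example. Field names (chr, start, end, hom, het,
# limit, assembly) come from that example; the function itself is illustrative.
region_request() {
  local chr="$1" start="$2" end="$3" limit="${4:-10}"
  printf '{"chr":"%s", "start":"%s", "end":"%s", "hom":"true", "het":"true", "limit":"%s", "assembly":"GRCh38"}\n' \
    "$chr" "$start" "$end" "$limit"
}

# TP53 region, as in the example above
region_request 17 7661779 7687546 10
```

The generated body can be piped into grpcurl with `-d @`, which reads the request from stdin, for example `region_request 17 7661779 7687546 10 | grpcurl -proto dnaerys.proto -d @ db.dnaerys.org:7443 org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion`.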
- MISSENSE SNVs in transcripts with high impact and PATHOGENIC ClinVar annotations in TP53, both homozygous & heterozygous
grpcurl \
-proto dnaerys.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "hom":"true", "het":"true", "limit":"10", "ann": {"vtypes":"SNV", "ftypes":["TRANSCRIPT"], "impact":["HIGH"], "consequences":"MISSENSE_VARIANT", "clnsgn":"PATHOGENIC"}, "assembly":"GRCh38"}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegion
- Pathogenic homozygous variants in sample (HG00447) in BRCA2
grpcurl \
-proto dnaerys.proto \
-d '{"chr":"13", "start":"32315086", "end":"32400268", "hom":"true", "samples":"HG00447", "ann": {"clnsgn":["PATHOGENIC"]}, "assembly":"GRCh38"}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectVariantsInRegionInVirtualCohort
- Select all samples with pathogenic homozygous variants in transcripts in TP53
grpcurl \
-proto dnaerys.proto \
-d '{"chr":"17", "start":"7661779", "end":"7687546", "hom":"true", "ann": {"ftypes":["TRANSCRIPT"], "clnsgn":["PATHOGENIC"]}, "assembly":"GRCh38"}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectSamplesInRegion
- Return kinship coefficients φ for all possible pairs in samples HG00418, HG00419, HG00420 (related trio)
grpcurl \
-proto dnaerys.proto \
-d '{"samples": ["HG00418", "HG00419", "HG00420"]}' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/Kinship
- De novo variants in proband HG00420 with parents HG00418 and HG00419, limiting the response to 1 variant
grpcurl \
-proto dnaerys.proto \
-d '{"parent1":"HG00418", "parent2":"HG00419", "proband":"HG00420", "limit":"1" }' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectDeNovo
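- Homozygous recessive variants in affected child HG00420 with unaffected parents HG00418 and HG00419, limiting the response to 1 variant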
grpcurl \
-proto dnaerys.proto \
-d '{"unaffected_parent1":"HG00418", "unaffected_parent2":"HG00419", "affected_child":"HG00420", "limit":"1" }' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectHomRecessive
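- Heterozygous dominant variants in affected child HG00420 with affected parent HG00418 and unaffected parent HG00419, limiting the response to 1 variant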
grpcurl \
-proto dnaerys.proto \
-d '{"affected_parent":"HG00418", "unaffected_parent":"HG00419", "affected_child":"HG00420", "limit":"1" }' \
db.dnaerys.org:7443 \
org.dnaerys.cluster.grpc.DnaerysService/SelectHetDominant
REST
Dnaerys itself does not provide a REST API
- when gRPC cannot be used (e.g. in web browser applications), an API gateway can be used as a proxy
- a Variant Store Abstraction Layer (VSAL) instance is used as a proxy in the examples below
- the REST API should not be used for any DBMS performance evaluation: requests are significantly slowed down by marshalling & unmarshalling between the two protocols in VSAL
- selecting by annotations is not supported by VSAL at the moment
- jq in the commands below is optional
- VSAL provides GA4GH Beacon v0.2 API
- NB: Beacon v0.2 coordinates are 0-based, as opposed to Dnaerys/VSAL/VCF coordinates, which are 1-based
- GA4GH Beacon v2 API is coming soon
curl "https://db.dnaerys.org:8443/vsal/beacon-v0dot2/query?chrom=17&pos=7662033&allele=A&ref=hg38" | jq
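Because the Beacon v0.2 endpoint is 0-based, a position taken from a VCF or from a Dnaerys gRPC query has to be decremented by one before it is sent. The sketch below shows that conversion for the TP53 variant used in the gRPC Beacon example (1-based position 7662034); the function name `beacon_v02_url` is an assumption introduced for illustration, and the URL layout is copied from the curl command above.

```shell
# Hypothetical helper: converts a 1-based (VCF/Dnaerys) position to the
# 0-based position expected by Beacon v0.2 and prints the query URL.
beacon_v02_url() {
  local chrom="$1" pos_1based="$2" allele="$3"
  local pos_0based=$((pos_1based - 1))
  printf 'https://db.dnaerys.org:8443/vsal/beacon-v0dot2/query?chrom=%s&pos=%s&allele=%s&ref=hg38\n' \
    "$chrom" "$pos_0based" "$allele"
}

# 1-based position from the gRPC Beacon example; the printed URL uses pos=7662033
beacon_v02_url 17 7662034 A
```

The printed URL can be passed straight to curl, for example `curl "$(beacon_v02_url 17 7662034 A)" | jq`.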
- select homozygous & heterozygous variants from TP53 from all samples, limiting the response to 10 variants
curl "https://db.dnaerys.org:8443/vsal/core/find?dataset=1000GP&chromosome=17&positionStart=7661779&positionEnd=7687546&hom=true&het=true&limit=10&asm=hg38" | jq
- select homozygous variants in sample (HG00447) in BRCA2, limiting the response to 10 variants
curl "https://db.dnaerys.org:8443/vsal/core/find?dataset=1000GP&chromosome=13&positionStart=32315086&positionEnd=32400268&hom=true&limit=10&asm=hg38&samples=HG00447" | jq
- select heterozygous variants in sample (HG00447) in BRCA2, limiting the response to 10 variants
curl "https://db.dnaerys.org:8443/vsal/core/find?dataset=1000GP&chromosome=13&positionStart=32315086&positionEnd=32400268&het=true&limit=10&asm=hg38&samples=HG00447" | jq
- select all samples with homozygous variants in TP53
curl "https://db.dnaerys.org:8443/vsal/core/find?dataset=1000GP&chromosome=17&positionStart=7661779&positionEnd=7687546&hom=true&selectSamplesByGT=true&asm=hg38" | jq
References
- https://www.internationalgenome.org/data-portal/data-collection/30x-grch38
- https://www.nature.com/articles/nature15393
- https://academic.oup.com/nar/article/48/D1/D941/5580898