• This record comes from PubMed

A corpus of GA4GH phenopackets: Case-level phenotyping for genomic diagnostics and discovery

2025 Jan 09; 6(1): 100371. [epub] 20241010

Language: English; Country: United States; Media: print-electronic

Document type: Journal Article

Grant support
P30 ES010126 NIEHS NIH HHS - United States
R01 HD103805 NICHD NIH HHS - United States
R35 HG011297 NHGRI NIH HHS - United States
U24 HG011449 NHGRI NIH HHS - United States

Links

PubMed 39394689
PubMed Central PMC11564936
DOI 10.1016/j.xhgg.2024.100371
PII: S2666-2477(24)00111-8

The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present Phenopacket Store. Phenopacket Store v.0.1.19 includes 6,668 phenopackets representing 475 Mendelian and chromosomal diseases associated with 423 genes and 3,834 unique pathogenic alleles curated from 959 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
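As a minimal sketch of how a phenopacket might serve as an input file, the Python snippet below parses a small JSON document and separates observed from excluded phenotypic features. The example record is hand-written for illustration and is not taken from Phenopacket Store; the field names (`phenotypicFeatures`, `type`, `excluded`) are assumed to follow the JSON serialization of the GA4GH Phenopacket Schema, and the helper `split_features` is a hypothetical name.

```python
import json

# Hypothetical, hand-written example; NOT a record from Phenopacket Store.
# Field names are assumed to follow the Phenopacket Schema JSON serialization.
EXAMPLE = """
{
  "id": "example-case-1",
  "subject": {"id": "individual-1", "sex": "FEMALE"},
  "phenotypicFeatures": [
    {"type": {"id": "HP:0001250", "label": "Seizure"}},
    {"type": {"id": "HP:0000252", "label": "Microcephaly"}, "excluded": true}
  ]
}
"""


def split_features(phenopacket):
    """Return (observed, excluded) lists of (term id, label) tuples."""
    observed, excluded = [], []
    for feature in phenopacket.get("phenotypicFeatures", []):
        term = (feature["type"]["id"], feature["type"].get("label", ""))
        # A feature counts as observed unless it is explicitly marked excluded.
        if feature.get("excluded", False):
            excluded.append(term)
        else:
            observed.append(term)
    return observed, excluded


packet = json.loads(EXAMPLE)
observed, excluded = split_features(packet)
print("observed:", observed)
print("excluded:", excluded)
```

Keeping excluded features separate matters for phenotype-driven prioritization, since an explicitly ruled-out finding carries different evidence than a finding that was simply not reported.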

Berlin Institute of Health at Charité Universitätsmedizin Berlin Berlin Germany

Berlin Institute of Health at Charité Universitätsmedizin Berlin Berlin Germany; The Jackson Laboratory for Genomic Medicine 10 Discovery Drive Farmington CT 06032 USA

Berlin Institute of Health at Charité Universitätsmedizin Berlin Berlin Germany; The Jackson Laboratory for Genomic Medicine 10 Discovery Drive Farmington CT 06032 USA; ELLIS European Laboratory for Learning and Intelligent Systems

Berlin Institute of Health at Charité Universitätsmedizin Berlin Berlin Germany; Utrecht University Utrecht the Netherlands

Clinic for Immunology and Rheumatology Hanover Medical School Hanover Germany; RESiST Cluster of Excellence 2155 Hanover Medical School Hanover Germany

Department of Allergy and Immunology National Institute of Women's Children's and Adolescents' Health Fernandes Figueira Rio de Janeiro Brazil; High Complexity Laboratory National Institute of Women's Children's and Adolescents' Health Fernandes Figueira Rio de Janeiro Brazil

Department of Biomedical Informatics University of Colorado Anschutz Medical Campus Aurora CO USA

Department of Genetics Genomics and Cancer Sciences University of Leicester Leicester UK

Department of Immunology 2nd Faculty of Medicine Charles University and University Hospital in Motol Prague Czech Republic

Department of Ophthalmology University Clinic Marburg Campus Fulda Fulda Germany

Department of Paediatric Immunology Great Ormond Street Hospital for Children NHS Foundation Trust London UK; University College London Institute of Child Health London UK

Department of Pediatrics Division of Genetic Medicine University of Washington 1959 NE Pacific Street Box 357371 Seattle WA 98195 USA

Department of Pediatrics Division of Genetic Medicine University of Washington 1959 NE Pacific Street Box 357371 Seattle WA 98195 USA; Brotman Baty Institute for Precision Medicine 1959 NE Pacific Street Box 357657 Seattle WA 98195 USA

Department of Pediatrics Division of Genetic Medicine University of Washington 1959 NE Pacific Street Box 357371 Seattle WA 98195 USA; Brotman Baty Institute for Precision Medicine 1959 NE Pacific Street Box 357657 Seattle WA 98195 USA; Department of Pediatrics Division of Genetic Medicine Seattle Children's Hospital Seattle WA 98195 USA

Department of Pediatrics Faculty of Medicine and University Hospital Carl Gustav Carus Technische Universität Dresden Dresden Germany; University Center for Rare Diseases Faculty of Medicine and University Hospital Carl Gustav Carus Technische Universität Dresden Dresden Germany

Department of Pediatrics Faculty of Medicine and University Hospital Carl Gustav Carus Technische Universität Dresden Dresden Germany; University Center for Rare Diseases Faculty of Medicine and University Hospital Carl Gustav Carus Technische Universität Dresden Dresden Germany; German Center for Child and Adolescent Health partner site Leipzig Dresden Dresden Germany

Division of Environmental Genomics and Systems Biology Lawrence Berkeley National Laboratory Berkeley CA USA

Division of Informatics Imaging and Data Science The University of Manchester Manchester UK

Medical Genetics University of Catania Catania Italy; Morgagni Foundation and Clinic Catania Italy

North West Thames Regional Genetics Service Northwick Park and St Mark's Hospitals London UK

Rare Care Centre Perth Children's Hospital Nedlands WA 6009 Australia; SingHealth Duke NUS Institute of Precision Medicine 5 Hospital Drive Level 9 Singapore 169609 Singapore; Telethon Kids Institute Nedlands WA 6009 Australia

The Jackson Laboratory for Genomic Medicine 10 Discovery Drive Farmington CT 06032 USA

University of North Carolina at Chapel Hill Chapel Hill NC USA

William Harvey Research Institute Queen Mary University of London London UK
