The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present phenopacket-store. Version 0.1.12 of phenopacket-store includes 4916 phenopackets representing 277 Mendelian and chromosomal diseases associated with 236 genes, and 2872 unique pathogenic alleles curated from 605 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
- Publication type
- Journal Article MeSH
- Preprint MeSH
The Yamnaya archaeological complex appeared around 3300 BCE across the steppes north of the Black and Caspian Seas, and by 3000 BCE reached its maximal extent from Hungary in the west to Kazakhstan in the east. To localize the ancestral and geographical origins of the Yamnaya among the diverse Eneolithic people that preceded them, we studied ancient DNA data from 428 individuals, of which 299 are reported for the first time, demonstrating three previously unknown Eneolithic genetic clines. First, a "Caucasus-Lower Volga" (CLV) Cline suffused with Caucasus hunter-gatherer (CHG) ancestry extended between a Caucasus Neolithic southern end in Neolithic Armenia and a steppe northern end in Berezhnovka in the Lower Volga. Bidirectional gene flow across the CLV cline created admixed intermediate populations both in the north Caucasus, such as the Maikop people, and on the steppe, such as those at the site of Remontnoye north of the Manych depression. CLV people also helped form two major riverine clines by admixing with distinct groups of European hunter-gatherers. A "Volga Cline" was formed as Lower Volga people mixed with upriver populations that had more Eastern hunter-gatherer (EHG) ancestry, creating genetically hyper-variable populations as at Khvalynsk in the Middle Volga. A "Dnipro Cline" was formed as CLV people bearing both Caucasus Neolithic and Lower Volga ancestry moved west and acquired Ukraine Neolithic hunter-gatherer (UNHG) ancestry to establish the population of the Serednii Stih culture, from which the direct ancestors of the Yamnaya themselves were formed around 4000 BCE. This population grew rapidly after 3750-3350 BCE, precipitating the expansion of people of the Yamnaya culture who totally displaced previous groups on the Volga and further east, while admixing with more sedentary groups in the west.
CLV cline people with Lower Volga ancestry contributed four fifths of the ancestry of the Yamnaya, but also, entering Anatolia from the east, contributed at least a tenth of the ancestry of Bronze Age Central Anatolians, where the Hittite language, related to the Indo-European languages spread by the Yamnaya, was spoken. We thus propose that the final unity of the speakers of the "Proto-Indo-Anatolian" ancestral language of both Anatolian and Indo-European languages can be traced to CLV cline people sometime between 4400 and 4000 BCE.
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors by approximately 9-fold, and increases contiguity by 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomics datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ∼20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
- Keywords
- Rnor_6.0, genetic map, heterogeneous stock, hybrid rat diversity panel, inbred strains, mRatBN7.2, phylogenetic tree, rat, recombinant inbred, reference genome
- MeSH
- Molecular Sequence Annotation MeSH
- Genetic Variation genetics MeSH
- Genome * genetics MeSH
- Genomics * MeSH
- Rats MeSH
- Whole Genome Sequencing MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
Germ cells are essential to sexual reproduction. Across the animal kingdom, extracellular signaling isoprenoids, such as retinoic acids (RAs) in vertebrates and juvenile hormones (JHs) in invertebrates, facilitate multiple processes in reproduction. Here we investigated the role of these potent signaling molecules in embryonic germ cell development, using JHs in Drosophila melanogaster as a model system. In contrast to their established endocrine roles during larval and adult germline development, we found that JH signaling acts locally during embryonic development. Using an in vivo biosensor, we observed active JH signaling first within and near primordial germ cells (PGCs) as they migrate to the developing gonad. Through in vivo and in vitro assays, we determined that JHs are both necessary and sufficient for PGC migration. Analysis of the mechanisms of this newly uncovered paracrine JH function revealed that PGC migration was compromised when JHs were either decreased or increased, suggesting that specific titers or spatiotemporal JH dynamics are required for robust PGC colonization of the gonad. Compromised PGC migration can impair fertility and cause germ cell tumors in many species, including humans. In mammals, retinoids have many roles in development and reproduction. We found that, like JHs in Drosophila, RA was sufficient to impact mouse PGC migration in vitro. Together, our study reveals a previously unanticipated role of isoprenoids as local effectors of pre-gonadal PGC development and suggests a broadly shared mechanism in PGC migration.
- Keywords
- Hmgcr, cell movement, embryonic development, gametogenesis, germ cells, gonad, juvenile hormones, ovary, retinoids, testis
- MeSH
- Drosophila melanogaster * MeSH
- Drosophila MeSH
- Gonads MeSH
- Juvenile Hormones * MeSH
- Humans MeSH
- Mice MeSH
- Cell Movement MeSH
- Mammals MeSH
- Terpenes MeSH
- Germ Cells MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Substance names
- Juvenile Hormones * MeSH
- Terpenes MeSH
The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and in many cases synonyms and definitions in ten languages in addition to English. Since our last report, a total of 2239 new HPO terms and 49235 new HPO annotations were developed, many in collaboration with external groups in the fields of psychiatry, arthrogryposis, immunology and cardiology. The Medical Action Ontology (MAxO) is a new effort to model treatments and other measures taken for clinical management. Finally, the HPO consortium is contributing to efforts to integrate the HPO and the GA4GH Phenopacket Schema into electronic health records (EHRs) with the goal of more standardized and computable integration of rare disease data in EHRs.
Pre-mRNA splicing is a highly coordinated process. While its dysregulation has been linked to neurological deficits, our understanding of the underlying molecular and cellular mechanisms remains limited. We implicated pathogenic variants in U2AF2 and PRPF19, which encode spliceosome subunits, in neurodevelopmental disorders (NDDs) by identifying 46 unrelated individuals with 23 de novo U2AF2 missense variants (including 7 recurrent variants in 30 individuals) and 6 individuals with de novo PRPF19 variants. Eight U2AF2 variants dysregulated splicing of a model substrate. Neuritogenesis was reduced in human neurons differentiated from human pluripotent stem cells carrying two U2AF2 hyper-recurrent variants. Neural loss of function (LoF) of the Drosophila orthologs U2af50 and Prp19 led to lethality, abnormal mushroom body (MB) patterning, and social deficits, which were differentially rescued by wild-type and mutant U2AF2 or PRPF19. Transcriptome profiling revealed splicing substrates or effectors (including Rbfox1, a third splicing factor), which rescued MB defects in U2af50-deficient flies. Upon reanalysis of negative clinical exomes followed by data sharing, we further identified 6 patients with NDD who carried RBFOX1 missense variants that showed LoF in in vitro testing. Our study implicates 3 splicing factors as NDD-causative genes and establishes a genetic network with hierarchy underlying human brain development and function.
- Keywords
- Development, Genetic diseases, Genetics, Neurodevelopment, iPS cells
- MeSH
- DNA Repair Enzymes genetics MeSH
- Gene Regulatory Networks MeSH
- Nuclear Proteins genetics MeSH
- Humans MeSH
- Mutation, Missense MeSH
- Neurodevelopmental Disorders * genetics MeSH
- RNA Splicing MeSH
- RNA Splicing Factors genetics MeSH
- Spliceosomes * genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Substance names
- DNA Repair Enzymes MeSH
- Nuclear Proteins MeSH
- PRPF19 protein, human MeSH Browser
- RNA Splicing Factors MeSH
The cell type-specific expression of key transcription factors is central to development and disease. Brachyury/T/TBXT is a major transcription factor for gastrulation, tailbud patterning, and notochord formation; however, how its expression is controlled in the mammalian notochord has remained elusive. Here, we identify the complement of notochord-specific enhancers in the mammalian Brachyury/T/TBXT gene. Using transgenic assays in zebrafish, axolotl, and mouse, we discover three conserved Brachyury-controlling notochord enhancers, T3, C, and I, in human, mouse, and marsupial genomes. The three enhancers act as Brachyury-responsive, auto-regulatory shadow enhancers: in cis deletion of all three in mouse abolishes Brachyury/T/Tbxt expression selectively in the notochord, causing specific trunk and neural tube defects without gastrulation or tailbud defects. The three Brachyury-driving notochord enhancers are conserved beyond mammals in the brachyury/tbxtb loci of fishes, dating their origin to the last common ancestor of jawed vertebrates. Our data define the vertebrate enhancers for Brachyury/T/TBXTB notochord expression through an auto-regulatory mechanism that conveys robustness and adaptability as an ancient basis for axis development.
- MeSH
- Notochord * metabolism MeSH
- Zebrafish * genetics metabolism MeSH
- Fetal Proteins genetics metabolism MeSH
- Humans MeSH
- Mice MeSH
- T-Box Domain Proteins genetics metabolism MeSH
- Mammals genetics MeSH
- Gene Expression Regulation, Developmental MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Substance names
- Brachyury protein MeSH Browser
- Fetal Proteins MeSH
- T-Box Domain Proteins MeSH
- TBX1 protein, human MeSH Browser
- Tbx1 protein, mouse MeSH Browser
Polygenic risk scores (PRS) have great potential to guide precision colorectal cancer (CRC) prevention by identifying those at higher risk to undertake targeted screening. However, current PRS using European ancestry data have sub-optimal performance in non-European ancestry populations, limiting their utility among these populations. Towards addressing this deficiency, we expand PRS development for CRC by incorporating Asian ancestry data (21,731 cases; 47,444 controls) into European ancestry training datasets (78,473 cases; 107,143 controls). The AUC estimates (95% CI) of PRS are 0.63 (0.62-0.64), 0.59 (0.57-0.61), 0.62 (0.60-0.63), and 0.65 (0.63-0.66) in independent datasets including 1,681-3,651 cases and 8,696-115,105 controls of Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White, respectively. They are significantly better than the European-centric PRS in all four major US racial and ethnic groups (p-values < 0.05). Further inclusion of non-European ancestry populations, especially Black/African American and Latinx/Hispanic, is needed to improve the risk prediction and enhance equity in applying PRS in clinical practice.
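For readers unfamiliar with the AUC values quoted in this abstract, the following minimal sketch shows one common way to compute an AUC from risk scores: as the fraction of case-control pairs in which the case has the higher score, with ties counted as half. The function name `rank_auc` and the toy scores are hypothetical illustrations; they are not taken from the study, and the abstract does not state how its AUC estimates were computed.

```python
def rank_auc(case_scores, control_scores):
    """Pairwise (rank-based) AUC: the probability that a randomly chosen
    case has a higher score than a randomly chosen control.
    Tied pairs contribute 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy, made-up scores (not data from the study).
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.4]
print(round(rank_auc(toy_cases, toy_controls), 3))  # 0.833
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.59 to 0.65.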
- MeSH
- Genome-Wide Association Study MeSH
- Ethnicity * genetics MeSH
- Genetic Predisposition to Disease MeSH
- Polymorphism, Single Nucleotide MeSH
- Colorectal Neoplasms * diagnosis genetics MeSH
- Humans MeSH
- Multifactorial Inheritance MeSH
- Risk Factors MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, N.I.H., Extramural MeSH
Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group.
Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
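The contiguity threshold quoted in this abstract is a contig N50, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `contig_n50` and the toy contig lengths are hypothetical and are not drawn from the assemblies described here.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths: walking contigs from
    longest to shortest, the length of the contig at which the running
    total first reaches half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("cannot compute N50 of an empty assembly")


# Toy, made-up contig lengths in base pairs (total = 290).
toy_contigs = [100, 80, 50, 30, 20, 10]
print(contig_n50(toy_contigs))  # 80
```

In the toy example the two longest contigs (100 + 80 = 180) already exceed half of the 290 bp total, so the N50 is 80.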
- MeSH
- Colorectal Neoplasms, Hereditary Nonpolyposis * genetics MeSH
- Colorectal Neoplasms * genetics MeSH
- Humans MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Research Support, N.I.H., Intramural MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Research Support, U.S. Gov't, P.H.S. MeSH