Most cited article - PubMed ID 28628100
Hotspots of missense mutation identify neurodevelopmental disorder genes and functional domains
Neurodevelopmental disorders (NDDs), including severe paediatric epilepsy, autism and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will be tested in clinical trials this year or in the coming years. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a genetic test-informed diagnosis before a patient can be enrolled in a clinical trial. However, even in 2022, most identified genetic variants in NDD genes are 'variants of uncertain significance'. To safely enrol patients in precision medicine clinical trials, it is important to increase our knowledge about which regions in NDD-associated proteins can 'tolerate' missense variants and which ones are 'essential' and will cause an NDD when mutated. In addition, knowledge about functionally indispensable regions in the 3D structure context of proteins can also provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the currently largest expert-curated protein structure set for 242 NDDs and identified 14 377 Essential3D sites across 189 gene-disorder-associated proteins. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites and revealed key inter-molecule interactions in protein complexes that were otherwise not annotated.
Using the currently largest autism, developmental disorders, and epilepsies exome sequencing studies including >360 000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 NDDs and identified 14 377 Essential3D sites in these. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection to enhance variant interpretation and development of mechanistic hypotheses for 242 NDD genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and encoded proteins.
- Keywords
- bioinformatics, genetics, neurodevelopmental disorder
- MeSH
- Child MeSH
- Genetic Testing MeSH
- Humans MeSH
- Intellectual Disability * genetics MeSH
- Mutation, Missense MeSH
- Mutation genetics MeSH
- Neurodevelopmental Disorders * genetics MeSH
- Check Tag
- Child MeSH
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
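The consensus idea in the Essential3D abstract above (overlaying several per-residue scores and keeping the sites on which they agree) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state which scores are used, how they are thresholded, or how agreement is defined, so the function names, the top-fraction cutoff, and the toy scores below are hypothetical.

```python
from typing import Dict, List, Set


def top_fraction(scores: Dict[int, float], fraction: float) -> Set[int]:
    """Return the residue positions whose score lies in the top `fraction`."""
    if not scores:
        return set()
    n_keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=lambda pos: scores[pos], reverse=True)
    return set(ranked[:n_keep])


def consensus_sites(annotations: List[Dict[int, float]],
                    fraction: float = 0.2) -> Set[int]:
    """Residues flagged by every annotation (a strict-agreement assumption)."""
    flagged = [top_fraction(a, fraction) for a in annotations]
    return set.intersection(*flagged) if flagged else set()


# Hypothetical toy scores keyed by residue position (higher = more constrained).
conservation = {1: 0.10, 2: 0.90, 3: 0.80, 4: 0.20, 5: 0.95}
constraint = {1: 0.30, 2: 0.70, 3: 0.10, 4: 0.20, 5: 0.90}

essential = consensus_sites([conservation, constraint], fraction=0.4)
print(sorted(essential))
```

Requiring agreement across all annotations is the strictest possible rule; a majority vote would be an equally plausible reading of "consensus" and would flag more sites.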
Cancer genomes harbor numerous genomic alterations and many cancers accumulate thousands of nucleotide sequence variations. A prominent fraction of these mutations arises as a consequence of the off-target activity of DNA/RNA editing cytosine deaminases followed by the replication/repair of edited sites by DNA polymerases (pol), as deduced from the analysis of the DNA sequence context of mutations in different tumor tissues. We have used the weight matrix (sequence profile) approach to analyze mutagenesis due to Activation Induced Deaminase (AID) and two error-prone DNA polymerases. Control experiments using shuffled weight matrices and somatic mutations in immunoglobulin genes confirmed the power of this method. Analysis of somatic mutations in various cancers suggested that AID and DNA polymerases η and θ contribute to mutagenesis in contexts that almost universally correlate with the context of mutations in A:T and G:C sites during the affinity maturation of immunoglobulin genes. Previously, we demonstrated that AID contributes to mutagenesis in (de)methylated genomic DNA in various cancers. Our current analysis of methylation data from malignant lymphomas suggests that driver genes are subject to different (de)methylation processes than non-driver genes and, in addition to AID, the activity of pols η and θ contributes to the establishment of methylation-dependent mutation profiles. This may reflect the functional importance of interplay between mutagenesis in cancer and (de)methylation processes in different groups of genes. The resulting changes in CpG methylation levels and chromatin modifications are likely to cause changes in the expression levels of driver genes that may affect cancer initiation and/or progression.
- Keywords
- computational biology, database, frequency matrices, gene expression, immunoglobulin genes, somatic hypermutation, tumor cells
- Publication Type
- Journal Article MeSH
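The weight matrix (sequence profile) approach named in the abstract above can be sketched as follows. This is a generic log-odds position weight matrix under stated assumptions, not the authors' implementation: the training contexts are invented toy sequences, the background is assumed uniform, and the shuffled control permutes weights within each column, which is only one of several ways a matrix could be shuffled.

```python
import math
import random
from typing import Dict, List

BASES = "ACGT"
Matrix = List[Dict[str, float]]


def build_weight_matrix(sites: List[str], pseudocount: float = 1.0) -> Matrix:
    """Log2-odds matrix (versus a uniform background) from aligned sites."""
    matrix: Matrix = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return matrix


def score(matrix: Matrix, context: str) -> float:
    """Sum of per-position weights for one sequence context."""
    return sum(column[base] for column, base in zip(matrix, context))


def shuffled(matrix: Matrix, rng: random.Random) -> Matrix:
    """Control matrix: the same weights, randomly reassigned to bases per column."""
    control: Matrix = []
    for column in matrix:
        weights = [column[b] for b in BASES]
        rng.shuffle(weights)
        control.append(dict(zip(BASES, weights)))
    return control


# Hypothetical aligned contexts around mutated sites (toy data only).
toy_sites = ["AGCT", "TGCT", "AACT", "TGCA", "AGCC"]
pwm = build_weight_matrix(toy_sites)
control = shuffled(pwm, random.Random(0))

print(score(pwm, "AGCT"), score(control, "AGCT"))
```

Comparing how well the real and the shuffled matrix separate observed mutation contexts from background positions gives a simple check that a signal is not an artifact of base composition.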
Most genes associated with neurodevelopmental disorders (NDDs) were identified with an excess of de novo mutations (DNMs) but the significance in case-control mutation burden analysis is unestablished. Here, we sequence 63 genes in 16,294 NDD cases and an additional 62 genes in 6,211 NDD cases. By combining these with published data, we assess a total of 125 genes in over 16,000 NDD cases and compare the mutation burden to nonpsychiatric controls from ExAC. We identify 48 genes (25 newly reported) showing significant burden of ultra-rare (MAF < 0.01%) gene-disruptive mutations (FDR 5%), six of which reach family-wise error rate (FWER) significance (p < 1.25E-06). Among these 125 targeted genes, we also reevaluate DNM excess in 17,426 NDD trios with 6,499 new autism trios. We identify 90 genes enriched for DNMs (FDR 5%; e.g., GABRG2 and UIMC1); of which, 61 reach FWER significance (p < 3.64E-07; e.g., CASZ1). In addition to doubling the number of patients for many NDD risk genes, we present phenotype-genotype correlations for seven risk genes (CTCF, HNRNPU, KCNQ3, ZBTB18, TCF12, SPEN, and LEO1) based on this large-scale targeted sequencing effort.
- MeSH
- CCCTC-Binding Factor genetics MeSH
- DNA-Binding Proteins genetics MeSH
- KCNQ3 Potassium Channel genetics MeSH
- Genetic Predisposition to Disease * MeSH
- Genetic Association Studies MeSH
- Heterogeneous-Nuclear Ribonucleoprotein U genetics MeSH
- Cohort Studies MeSH
- Humans MeSH
- Mutation MeSH
- DNA Mutational Analysis MeSH
- Neurodevelopmental Disorders genetics MeSH
- RNA-Binding Proteins genetics MeSH
- Repressor Proteins genetics MeSH
- Case-Control Studies MeSH
- Basic Helix-Loop-Helix Transcription Factors genetics MeSH
- Transcription Factors genetics MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Multicenter Study MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Substance Names
- CCCTC-Binding Factor MeSH
- CTCF protein, human MeSH Browser
- DNA-Binding Proteins MeSH
- KCNQ3 Potassium Channel MeSH
- Heterogeneous-Nuclear Ribonucleoprotein U MeSH
- HNRNPU protein, human MeSH Browser
- KCNQ3 protein, human MeSH Browser
- LEO1 protein, human MeSH Browser
- RNA-Binding Proteins MeSH
- Repressor Proteins MeSH
- SPEN protein, human MeSH Browser
- TCF12 protein, human MeSH Browser
- Basic Helix-Loop-Helix Transcription Factors MeSH
- Transcription Factors MeSH
- ZBTB18 protein, human MeSH Browser
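The targeted-sequencing abstract above reports genes at two significance levels, FDR 5% and a stricter family-wise error rate (FWER) threshold. The sketch below shows how the two kinds of correction differ, assuming the Benjamini-Hochberg procedure for FDR and a Bonferroni division for FWER. The abstract does not say which procedures were used or how many tests entered the correction, so both choices and the p-values are hypothetical.

```python
from typing import List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that controls the FWER at `alpha`."""
    return alpha / n_tests


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """For each p-value (in input order), report whether it passes at `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its cutoff.
        if p_values[idx] <= rank / m * fdr:
            largest_rank = rank
    passed = [False] * m
    for idx in order[:largest_rank]:
        passed[idx] = True
    return passed


# Hypothetical per-gene burden p-values (toy data only).
toy_p = [0.001, 0.045, 0.035, 0.2, 0.0001]

fdr_hits = benjamini_hochberg(toy_p, fdr=0.05)
fwer_cutoff = bonferroni_threshold(0.05, len(toy_p))
fwer_hits = [p <= fwer_cutoff for p in toy_p]
print(fdr_hits, fwer_hits)
```

Because the FWER cutoff applies the same strict bound to every test, it always selects a subset of what the FDR procedure selects at the same nominal level, which is consistent with the abstract reporting fewer FWER-significant genes than FDR-significant ones.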