Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics have generated genetic variants at an unprecedented scale. However, efficient tools and resources are needed to link these disparate data types: to 'map' variants onto protein structures, to better understand how the variation causes disease, and thereby to design therapeutics. Here we present the Genomics 2 Proteins portal (https://g2p.broadinstitute.org/): a human proteome-wide resource that maps 20,076,998 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the Genomics 2 Proteins portal allows users to interactively upload residue-wise protein annotations (for example, variants and scores), as well as protein structures not available in databases, to establish the connection from genomics to proteins. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotypes.
- MeSH
  - Databases, Protein*
  - Genetic Variation
  - Genetic Testing / methods
  - Genomics* / methods
  - Protein Conformation
  - Humans
  - Proteins / genetics / chemistry
  - Proteome / genetics
  - Amino Acid Sequence
  - Software
- Publication type
  - Journal Article
Here, in a multi-ancestry genome-wide association study meta-analysis of kidney cancer (29,020 cases and 835,670 controls), we identified 63 susceptibility regions (50 novel) containing 108 independent risk loci. In analyses stratified by subtype, 52 regions (78 loci) were associated with clear cell renal cell carcinoma (RCC) and 6 regions (7 loci) with papillary RCC. Notably, we report a variant common in African ancestry individuals (rs7629500) in the 3' untranslated region of VHL, nearly tripling clear cell RCC risk (odds ratio 2.72, 95% confidence interval 2.23-3.30). In cis-expression quantitative trait locus analyses, 48 variants from 34 regions point toward 83 candidate genes. Enrichment of hypoxia-inducible factor-binding sites underscores the importance of hypoxia-related mechanisms in kidney cancer. Our results advance understanding of the genetic architecture of kidney cancer, provide clues for functional investigation and enable generation of a validated polygenic risk score with an estimated area under the curve of 0.65 (0.74 including risk factors) among European ancestry individuals.
- MeSH
  - White People / genetics
  - Genome-Wide Association Study*
  - Genetic Predisposition to Disease*
  - Polymorphism, Single Nucleotide*
  - Carcinoma, Renal Cell* / genetics
  - Humans
  - Quantitative Trait Loci*
  - Von Hippel-Lindau Tumor Suppressor Protein / genetics
  - Kidney Neoplasms* / genetics
  - Case-Control Studies
- Publication type
  - Journal Article
  - Meta-Analysis
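As a rough consistency check on the rs7629500 estimate reported in the kidney cancer abstract above (odds ratio 2.72, 95% confidence interval 2.23-3.30), the sketch below recovers the standard error of the log odds ratio implied by the interval, together with a Wald z-score and two-sided p-value. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the abstract does not state, and the function name `wald_from_ci` is purely illustrative.

```python
import math
from statistics import NormalDist


def wald_from_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Return (se_log_or, z, two_sided_p) implied by an OR and its CI.

    Assumption (not stated in the abstract): the interval is a symmetric
    Wald interval on the log-odds scale, i.e. log(OR) +/- z_crit * SE.
    """
    # Critical value of the standard normal, about 1.96 for a 95% interval.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # The interval spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # erfc avoids the loss of precision of 1 - cdf(z) in the far tail.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = wald_from_ci(2.72, 2.23, 3.30)
print(f"SE(log OR) = {se:.3f}, z = {z:.1f}, p = {p:.1e}")
```

With the reported values, the implied standard error is close to 0.10 and the z-score close to 10. The published meta-analysis may have derived its interval differently, so this is a sanity check rather than a reproduction of the authors' calculation.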
Pediatric steroid-sensitive nephrotic syndrome (pSSNS) is the most common childhood glomerular disease. Previous genome-wide association studies (GWAS) identified a risk locus in the HLA Class II region and three additional independent risk loci. However, the genetic architecture of pSSNS and its genetically driven pathobiology are largely unknown. Here, we conduct a multi-population GWAS meta-analysis in 38,463 participants (2,440 cases). We then conduct conditional analyses and population-specific GWAS. We discover twelve significant associations: eight from the multi-population meta-analysis (four novel), two from the multi-population conditional analysis (one novel), and two additional novel loci from the European meta-analysis. Fine-mapping implicates specific amino acid haplotypes in HLA-DQA1 and HLA-DQB1 driving the HLA Class II risk locus. Non-HLA loci colocalize with eQTLs of monocytes and numerous T-cell subsets in independent datasets. Colocalization with kidney eQTLs is lacking, but overlap with kidney cell open chromatin suggests an uncharacterized disease mechanism in kidney cells. A polygenic risk score (PRS) associates with earlier disease onset. Altogether, these discoveries expand our knowledge of pSSNS genetic architecture across populations and provide cell-specific insights into its molecular drivers. Evaluating these associations in additional cohorts will refine our understanding of population specificity, heterogeneity, and clinical and molecular associations.
- MeSH
  - Genome-Wide Association Study*
  - Child
  - Genetic Predisposition to Disease
  - Haplotypes
  - Polymorphism, Single Nucleotide
  - Humans
  - Nephrotic Syndrome* / genetics
  - Risk Factors
- Publication type
  - Journal Article
  - Meta-Analysis
  - Research Support, Non-U.S. Gov't
  - Research Support, N.I.H., Extramural
  - Research Support, U.S. Gov't, Non-P.H.S.