A novel Synthetic phenotype association study approach reveals the landscape of association for genomic variants and phenotypes

2022 Dec; 42: 117-133. [Epub 2022 Apr 12]

Language: English. Country: Egypt. Medium: print-electronic

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid36513408
Links

PubMed 36513408
PubMed Central PMC9788956
DOI 10.1016/j.jare.2022.04.004
PII: S2090-1232(22)00099-6

INTRODUCTION: Genome-Wide Association Studies (GWAS) identify tagging variants in the genome that are statistically associated with a phenotype because of their linkage disequilibrium (LD) relationship with the causative mutation (CM). When both low-density genotyped accession panels with phenotypes and resequenced accession panels are available, tagging variants can help address post-GWAS challenges in CM discovery.

OBJECTIVES: Our objective was to identify additional GWAS evaluation criteria that assess the correspondence between genomic variants and phenotypes and enable deeper analysis of the localized landscape of association.

METHODS: We used genomic variant positions as Synthetic phenotypes in GWAS, an approach we named "Synthetic phenotype association study" (SPAS). The extreme case of SPAS is what we call an "Inverse GWAS", in which we used CM positions of cloned soybean genes. We developed and validated the Accuracy concept as a measure of the correspondence between variant positions and phenotypes.

RESULTS: The SPAS approach demonstrated that using the genotype status of an associated variant as a Synthetic phenotype enabled us to explore the relationships between tagging variants and CMs, and that using CMs as Synthetic phenotypes in Inverse GWAS illuminated the landscape of association. We implemented the Accuracy calculation for a curated accession panel in an online tool (AccuTool) as a resource for gene identification in soybean. We demonstrated our concepts on three cloned soybean genes. Based on these findings, we devised an enhanced "GWAS to Genes" analysis (Synthetic phenotype to CM strategy, SP2CM). Using SP2CM, we identified a CM for a novel gene.

CONCLUSION: The SP2CM strategy, which combines Synthetic phenotypes with the Accuracy calculation of correspondence, provides crucial information to assist researchers in CM discovery. The impact of this work is a more effective evaluation of landscapes of GWAS associations.
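
The abstract does not give the formula behind the Accuracy measure or the SPAS procedure, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation or the AccuTool code. It assumes biallelic variants coded 0/1, a binary phenotype coded 0/1, and Accuracy taken as the ordinary classification accuracy (fraction of accessions where allele state and phenotype class agree), evaluated in whichever allele orientation agrees better. The function names `accuracy` and `synthetic_phenotype_scan` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def accuracy(genotypes: Sequence[Optional[int]],
             phenotypes: Sequence[Optional[int]]) -> float:
    """Correspondence between a 0/1 variant and a 0/1 phenotype.

    Assumption: plain classification accuracy, made orientation-agnostic
    because either allele may be the one linked to phenotype class 1.
    Accessions with a missing call (None) on either side are skipped.
    """
    pairs = [(g, p) for g, p in zip(genotypes, phenotypes)
             if g is not None and p is not None]
    if not pairs:
        raise ValueError("no accessions with both genotype and phenotype")
    agree = sum(1 for g, p in pairs if g == p) / len(pairs)
    return max(agree, 1.0 - agree)


def synthetic_phenotype_scan(variants: Dict[str, List[Optional[int]]],
                             synthetic: str) -> Dict[str, float]:
    """Use one variant's genotype as the phenotype and score all others.

    This mimics the idea of a Synthetic phenotype: the genotype status of
    the chosen variant (for example a known causative mutation) replaces
    the measured trait, and every other variant is scored against it.
    """
    target = variants[synthetic]
    return {name: accuracy(calls, target)
            for name, calls in variants.items() if name != synthetic}


if __name__ == "__main__":
    # Hypothetical toy panel of six accessions; values are made up.
    panel = {
        "cm":   [0, 0, 1, 1, 1, 0],     # treated as the causative mutation
        "tag1": [0, 0, 1, 1, 0, 0],     # one recombinant accession
        "tag2": [1, 1, 0, 0, 0, 1],     # perfect LD, opposite allele coding
        "far":  [0, 1, 0, 1, None, 1],  # weakly related, one missing call
    }
    for name, score in synthetic_phenotype_scan(panel, "cm").items():
        print(f"{name}: {score:.3f}")
```

In this toy panel the variant in perfect LD with the chosen mutation scores highest regardless of allele coding, which is the kind of local landscape an Inverse GWAS is meant to expose. A real analysis would use a proper association model and account for population structure.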
