A novel Synthetic phenotype association study approach reveals the landscape of association for genomic variants and phenotypes
Language: English. Country: Egypt. Medium: print-electronic
Document type: journal article, research supported by a grant
PubMed: 36513408
PubMed Central: PMC9788956
DOI: 10.1016/j.jare.2022.04.004
PII: S2090-1232(22)00099-6
- Keywords
- GWAS, Genomics, Genotyping, Phenotyping, Resequencing, Soybean
- MeSH
- genome-wide association study *
- phenotype
- genomics *
- genotype
- linkage disequilibrium
- Publication type
- journal article
- research supported by a grant
INTRODUCTION: Genome-wide association studies (GWAS) identify tagging variants in the genome that are statistically associated with a phenotype because they are in linkage disequilibrium (LD) with the causative mutation (CM). When both low-density genotyped accession panels with phenotypes and resequenced accession panels are available, tagging variants can assist with post-GWAS challenges in CM discovery.

OBJECTIVES: Our objective was to identify additional GWAS evaluation criteria that assess the correspondence between genomic variants and phenotypes and enable deeper analysis of the localized landscape of association.

METHODS: We used genomic variant positions as Synthetic phenotypes in GWAS, an approach we named the "Synthetic phenotype association study" (SPAS). The extreme case of SPAS is what we call an "Inverse GWAS", in which we used the CM positions of cloned soybean genes. We developed and validated the Accuracy concept as a measure of the correspondence between variant positions and phenotypes.

RESULTS: The SPAS approach demonstrated that using the genotype status of an associated variant as a Synthetic phenotype enabled us to explore the relationships between tagging variants and CMs, and further, that using CMs as Synthetic phenotypes in Inverse GWAS illuminated the landscape of association. We implemented the Accuracy calculation for a curated accession panel in an online tool (AccuTool) as a resource for gene identification in soybean. We demonstrated our concepts on three examples of cloned soybean genes. Based on these findings, we devised an enhanced "GWAS to Genes" analysis (Synthetic phenotype to CM strategy, SP2CM). Using SP2CM, we identified a CM for a novel gene.

CONCLUSION: The SP2CM strategy, which combines Synthetic phenotypes with the Accuracy calculation of correspondence, provides crucial information to assist researchers in CM discovery. The impact of this work is a more effective evaluation of landscapes of GWAS associations.
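
The abstract does not give a formula for Accuracy or the mechanics of SPAS, so the following Python sketch is only an illustration under stated assumptions. It assumes biallelic variants coded 0/1 per accession (None for missing calls), treats Accuracy as the plain fraction of accessions in which a variant's allele status agrees with a binary phenotype, and accepts either allele orientation. It also replaces the association test of a real GWAS with a simple per-variant comparison. The function names `accuracy` and `synthetic_phenotype_scan`, the variant names, and the toy genotypes are all hypothetical.

```python
from typing import Dict, List, Optional

Calls = List[Optional[int]]  # one 0/1 call per accession, None = missing


def accuracy(genotypes: Calls, phenotypes: Calls) -> Optional[float]:
    """Fraction of accessions where allele status agrees with the phenotype.

    Assumption: allele coding is arbitrary, so the better of the two
    allele-to-class assignments is reported. Accessions with a missing
    genotype or phenotype are skipped.
    """
    pairs = [(g, p) for g, p in zip(genotypes, phenotypes)
             if g is not None and p is not None]
    if not pairs:
        return None
    direct = sum(1 for g, p in pairs if g == p) / len(pairs)
    return max(direct, 1 - direct)


def synthetic_phenotype_scan(variants: Dict[str, Calls],
                             target: str) -> Dict[str, float]:
    """Use the genotype at `target` as a Synthetic phenotype and score
    every other variant against it (a stand-in for the association test)."""
    synthetic = variants[target]
    scores = {}
    for name, calls in variants.items():
        if name == target:
            continue
        score = accuracy(calls, synthetic)
        if score is not None:
            scores[name] = score
    return scores


# Toy panel of eight accessions; "cm" plays the role of a causative mutation.
variants = {
    "cm":       [0, 0, 1, 1, 1, 0, 1, 0],
    "tag_1":    [0, 0, 1, 1, 1, 0, 1, None],  # agrees wherever it is called
    "tag_2":    [0, 0, 1, 1, 1, 0, 1, 1],     # one disagreeing accession
    "unlinked": [0, 1, 1, 0, 1, 1, 0, 0],     # agrees in half the accessions
}

scores = synthetic_phenotype_scan(variants, "cm")
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Using the CM itself as the target mirrors the "Inverse GWAS" idea in spirit: variants that track the CM closely rank at the top, which gives a rough picture of the local landscape of association.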