Genomic Variations Explorer (GenVarX): a toolset for annotating promoter and CNV regions using genotypic and phenotypic differences

Authors

Chan, Yen On: MU Institute for Data Science and Informatics, University of Missouri-Columbia, Columbia, MO, United States
Biová, Jana: Department of Biochemistry, Faculty of Science, Palacky University in Olomouc, Olomouc, Czechia
Mahmood, Anser: Division of Plant Science and Technology, University of Missouri-Columbia, Columbia, MO, United States
Dietz, Nicholas: Division of Plant Science and Technology, University of Missouri-Columbia, Columbia, MO, United States
Bilyeu, Kristin: Division of Plant Science and Technology, University of Missouri-Columbia, Columbia, MO, United States; Plant Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Columbia, MO, United States
Škrabišová, Mária: Department of Biochemistry, Faculty of Science, Palacky University in Olomouc, Olomouc, Czechia
Joshi, Trupti: MU Institute for Data Science and Informatics, University of Missouri-Columbia, Columbia, MO, United States; Christopher S. Bond Life Sciences Center, University of Missouri-Columbia, Columbia, MO, United States; Department of Electrical Engineering and Computer Science, University of Missouri-Columbia, Columbia, MO, United States; Department of Biomedical Informatics, Biostatistics and Medical Epidemiology, University of Missouri-Columbia, Columbia, MO, United States

Frontiers in Genetics. 2023; 14: 1251382. [Epub 2023-10-09]

Source: Front Genet, ISSN 1664-8021

Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic-ecollection

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid37928239

Full text online

PubMed: 37928239
PubMed Central: PMC10623549
DOI: 10.3389/fgene.2023.1251382
PII: 1251382

Keywords
Indels, SNPs, copy number variation, genomic variations, phenotypes, promoter, transcription factor, whole genome re-sequencing data

Publication type
journal article (MeSH)

The rapid growth of sequencing technology and its increasing popularity in biological research have made whole genome re-sequencing (WGRS) data widely available. Large amounts of WGRS data can help close the knowledge gap between genomics and phenomics by revealing the genomic variations that lead to phenotypic changes. These variations usually consist of allelic and structural changes in DNA, which can affect regulatory mechanisms, change gene expression, and alter the phenotypes of organisms. In this work, we created the GenVarX toolset, which is backed by transcription factor binding sequence data in promoter regions, copy number variation data, SNP and Indel data, and phenotype data, and which can potentially provide insights into phenotypic differences and help answer compelling questions in plant research. On the analytics side, we developed strategies to better utilize and mine WGRS data using efficient data processing scripts, libraries, tools, and frameworks, producing an interactive, visualization-enhanced toolset that encompasses both promoter region and copy number variation analysis components. The main capability of the GenVarX toolset is to provide easy-to-use interfaces through which users can query, visualize, and interact with the data. Users enter values into the input fields of the user interface and submit them as a query. The results page usually displays the returned data in tabular form. Interactive figures are also included to facilitate the visualization of statistical results and tool outputs. Currently, the GenVarX toolset supports soybean, rice, and Arabidopsis.
Researchers can access the soybean GenVarX toolset from SoyKB at https://soykb.org/SoybeanGenVarX/, and the rice and Arabidopsis GenVarX toolsets from the KBCommons web portal at https://kbcommons.org/system/tools/GenVarX/Osativa and https://kbcommons.org/system/tools/GenVarX/Athaliana, respectively.
