PENGUINN: Precise Exploration of Nuclear G-Quadruplexes Using Interpretable Neural Networks
Status PubMed-not-MEDLINE Language English Country Switzerland Media electronic-ecollection
Document type journal articles
PubMed
33193663
PubMed Central
PMC7653191
DOI
10.3389/fgene.2020.568546
- Keywords
- G quadruplex, bioinformatics and computational biology, deep neural network, genomic, imbalanced data classification, machine learning, web application
- Publication type
- journal articles (MeSH)
G-quadruplexes (G4s) are a class of stable nucleic acid secondary structures known to play a role in a wide spectrum of genomic functions, such as DNA replication and transcription. The classical understanding of G4 structure points to four variable-length guanine strands joined by variable-length nucleotide stretches. Experiments using G4 immunoprecipitation and sequencing have produced a large number of highly probable G4-forming genomic sequences. The expense and technical difficulty of experimental techniques highlight the need for computational approaches to G4 identification. Here, we present PENGUINN, a machine learning method based on convolutional neural networks that learns the characteristics of G4 sequences and accurately predicts G4s, outperforming state-of-the-art methods. We provide both a standalone implementation of the trained model and a web application that can be used to evaluate sequences for their G4 potential.
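The classical description in the abstract (four guanine strands joined by variable-length stretches) is often approximated by a simple sequence pattern. The sketch below is illustrative only and is not PENGUINN's method, which is a convolutional neural network. The specific bounds (guanine runs of at least 3, loops of 1 to 7 nucleotides) and the names `G4_MOTIF` and `find_classical_g4` are assumptions introduced here; the abstract itself gives no numeric limits.

```python
import re

# Assumed bounds, not stated in the abstract: guanine runs of length >= 3
# and connecting loops of 1 to 7 nucleotides.
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_classical_g4(sequence):
    """Return (start, end, matched_text) for each non-overlapping motif hit.

    Only the given strand is scanned; a G4 on the opposite strand would
    appear as a C-rich pattern and is not handled in this sketch.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(seq)]


if __name__ == "__main__":
    for hit in find_classical_g4("TTGGGAGGGTGGGAGGGTT"):
        print(hit)
```

A fixed pattern like this misses G4s that deviate from the classical form, which is one reason learned models are applied to experimentally derived G4 sequences.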