PENGUINN: Precise Exploration of Nuclear G-Quadruplexes Using Interpretable Neural Networks

2020; 11: 568546. [epub] 2020-10-27

Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic-ecollection

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid33193663

G-quadruplexes (G4s) are a class of stable nucleic acid secondary structures that are known to play a role in a wide spectrum of genomic functions, such as DNA replication and transcription. The classical understanding of G4 structure points to four variable-length guanine strands joined by variable-length nucleotide stretches. G4 immunoprecipitation and sequencing experiments have produced a large number of highly probable G4-forming genomic sequences. The expense and technical difficulty of experimental techniques highlight the need for computational approaches to G4 identification. Here, we present PENGUINN, a machine learning method based on convolutional neural networks that learns the characteristics of G4 sequences and accurately predicts G4s, outperforming state-of-the-art methods. We provide both a standalone implementation of the trained model and a web application that can be used to evaluate sequences for their G4 potential.
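
The classical picture described in the abstract (four guanine runs joined by variable-length loops) is often expressed as a regular expression. The sketch below is only an illustration of that baseline idea, not PENGUINN's method, which is a trained convolutional neural network. The thresholds (runs of at least 3 guanines, loops of 1 to 7 nucleotides) are commonly used conventions and are assumptions here, as are the names `G4_PATTERN` and `find_classical_g4`. It scans only the given strand.

```python
import re

# Assumed classical motif: four runs of >= 3 guanines separated by
# three loops of 1-7 nucleotides each. These thresholds are conventions,
# not values taken from the PENGUINN abstract.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_classical_g4(sequence):
    """Return (start, end, matched_text) for each non-overlapping
    classical G4 motif found on the given strand of a DNA sequence."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence made up for illustration.
    for start, end, motif in find_classical_g4("TTGGGAGGGTGGGAGGGTT"):
        print(start, end, motif)
```

A pattern like this misses G4s that deviate from the classical motif (for example, longer loops or interrupted guanine runs), which is part of the motivation for learned models such as the one the abstract presents.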

