
Multi-branch Convolutional Neural Network for Identification of Small Non-coding RNA genomic loci

GK. Georgakilas, A. Grioni, KG. Liakos, E. Chalupova, FC. Plessas, P. Alexiou

Scientific reports. 2020 ; 10 (1) : 9486. [pub] 20200611

Language English Country Great Britain

Document type Journal Article, Research Support, Non-U.S. Gov't

Genomic regions that encode small RNA genes exhibit characteristic patterns in their sequence, secondary structure, and evolutionary conservation. Convolutional Neural Networks are a family of algorithms that can classify data based on learned patterns. Here we present MuStARD, an application of Convolutional Neural Networks that can learn patterns associated with user-defined sets of genomic regions and scan large genomic areas for novel regions exhibiting similar characteristics. We demonstrate that MuStARD is a generic method that can be trained on different classes of human small RNA genomic loci, without the need for domain-specific knowledge, due to the automated feature and background selection processes built into the model. We also demonstrate the ability of MuStARD to identify functional elements across species by predicting mouse small RNAs (pre-miRNAs and snoRNAs) using models trained on the human genome. MuStARD can be used to filter small RNA-Seq datasets for identification of novel small RNA loci, both intra- and inter-species, as demonstrated in three use cases of human, mouse, and fly pre-miRNA prediction. MuStARD is easy to deploy and extend to a variety of genomic classification questions. Code and trained models are freely available at gitlab.com/RBP_Bioinformatics/mustard.


000      
00000naa a2200000 a 4500
001      
bmc20028098
003      
CZ-PrNML
005      
20210114152957.0
007      
ta
008      
210105s2020 xxk f 000 0|eng||
009      
AR
024    7_
$a 10.1038/s41598-020-66454-3 $2 doi
035    __
$a (PubMed)32528107
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxk
100    1_
$a Georgakilas, Georgios K $u Central European Institute of Technology, Brno, Czech Republic.
245    10
$a Multi-branch Convolutional Neural Network for Identification of Small Non-coding RNA genomic loci / $c GK. Georgakilas, A. Grioni, KG. Liakos, E. Chalupova, FC. Plessas, P. Alexiou
650    _2
$a algorithms $7 D000465
650    _2
$a animals $7 D000818
650    _2
$a computational biology $x methods $7 D019295
650    _2
$a genomics $x methods $7 D023281
650    _2
$a humans $7 D006801
650    _2
$a mice $7 D051379
650    _2
$a microRNAs $x genetics $7 D035683
650    _2
$a neural networks (computer) $7 D016571
650    _2
$a small nucleolar RNA $x genetics $7 D020537
650    _2
$a non-coding RNA $x genetics $7 D022661
650    _2
$a software $7 D012984
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Grioni, Andrea $u Central European Institute of Technology, Brno, Czech Republic.
700    1_
$a Liakos, Konstantinos G $u Department of Electrical and Computer Engineering, School of Engineering, University of Thessaly, Volos, Greece.
700    1_
$a Chalupova, Eliska $u Faculty of Science, National Centre for Biomolecular Research, Masaryk University, Brno, Czech Republic.
700    1_
$a Plessas, Fotis C $u Department of Electrical and Computer Engineering, School of Engineering, University of Thessaly, Volos, Greece.
700    1_
$a Alexiou, Panagiotis $u Central European Institute of Technology, Brno, Czech Republic. panagiotis.alexiou@ceitec.muni.cz.
773    0_
$w MED00182195 $t Scientific reports $x 2045-2322 $g Vol. 10, No. 1 (2020), p. 9486
856    41
$u https://pubmed.ncbi.nlm.nih.gov/32528107 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20210105 $b ABA008
991    __
$a 20210114152955 $b ABA008
999    __
$a ok $b bmc $g 1608433 $s 1119278
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2020 $b 10 $c 1 $d 9486 $e 20200611 $i 2045-2322 $m Scientific reports $n Sci Rep $x MED00182195
LZP    __
$a Pubmed-20210105
