TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads
P. Novák, L. Ávila Robledillo, A. Koblížková, I. Vrbová, P. Neumann, J. Macas
Language: English; Country: United Kingdom
Document type: journal articles
PubMed
28402514
DOI
10.1093/nar/gkx257
- MeSH
- DNA, Plant genetics MeSH
- Genome, Plant * MeSH
- Peas genetics MeSH
- In Situ Hybridization, Fluorescence MeSH
- Consensus Sequence MeSH
- Zea mays genetics MeSH
- Magnoliopsida genetics MeSH
- Chromosome Mapping methods MeSH
- Metaphase MeSH
- Computer Graphics MeSH
- Cyperaceae genetics MeSH
- DNA, Satellite classification genetics MeSH
- Base Sequence MeSH
- Sequence Analysis, DNA MeSH
- Cluster Analysis MeSH
- Software * MeSH
- Vicia faba genetics MeSH
- Publication type
- Journal Article MeSH
Satellite DNA is one of the major classes of repetitive DNA, characterized by tandemly arranged repeat copies that form contiguous arrays up to megabases in length. This type of genomic organization makes satellite DNA difficult to assemble, which hampers characterization of satellite sequences by computational analysis of genomic contigs. Here, we present tandem repeat analyzer (TAREAN), a novel computational pipeline that circumvents this problem by detecting satellite repeats directly from unassembled short reads. The pipeline first employs graph-based sequence clustering to identify groups of reads that represent repetitive elements. Putative satellite repeats are subsequently detected by the presence of circular structures in their cluster graphs. Consensus sequences of repeat monomers are then reconstructed from the most frequent k-mers obtained by decomposing read sequences from corresponding clusters. The pipeline performance was successfully validated by analyzing low-pass genome sequencing data from five plant species where satellite DNA was previously experimentally characterized. Moreover, novel satellite repeats were predicted for the genome of Vicia faba and three of these repeats were verified by detecting their sequences on metaphase chromosomes using fluorescence in situ hybridization.
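The k-mer step mentioned in the abstract can be illustrated with a toy sketch. This is not the TAREAN implementation; it only shows, under simplifying assumptions, what "decomposing reads into k-mers and keeping the most frequent ones" means. The function names `count_kmers` and `top_kmers`, the monomer `ACGTT`, and the read layout are all hypothetical and chosen for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def top_kmers(reads, k, n):
    """Return the n most frequent k-mers as (k-mer, count) pairs."""
    return count_kmers(reads, k).most_common(n)


# Toy data (assumption): short reads sampled from a perfect tandem array
# of a made-up 5 bp monomer.
monomer = "ACGTT"
array = monomer * 6
reads = [array[i:i + 12] for i in range(0, 15, 3)]

if __name__ == "__main__":
    for kmer, count in top_kmers(reads, k=5, n=5):
        print(kmer, count)
```

In this toy case every 5-mer found in the reads is a rotation of the monomer, which is why a small set of high-frequency k-mers carries enough information to recover a repeat unit. Real reads contain sequence variation and errors, so an actual pipeline needs considerably more than this counting step.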
- 000
- 00000naa a2200000 a 4500
- 001
- bmc18016665
- 003
- CZ-PrNML
- 005
- 20180515103609.0
- 007
- ta
- 008
- 180515s2017 xxk f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1093/nar/gkx257 $2 doi
- 035 __
- $a (PubMed)28402514
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a xxk
- 100 1_
- $a Novák, Petr $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 245 10
- $a TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads / $c P. Novák, L. Ávila Robledillo, A. Koblížková, I. Vrbová, P. Neumann, J. Macas,
- 650 _2
- $a sekvence nukleotidů $7 D001483
- 650 _2
- $a mapování chromozomů $x metody $7 D002874
- 650 _2
- $a shluková analýza $7 D016000
- 650 _2
- $a počítačová grafika $7 D003196
- 650 _2
- $a konsenzuální sekvence $7 D016384
- 650 _2
- $a šáchorovité $x genetika $7 D029785
- 650 _2
- $a DNA rostlinná $x genetika $7 D018744
- 650 _2
- $a satelitní DNA $x klasifikace $x genetika $7 D004276
- 650 12
- $a genom rostlinný $7 D018745
- 650 _2
- $a hybridizace in situ fluorescenční $7 D017404
- 650 _2
- $a Magnoliopsida $x genetika $7 D019684
- 650 _2
- $a metafáze $7 D008677
- 650 _2
- $a hrách setý $x genetika $7 D018532
- 650 _2
- $a sekvenční analýza DNA $7 D017422
- 650 12
- $a software $7 D012984
- 650 _2
- $a Vicia faba $x genetika $7 D031307
- 650 _2
- $a kukuřice setá $x genetika $7 D003313
- 655 _2
- $a časopisecké články $7 D016428
- 700 1_
- $a Ávila Robledillo, Laura $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 700 1_
- $a Koblížková, Andrea $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 700 1_
- $a Vrbová, Iva $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 700 1_
- $a Neumann, Pavel $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 700 1_
- $a Macas, Jiří $u Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic.
- 773 0_
- $w MED00003554 $t Nucleic acids research $x 1362-4962 $g Roč. 45, č. 12 (2017), s. e111
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/28402514 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y a $z 0
- 990 __
- $a 20180515 $b ABA008
- 991 __
- $a 20180515103743 $b ABA008
- 999 __
- $a ok $b bmc $g 1300289 $s 1013505
- BAS __
- $a 3
- BAS __
- $a PreBMC
- BMC __
- $a 2017 $b 45 $c 12 $d e111 $i 1362-4962 $m Nucleic acids research $n Nucleic Acids Res $x MED00003554
- LZP __
- $a Pubmed-20180515