• This record comes from PubMed

A combined de novo assembly approach increases the quality of prokaryotic draft genomes

2022 Oct; 67(5): 801-810. [epub] 20220606

Language: English; Country: United States; Media: print-electronic

Document type Journal Article

Links

PubMed 35668290
DOI 10.1007/s12223-022-00980-7
PII: 10.1007/s12223-022-00980-7

Next-generation sequencing methods provide comprehensive data for the structural and functional analysis of the genome. Draft genomes with a low contig number and a high N50 value can give insight into the structure of the genome and provide information for its annotation. In this study, we designed a pipeline for assembling prokaryotic draft genomes with a low number of contigs and a high N50 value. We aimed to combine two de novo assembly tools (SPAdes and IDBA-Hybrid) and to evaluate the impact of this approach on the quality metrics of the assemblies. The pipeline was tested on raw short-read sequence data (< 300) for a total of 10 species from four different genera. To obtain the final draft genomes, we first assembled the reads using SPAdes and used the 16S rRNA sequence extracted from that assembly to find a closely related organism. The IDBA-Hybrid assembler was then used to produce a second assembly, guided by the genome of the closely related organism. Finally, SPAdes was run again using the second assembly, produced by IDBA-Hybrid, as a hint. The results were evaluated using QUAST and BUSCO. The pipeline reduced the number of contigs and increased the N50 values of the draft genome assemblies while preserving the coverage of the draft genomes.
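The two contiguity metrics named in the abstract, contig number and N50, can be illustrated with a minimal Python sketch. It uses the common definition of N50 (the contig length at which contigs of that length or longer cover at least half of the total assembly length). The function names and the contig lengths below are hypothetical and chosen only for illustration; they are not data from the study, and this is not the code the authors or QUAST use.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def summarize(contig_lengths):
    """Collect contig count, total length, and N50 for one assembly."""
    lengths = list(contig_lengths)
    return {
        "contigs": len(lengths),
        "total_length": sum(lengths),
        "n50": n50(lengths),
    }


# Hypothetical contig lengths (bp) for two assemblies of the same genome.
fragmented = [300, 250, 200, 150, 100]  # more, shorter contigs
merged = [550, 350, 100]                # fewer, longer contigs

print(summarize(fragmented))
print(summarize(merged))
```

In this toy case both assemblies have the same total length, but the second has fewer contigs and a higher N50, which is the direction of improvement the abstract reports.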
