Recovery of 197 eukaryotic bins reveals major challenges for eukaryote genome reconstruction from terrestrial metagenomes
JP. Saraiva, A. Bartholomäus, RB. Toscan, P. Baldrian, U. Nunes da Rocha
Language English
Country England, Great Britain
Document type Journal Article
Grant support
460129525
Deutsche Forschungsgemeinschaft
VH-NG-1248 Micro' Big Data'
Helmholtz-Gemeinschaft
- MeSH
- Ecosystem MeSH
- Eukaryota * genetics MeSH
- Genome, Microbial MeSH
- Fungi genetics MeSH
- Metagenome * MeSH
- Metagenomics MeSH
- Publication type
- Journal Article MeSH
As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain unexplored. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes from 6000 metagenomes from terrestrial and some transition environments using the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. Of the 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades, with 83 and 73 bins, respectively. More than 78% of the eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, or anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequent taxon found, while Saccharomyces cerevisiae presented the highest completeness, probably because more reference genomes are available for it. Current measures of completeness are based on the presence of single-copy genes. However, mapping the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, the development of tools for dealing with repeat-rich genomes, and improved reference genome databases.
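The abstract's suggestion that completeness should also reflect chromosome coverage can be illustrated with a minimal sketch. This is not the authors' method: the function names, the half-open interval convention, and the example coordinates are all hypothetical, and the sketch assumes contig alignments to one reference chromosome are already available as (start, end) pairs within the chromosome's bounds.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: a half-open interval [start, end) in
# reference-chromosome coordinates, one per aligned contig segment.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def chromosome_coverage(alignments: Iterable[Interval], chrom_length: int) -> float:
    """Fraction of the chromosome covered by at least one aligned contig."""
    if chrom_length <= 0:
        raise ValueError("chrom_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(alignments))
    return covered / chrom_length


def coverage_gaps(alignments: Iterable[Interval], chrom_length: int) -> List[Interval]:
    """Uncovered stretches of the chromosome, as half-open intervals."""
    gaps: List[Interval] = []
    cursor = 0
    for start, end in merge_intervals(alignments):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


# Made-up alignments on a 1000 bp chromosome, for illustration only.
example = [(0, 100), (50, 200), (400, 500)]
print(chromosome_coverage(example, 1000))
print(coverage_gaps(example, 1000))
```

A bin can contain most expected single-copy genes while its contigs still leave large stretches of each chromosome unaligned, so a coverage fraction of this kind would complement, not replace, a gene-based completeness estimate.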
- 000
- 00000naa a2200000 a 4500
- 001
- bmc23010951
- 003
- CZ-PrNML
- 005
- 20230801132723.0
- 007
- ta
- 008
- 230718s2023 enk f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1111/1755-0998.13776 $2 doi
- 035 __
- $a (PubMed)36847735
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a enk
- 100 1_
- $a Saraiva, Joao Pedro $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
- 245 10
- $a Recovery of 197 eukaryotic bins reveals major challenges for eukaryote genome reconstruction from terrestrial metagenomes / $c JP. Saraiva, A. Bartholomäus, RB. Toscan, P. Baldrian, U. Nunes da Rocha
- 650 12
- $a metagenome $7 D054892
- 650 12
- $a Eukaryota $x genetics $7 D056890
- 650 _2
- $a ecosystem $7 D017753
- 650 _2
- $a genome, microbial $7 D064349
- 650 _2
- $a fungi $x genetics $7 D005658
- 650 _2
- $a metagenomics $7 D056186
- 655 _2
- $a journal articles $7 D016428
- 700 1_
- $a Bartholomäus, Alexander $u GFZ German Research Centre for Geosciences, Section Geomicrobiology, Potsdam, Germany $1 https://orcid.org/0000000309707304
- 700 1_
- $a Toscan, Rodolfo Brizola $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
- 700 1_
- $a Baldrian, Petr $u Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic $1 https://orcid.org/0000000289832721 $7 xx0098074
- 700 1_
- $a Nunes da Rocha, Ulisses $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany $1 https://orcid.org/0000000169726692
- 773 0_
- $w MED00180393 $t Molecular ecology resources $x 1755-0998 $g Vol. 23, No. 5 (2023), pp. 1066-1076
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/36847735 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y p $z 0
- 990 __
- $a 20230718 $b ABA008
- 991 __
- $a 20230801132720 $b ABA008
- 999 __
- $a ok $b bmc $g 1963398 $s 1197216
- BAS __
- $a 3
- BAS __
- $a PreBMC-MEDLINE
- BMC __
- $a 2023 $b 23 $c 5 $d 1066-1076 $e 20230320 $i 1755-0998 $m Molecular ecology resources $n Mol. ecol. resour. $x MED00180393
- GRA __
- $a 460129525 $p Deutsche Forschungsgemeinschaft
- GRA __
- $a VH-NG-1248 Micro' Big Data' $p Helmholtz-Gemeinschaft
- LZP __
- $a Pubmed-20230718