Recovery of 197 eukaryotic bins reveals major challenges for eukaryote genome reconstruction from terrestrial metagenomes

JP. Saraiva, A. Bartholomäus, RB. Toscan, P. Baldrian, U. Nunes da Rocha

Molecular ecology resources. 2023 ; 23 (5) : 1066-1076. [pub] 20230320

Language English Country England, Great Britain

Document type Journal Article

Grant support
460129525 Deutsche Forschungsgemeinschaft
VH-NG-1248 Micro 'Big Data' Helmholtz-Gemeinschaft

As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain largely unexplored. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes from 6000 metagenomes from terrestrial and some transition environments with the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. Of the 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades with 83 and 73 bins, respectively. More than 78% of the obtained eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, and anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for a total of 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequently recovered taxon, while Saccharomyces cerevisiae showed the highest completeness, probably because more reference genomes are available for it. Current measures of completeness are based on the presence of single-copy genes. However, mapping of the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, development of tools for dealing with repeat-rich genomes, and improved reference genome databases.
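The contrast the abstract draws between marker-based completeness and chromosome coverage can be illustrated with a minimal sketch. It is not the authors' method: the function names, the half-open coordinate convention, and the toy numbers are all assumptions made for illustration. It shows how a bin can contain every expected single-copy marker gene while its contigs still cover only a small fraction of a reference chromosome.

```python
def marker_completeness(found_markers, expected_markers):
    """Fraction of expected single-copy marker genes present in a bin."""
    expected = set(expected_markers)
    return len(expected & set(found_markers)) / len(expected)


def covered_fraction(intervals, chrom_length):
    """Fraction of a chromosome covered by at least one aligned contig.

    intervals: iterable of (start, end) alignment coordinates, half-open
    (an assumed convention); overlapping alignments are counted once.
    """
    covered = 0
    cur_start, cur_end = None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap begins here: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent alignment: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / chrom_length


# Hypothetical toy data: all four markers found, but sparse alignments.
markers = marker_completeness({"m1", "m2", "m3", "m4"}, ["m1", "m2", "m3", "m4"])
coverage = covered_fraction([(0, 100), (50, 200), (600, 700)], chrom_length=1000)
print(f"marker completeness: {markers:.2f}, chromosome coverage: {coverage:.2f}")
```

In this toy case the marker-based estimate is complete while the alignments cover less than a third of the chromosome, which is the kind of discrepancy the abstract reports.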

000      
00000naa a2200000 a 4500
001      
bmc23010951
003      
CZ-PrNML
005      
20230801132723.0
007      
ta
008      
230718s2023 enk f 000 0|eng||
009      
AR
024    7_
$a 10.1111/1755-0998.13776 $2 doi
035    __
$a (PubMed)36847735
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a enk
100    1_
$a Saraiva, Joao Pedro $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
245    10
$a Recovery of 197 eukaryotic bins reveals major challenges for eukaryote genome reconstruction from terrestrial metagenomes / $c JP. Saraiva, A. Bartholomäus, RB. Toscan, P. Baldrian, U. Nunes da Rocha
520    9_
$a As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain largely unexplored. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes from 6000 metagenomes from terrestrial and some transition environments with the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. Of the 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades with 83 and 73 bins, respectively. More than 78% of the obtained eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, and anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for a total of 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequently recovered taxon, while Saccharomyces cerevisiae showed the highest completeness, probably because more reference genomes are available for it. Current measures of completeness are based on the presence of single-copy genes. However, mapping of the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, development of tools for dealing with repeat-rich genomes, and improved reference genome databases.
650    12
$a metagenom $7 D054892
650    12
$a Eukaryota $x genetika $7 D056890
650    _2
$a ekosystém $7 D017753
650    _2
$a genom mikrobiální $7 D064349
650    _2
$a houby $x genetika $7 D005658
650    _2
$a metagenomika $7 D056186
655    _2
$a časopisecké články $7 D016428
700    1_
$a Bartholomäus, Alexander $u GFZ German Research Centre for Geosciences, Section Geomicrobiology, Potsdam, Germany $1 https://orcid.org/0000000309707304
700    1_
$a Toscan, Rodolfo Brizola $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany
700    1_
$a Baldrian, Petr $u Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic $1 https://orcid.org/0000000289832721 $7 xx0098074
700    1_
$a Nunes da Rocha, Ulisses $u Department of Environmental Microbiology, Helmholtz Centre for Environmental Research-UFZ GmbH, Leipzig, Germany $1 https://orcid.org/0000000169726692
773    0_
$w MED00180393 $t Molecular ecology resources $x 1755-0998 $g Roč. 23, č. 5 (2023), s. 1066-1076
856    41
$u https://pubmed.ncbi.nlm.nih.gov/36847735 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20230718 $b ABA008
991    __
$a 20230801132720 $b ABA008
999    __
$a ok $b bmc $g 1963398 $s 1197216
BAS    __
$a 3
BAS    __
$a PreBMC-MEDLINE
BMC    __
$a 2023 $b 23 $c 5 $d 1066-1076 $e 20230320 $i 1755-0998 $m Molecular ecology resources $n Mol. ecol. resour. $x MED00180393
GRA    __
$a 460129525 $p Deutsche Forschungsgemeinschaft
GRA    __
$a VH-NG-1248 Micro 'Big Data' $p Helmholtz-Gemeinschaft
LZP    __
$a Pubmed-20230718
