Spiders are a hyperdiverse taxon and among the most abundant predators in nearly all terrestrial habitats. Their success is often attributed to key developments in their evolution, such as silk and venom production, and major apomorphies such as a whole-genome duplication. Resolving deep relationships within the spider tree of life has been historically challenging, making it difficult to measure the relative importance of these novelties for spider evolution. Whole-genome data offer an essential resource both for these efforts and for functional genomic studies. Here, we present de novo assemblies for three spider species: Ryuthela nishihirai (Liphistiidae), a representative of the ancient Mesothelae, the suborder that is sister to all other extant spiders; Uloborus plumipes (Uloboridae), a cribellate orbweaver whose phylogenetic placement is especially challenging; and Cheiracanthium punctorium (Cheiracanthiidae), which represents only the second family to be sequenced in the hyperdiverse Dionycha clade. These genomes fill critical gaps in the spider tree of life. Using these novel genomes along with 25 previously published ones, we examine the evolutionary history of spidroin gene and structural Hox cluster diversity. Our assemblies provide critical genomic resources to facilitate deeper investigations into spider evolution. The near chromosome-level genome of the 'living fossil' R. nishihirai represents an especially important step forward, offering new insights into the origins of spider traits.
- Keywords
- Hi-C, Mesothelae, assembly, chromosome, karyotype, spider silk
- MeSH
- Phylogeny * MeSH
- Genome genetics MeSH
- Silk genetics MeSH
- Animals, Poisonous MeSH
- Spiders * genetics classification MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Silk MeSH
Molecular techniques like metabarcoding, while promising for exploring diversity of communities, are often impeded by the lack of reference DNA sequences available for taxonomic annotation. Our study explores the benefits of combining targeted DNA barcoding and morphological taxonomy to improve metabarcoding efficiency, using beach meiofauna as a case study. Beaches are globally important ecosystems and are inhabited by meiofauna, microscopic animals living in the interstitial space between the sand grains, which play a key role in coastal biodiversity and ecosystem dynamics. However, research on meiofauna faces challenges due to limited taxonomic expertise and sparse sampling. We generated 775 new cytochrome c oxidase I DNA barcodes from meiofauna specimens collected along the Netherlands' west coast and combined them with the NCBI GenBank database. We analysed alpha and beta diversity in 561 metabarcoding samples from 24 North Sea beaches, a region extensively studied for meiofauna, using both the enriched reference database and the NCBI database without the additional reference barcodes. Our results show a 2.5-fold increase in sequence annotation and a doubling of species-level Operational Taxonomic Unit (OTU) identifications when annotating the metabarcoding data with the enhanced database. Additionally, our analyses revealed a bell-shaped curve of OTU richness across the intertidal zone, aligning more closely with morphological analysis patterns, and more defined community dissimilarity patterns between supralittoral and intertidal sites. Our research highlights the importance of expanding molecular reference databases and combining morphological taxonomy with molecular techniques for biodiversity assessments, ultimately improving our understanding of coastal ecosystems.
- Keywords
- DNA barcoding, Molecular reference database, community ecology, invertebrates
- MeSH
- Invertebrates genetics classification MeSH
- Biodiversity MeSH
- Ecosystem MeSH
- Bathing Beaches MeSH
- Metagenomics methods MeSH
- Electron Transport Complex IV * genetics MeSH
- DNA Barcoding, Taxonomic * methods MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Netherlands MeSH
- North Sea MeSH
- Substance names
- Electron Transport Complex IV * MeSH
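The beach meiofauna abstract above compares alpha and beta diversity between reference databases. As a rough illustration of what those two quantities measure, the sketch below computes OTU richness (one common alpha-diversity measure) and Jaccard dissimilarity on presence/absence (one common beta-diversity measure). The abstract does not say which indices the authors used, so both choices are assumptions, and the sample names and read counts are hypothetical toy values.

```python
def otu_richness(sample):
    """Alpha diversity as richness: number of OTUs with a non-zero read count."""
    return sum(1 for count in sample.values() if count > 0)


def jaccard_dissimilarity(sample_a, sample_b):
    """Beta diversity as Jaccard dissimilarity on OTU presence/absence."""
    present_a = {otu for otu, count in sample_a.items() if count > 0}
    present_b = {otu for otu, count in sample_b.items() if count > 0}
    union = present_a | present_b
    if not union:
        # Two empty samples are treated as identical.
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical read counts per OTU (not data from the study).
supralittoral = {"OTU1": 12, "OTU2": 0, "OTU3": 5}
intertidal = {"OTU1": 3, "OTU2": 7, "OTU3": 0, "OTU4": 9}

print(otu_richness(supralittoral))                        # 2
print(otu_richness(intertidal))                           # 3
print(jaccard_dissimilarity(supralittoral, intertidal))   # 0.75
```

With a richer reference database, more OTUs receive species-level names, but richness and dissimilarity are computed in the same way on the annotated table.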
A major aim of evolutionary biology is to understand why patterns of genomic diversity vary within taxa and space. Large-scale genomic studies of widespread species are useful for studying how environment and demography shape patterns of genomic divergence. Here, we describe one of the most geographically comprehensive surveys of genomic variation in a wild vertebrate to date: the great tit (Parus major) HapMap project. We screened ca. 500,000 SNP markers across 647 individuals from 29 populations, spanning ~30 degrees of latitude and 40 degrees of longitude, almost the entire geographical range of the European subspecies. Genome-wide variation was consistent with a recent colonisation across Europe from a South-East European refugium, with bottlenecks and reduced genetic diversity in island populations. Differentiation across the genome was highly heterogeneous, with clear 'islands of differentiation', even among populations with very low levels of genome-wide differentiation. Low local recombination rates were a strong predictor of high local genomic differentiation (FST), especially in island and peripheral mainland populations, suggesting that the interplay between genetic drift and recombination causes highly heterogeneous differentiation landscapes. We also detected genomic outlier regions that were confined to one or more peripheral great tit populations, probably as a result of recent directional selection at the species' range edges. Haplotype-based measures of selection were related to recombination rate, albeit less strongly, and highlighted population-specific sweeps that likely resulted from positive selection. Our study highlights how comprehensive screens of genomic variation in wild organisms can provide unique insights into spatio-temporal evolutionary dynamics.
- Keywords
- adaptation, birds, ecological genetics, genomics/proteomics, molecular evolution, population genetics – empirical
- MeSH
- Genetic Variation * MeSH
- Haplotypes genetics MeSH
- Polymorphism, Single Nucleotide * MeSH
- Passeriformes genetics classification MeSH
- Genetics, Population methods MeSH
- Recombination, Genetic MeSH
- Selection, Genetic MeSH
- Songbirds * genetics classification MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Europe MeSH
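The great tit abstract above uses FST as its measure of local genomic differentiation. The abstract does not state which FST estimator was applied, so the following is only a minimal sketch of the textbook heterozygosity-based definition, FST = (HT - HS) / HT, for one biallelic SNP in two equally weighted populations. The function name and the allele frequencies are hypothetical.

```python
def fst_two_populations(p1, p2):
    """FST for one biallelic SNP from the allele frequencies in two populations.

    Assumes equal population weights. H_S is the mean expected heterozygosity
    within populations; H_T is the expected heterozygosity of the pooled
    population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The SNP is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies (not data from the study).
print(fst_two_populations(0.5, 0.5))  # identical frequencies give 0.0
print(fst_two_populations(0.0, 1.0))  # fixed differences give 1.0
print(fst_two_populations(0.2, 0.8))  # intermediate case, about 0.36
```

Averaging such per-SNP values in windows along a chromosome gives the kind of differentiation landscape that can then be compared with local recombination rate.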
Environmental DNA (eDNA) metabarcoding has gained growing attention as a strategy for monitoring biodiversity in ecology. However, taxa identifications produced through metabarcoding require sophisticated processing of high-throughput sequencing data from taxonomically informative DNA barcodes. Various sets of universal and taxon-specific primers have been developed, extending the usability of metabarcoding across archaea, bacteria and eukaryotes. Accordingly, a multitude of metabarcoding data analysis tools and pipelines have also been developed. Often, several developed workflows are designed to process the same amplicon sequencing data, making it somewhat puzzling to choose one among the plethora of existing pipelines. However, each pipeline has its own specific philosophy, strengths and limitations, which should be considered depending on the aims of any specific study, as well as the bioinformatics expertise of the user. In this review, we outline the input data requirements, supported operating systems and particular attributes of thirty-two amplicon processing pipelines with the goal of helping users to select a pipeline for their metabarcoding projects.
- Keywords
- amplicon data analysis, bioinformatics, environmental DNA, metabarcoding, pipeline, review
- MeSH
- Data Analysis MeSH
- Archaea genetics classification MeSH
- Bacteria genetics classification MeSH
- DNA, Environmental genetics MeSH
- Eukaryota genetics classification MeSH
- Metagenomics methods MeSH
- Software * MeSH
- DNA Barcoding, Taxonomic * methods MeSH
- Computational Biology * methods MeSH
- High-Throughput Nucleotide Sequencing methods MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
- Substance names
- DNA, Environmental MeSH
The outcome of species delimitation depends on many factors, including conceptual framework, study design, data availability, methodology employed and subjective decision making. Obtaining sufficient taxon sampling in endangered or rare taxa might be difficult, particularly when non-lethal tissue collection cannot be utilized. The need to avoid overexploitation of the natural populations may thus limit the methodological framework available for downstream data analyses and bias the results. We test species boundaries in the rare North American trapdoor spider genus Cyclocosmia Ausserer (1871), which inhabits the Southern Coastal Plain biodiversity hotspot, using genomic data and two multispecies coalescent model methods. We evaluate the performance of each methodology within a limited sampling framework. To mitigate the risk of species over-splitting, common in taxa with highly structured populations, we subsequently implement a species validation step via genealogical diversification index (gdi), which accounts for both genetic isolation and gene flow. We delimited eight geographically restricted lineages within sampled North American Cyclocosmia, suggesting that major river drainages in the region are likely barriers to dispersal. Our results suggest that utilizing BPP in the species discovery step might be a good option for datasets comprising hundreds of loci but few individuals, which may be a common scenario for rare taxa. However, we also show that such results should be validated via gdi in order to avoid over-splitting.
- Keywords
- gdi, BPP, Bayes factor delimitation, Southern Coastal Plain biodiversity hotspot, genomic data
- MeSH
- Bayes Theorem MeSH
- Biodiversity MeSH
- Species Specificity MeSH
- Phylogeny MeSH
- Genomics MeSH
- Humans MeSH
- Spiders * genetics MeSH
- Gene Flow MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
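The Cyclocosmia abstract above validates candidate species with gdi but does not give the formula. In the multispecies coalescent literature, gdi is usually written as gdi = 1 - exp(-2τ/θ), where τ is the divergence time and θ the population size parameter of the lineage being tested. The sketch below assumes that form. The 0.2 and 0.7 cut-offs are rule-of-thumb thresholds commonly quoted for gdi, not values taken from the abstract, and the parameter values are hypothetical.

```python
import math


def gdi(tau, theta):
    """Index of lineage divergence, assuming gdi = 1 - exp(-2 * tau / theta).

    tau:   divergence time in expected substitutions per site
    theta: population size parameter of the focal lineage
    """
    if tau < 0 or theta <= 0:
        raise ValueError("tau must be >= 0 and theta must be > 0")
    return 1.0 - math.exp(-2.0 * tau / theta)


def interpret_gdi(value, low=0.2, high=0.7):
    """Heuristic reading of gdi; the thresholds are assumed rules of thumb."""
    if value < low:
        return "likely a single species"
    if value > high:
        return "likely distinct species"
    return "ambiguous"


# Hypothetical posterior means, for example as estimated by a program such as BPP.
value = gdi(tau=0.001, theta=0.002)
print(round(value, 3), interpret_gdi(value))  # 0.632 ambiguous
```

Because gdi rises only when divergence is large relative to population size, recently split or highly structured populations tend to fall below the upper threshold, which is why it can temper over-splitting.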
Several computational frameworks and workflows exist for recovering prokaryotic, eukaryotic and viral genomes from metagenomes. Yet, it is difficult for scientists with little bioinformatics experience to evaluate quality, annotate genes, dereplicate, assign taxonomy and calculate relative abundance and coverage of genomes belonging to different domains. MuDoGeR is a user-friendly tool tailored for those familiar with the Unix command-line environment that makes it easy to recover genomes of prokaryotes, eukaryotes and viruses from metagenomes, either alone or in combination. We tested MuDoGeR using 24 individual-isolated genomes and 574 metagenomes, demonstrating its applicability both to a few samples and to high-throughput use. While MuDoGeR can recover eukaryotic viral sequences, its characterization is predominantly skewed towards bacterial and archaeal viruses, reflecting the field's current state. However, acting as a dynamic wrapper, MuDoGeR is designed to constantly incorporate updates and integrate new tools, ensuring its ongoing relevance in the rapidly evolving field. MuDoGeR is open-source software available at https://github.com/mdsufz/MuDoGeR. MuDoGeR is also available as a Singularity container.
- Keywords
- genome reconstruction, metagenome-assembled genomes, metagenomics, multi-domain, uncultivated viral genomes
- MeSH
- Bacteria genetics MeSH
- Phylogeny MeSH
- Metagenome * MeSH
- Metagenomics MeSH
- Software MeSH
- Viruses * genetics MeSH
- Publication type
- Journal Article MeSH
Metagenomics provides a tool to assess the functional potential of environmental and host-associated microbiomes based on the analysis of environmental DNA: assembly, gene prediction and annotation. While gene prediction is straightforward for most bacterial and archaeal taxa, it has limited applicability in the majority of eukaryotic organisms, including fungi, that contain introns in gene coding sequences. As a consequence, eukaryotic genes are underrepresented in metagenomics datasets and our understanding of the contribution of fungi and other eukaryotes to microbiome functioning is limited. Here, we developed a machine intelligence-based algorithm that predicts fungal introns in environmental DNA with reasonable precision and used it to improve the annotation of environmental metagenomes. Intron removal increased the number of predicted genes by up to 9.1% and improved the annotation of several others. The proportion of newly predicted genes increased with the share of eukaryotic genes in the metagenome and, within fungal taxa, with the number of introns per gene. Our approach provides a tool named SVMmycointron for improved metagenome annotation, especially of microbiomes with a high proportion of eukaryotes. The scripts described in the paper are made publicly available and can be readily utilized by microbiome researchers analysing metagenomics data.
- Keywords
- artificial intelligence, eukaryote, fungi, gene prediction, intron, metagenomics
- Publication type
- Journal Article MeSH
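The fungal intron abstract above reports that removing predicted introns before gene prediction recovers additional genes. The sketch below illustrates only the removal step: given a contig and a list of predicted intron coordinates, it cuts those intervals out so that the exons become contiguous. It does not reproduce SVMmycointron or its interface. The function name, the 0-based half-open coordinate convention and the toy sequence are all assumptions made for illustration.

```python
def remove_introns(sequence, introns):
    """Return the sequence with the given intron intervals cut out.

    introns: iterable of (start, end) pairs, 0-based and half-open, assumed to
    come from some upstream intron predictor.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or start >= end or end > len(sequence):
            raise ValueError("introns must be non-overlapping and inside the sequence")
        pieces.append(sequence[cursor:start])  # keep the exon before this intron
        cursor = end                           # skip over the intron itself
    pieces.append(sequence[cursor:])           # keep the trailing exon
    return "".join(pieces)


# Hypothetical contig: exon + intron (GT...AG) + exon.
contig = "ATGGCC" + "GTAAGTTTTAG" + "AAATGA"
spliced = remove_introns(contig, [(6, 17)])
print(spliced)  # ATGGCCAAATGA
```

Once the intron is removed, the two exons form a single uninterrupted open reading frame, which is the form that prokaryote-oriented gene predictors expect.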
As whole-genome sequencing has become pervasive, some have suggested that reduced genomic representation approaches, for example, sequence capture, are becoming obsolete. In the present study, we argue that these techniques still provide excellent tools in terms of price and quality of data as well as in their ability to provide markers with specific features, as required, for example, in phylogenomics. A potential drawback of the wide-scale application of reduced representation approaches could be their drop in efficiency with increasing phylogenetic distance from the reference species. While some studies have focused on the degree and performance of reduced representation techniques in such situations, to our knowledge, none of them evaluated their applicability to inter-specific hybrids and polyploids. This highlights a significant gap in current knowledge since there is increasing evidence for the frequent occurrence of natural hybrids and polyploids, as well as for the major importance of both phenomena in evolution. The main aim of the present study was to carry out a thorough validation of SEQcap applicability to (1) a set of non-model taxa with a wide range of phylogenetic relatedness and (2) inter-specific hybrids of various ploidies and genomic compositions. Considering the latter point, we especially focused on mechanisms causing allelic bias and consequent allelic dropout, as these could have confounding effects with respect to the evolutionary genomic dynamics of hybrids, especially in asexuals, which virtually reproduce as a frozen F1 generation.
- Keywords
- Cobitis, allelic drop-out, allopolyploids, hybrids, phylogenomics, sequence capture
- MeSH
- Phylogeny MeSH
- Genome * MeSH
- Genomics MeSH
- Humans MeSH
- Ploidies MeSH
- Polyploidy * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain untapped. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes from 6000 metagenomes from terrestrial and some transition environments using the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. Of the 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades with 83 and 73 bins, respectively. More than 78% of the obtained eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, and anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for a total of 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequent taxon found, while Saccharomyces cerevisiae presented the highest completeness, probably because more reference genomes are available. Current measures of completeness are based on the presence of single-copy genes. However, mapping of the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, development of tools for dealing with repeat-rich genomes, and improved reference genome databases.
- Keywords
- Hypocreales, Mamiellales, Saccharomycetales, eukaryotes, genome-resolved metagenomics
- MeSH
- Ecosystem MeSH
- Eukaryota * genetics MeSH
- Genome, Microbial MeSH
- Fungi genetics MeSH
- Metagenome * MeSH
- Metagenomics MeSH
- Publication type
- Journal Article MeSH
The analysis of target enrichment data in phylogenetics lacks optimization toward using paralogues for phylogenetic reconstruction. We developed a novel approach of detecting paralogues and utilizing them for phylogenetic tree inference, by retrieving both ortho- and paralogous copies and creating orthologous alignments, from which the gene trees are built. We implemented this approach in ParalogWizard and demonstrate its performance in plant groups that underwent a whole genome duplication relatively recently: the subtribe Malinae (family Rosaceae), using Angiosperms353 as well as Malinae481 probes, the genus Oritrophium (family Asteraceae), using Compositae1061 probes, and the genus Amomum (family Zingiberaceae), using Zingiberaceae1180 probes. Discriminating between orthologues and paralogues reduced gene tree discordance and increased the species tree support in the case of the Malinae, but not for Oritrophium and Amomum. This may relate to the difference in the proportion of paralogous loci between the data sets, which was highest for the Malinae. Overall, retrieving paralogues for phylogenetic reconstruction following ParalogWizard has the potential to increase the species tree support and reduce gene tree discordance in target enrichment data, particularly if the proportion of paralogous loci is high.
- Keywords
- angiosperms, bioinformatics/phyloinformatics, paralogy, species tree
- MeSH
- Phylogeny MeSH
- Genome * MeSH
- Publication type
- Journal Article MeSH