Query: pangenome
There is an increasing understanding that variation in gene presence-absence plays an important role in the heritability of agronomic traits; however, there have been relatively few studies on variation in gene presence-absence in crop species. Hexaploid wheat is one of the most important food crops in the world and intensive breeding has reduced the genetic diversity of elite cultivars. Major efforts have produced draft genome assemblies for the cultivar Chinese Spring, but it is unknown how well this represents the genome diversity found in current modern elite cultivars. In this study we build an improved reference for Chinese Spring and explore gene diversity across 18 wheat cultivars. We predict a pangenome size of 140 500 ± 102 genes, a core genome of 81 070 ± 1631 genes and an average of 128 656 genes in each cultivar. Functional annotation of the variable gene set suggests that it is enriched for genes that may be associated with important agronomic traits. In addition to variation in gene presence, more than 36 million intervarietal single nucleotide polymorphisms were identified across the pangenome. This study of the wheat pangenome provides insight into genome diversity in elite wheat as a basis for genomics-based improvement of this important crop. A wheat pangenome, GBrowse, is available at http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/WheatPan/, and data are available to download from http://wheatgenome.info/wheat_genome_databases.php.
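The pangenome, core genome and per-cultivar gene counts reported in the abstract above can be illustrated with a minimal sketch that works from a gene presence-absence table. The cultivar names, gene identifiers and the function name `summarize_pangenome` are invented for illustration. This is not the authors' pipeline, which also estimates pangenome size statistically rather than by simple counting.

```python
from typing import Dict, Set


def summarize_pangenome(presence: Dict[str, Set[str]]) -> Dict[str, object]:
    """Summarize gene presence-absence across cultivars.

    `presence` maps a cultivar name to the set of genes detected in it.
    """
    if not presence:
        raise ValueError("at least one cultivar is required")
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)        # genes seen in at least one cultivar
    core = set.intersection(*gene_sets)  # genes seen in every cultivar
    variable = pan - core                # genes absent from at least one cultivar
    mean_genes = sum(len(g) for g in gene_sets) / len(gene_sets)
    return {
        "pan": pan,
        "core": core,
        "variable": variable,
        "mean_genes_per_cultivar": mean_genes,
    }


# Toy data (hypothetical): three cultivars, five genes.
toy = {
    "cultivar_A": {"g1", "g2", "g3"},
    "cultivar_B": {"g1", "g2", "g4"},
    "cultivar_C": {"g1", "g3", "g4", "g5"},
}
summary = summarize_pangenome(toy)
print(len(summary["pan"]), len(summary["core"]), len(summary["variable"]))
```

In this toy table only `g1` is present in every cultivar, so it alone forms the core set; the remaining four genes are variable.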
The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 recombinant inbred rats to study genetic variation and association mapping. The pangenome includes 0.2 Gb of sequence that is not present in the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex structural variants resolving into multiple haplotypes. Phenome-wide association analysis of validated SNPs uncovered variants associated with glucose/insulin levels and hippocampal gene expression. We propose an interaction between Pirl1l1, chromogranin expression, TNF-α levels, and insulin regulation. This study demonstrates the utility of linked-read pangenomes for comprehensive variant detection and mapping phenotypic diversity in a widely used rat genetic reference panel.
- Publication type
- Journal Article MeSH
Polyploidy, the result of whole-genome duplication (WGD), is a major driver of eukaryote evolution. Yet WGDs are hugely disruptive mutations, and we still lack a clear understanding of their fitness consequences. Here, we study whether WGDs result in greater diversity of genomic structural variants (SVs) and how they influence evolutionary dynamics in a plant genus, Cochlearia (Brassicaceae). By using long-read sequencing and a graph-based pangenome, we find both negative and positive interactions between WGDs and SVs. Masking of recessive mutations due to WGDs leads to a progressive accumulation of deleterious SVs across four ploidal levels (from diploids to octoploids), likely reducing the adaptive potential of polyploid populations. However, we also discover putative benefits arising from SV accumulation, as more ploidy-specific SVs harbor signals of local adaptation in polyploids than in diploids. Together, our results suggest that SVs play diverse and contrasting roles in the evolutionary trajectories of young polyploids.
Genetic variations in protein expression are implicated in a broad spectrum of common diseases and complex traits but remain less explored compared to mRNA and classical phenotypes. This study systematically analyzed brain proteomes in a rat family using tandem mass tag (TMT)-based quantitative mass spectrometry. We quantified 8,119 proteins across two parental strains (SHR/OlaIpcv and BN-Lx/Cub) and 29 HXB/BXH recombinant inbred (RI) strains, identifying 597 proteins with differential expression and 464 proteins linked to cis-acting quantitative trait loci (pQTLs). Proteogenomics identified 95 variant peptides, and sex-specific analyses revealed both shared and distinct cis-pQTLs. We improved the ability to pinpoint candidate genes underlying pQTLs by utilizing the rat pangenome and explored the connections between pQTLs in rats and human disorders. Collectively, this study highlights the value of large proteo-genetic datasets in elucidating protein modulation in the brain and its links to complex central nervous system (CNS) traits.
- Publication type
- Journal Article MeSH
Species belonging to the Mycobacterium kansasii complex (MKC) are frequently isolated from humans and the environment and can cause serious diseases. The most common MKC infections are caused by the species M. kansasii (sensu stricto), leading to tuberculosis-like disease. However, a broad spectrum of virulence, antimicrobial resistance and pathogenicity is observed across these non-tuberculous mycobacteria (NTM). Many genomic aspects of the MKC that relate to these broad phenotypes are not well elucidated. Here, we performed genomic analyses of a collection of 665 MKC strains isolated from environmental, animal and human sources. We inferred the MKC pangenome, mobilome, resistome, virulome and defence systems and show that the MKC species harbour unique and shared genomic signatures. A high frequency of prophages and different types of defence systems was observed. We found that the M. kansasii species splits into four lineages, of which three are poorly represented and found mainly in Brazil, while one lineage is dominant and globally spread. Moreover, we show that four sub-lineages of this most widely distributed M. kansasii lineage emerged during the twentieth century. Further analysis of the M. kansasii genomes revealed almost 300 regions of difference contributing to genomic diversity, as well as fixed mutations that may explain M. kansasii's increased virulence and drug resistance.
- MeSH
- Mycobacterium Infections, Nontuberculous * microbiology MeSH
- Phylogeny * MeSH
- Genome, Bacterial * MeSH
- Genomics * MeSH
- Humans MeSH
- Mycobacterium kansasii * genetics classification isolation & purification MeSH
- Virulence genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
Pseudomonas fluorescens is a well-known food spoiler, able to cause serious economic losses in the food industry due to its ability to produce many extracellular, and often thermostable, compounds. The most outstanding spoilage events involving P. fluorescens were blue discoloration of several foodstuffs, mainly dairy products. The bacteria involved in such high-profile cases have been identified as belonging to a clearly distinct phylogenetic cluster of the P. fluorescens group. Although the blue pigment has recently been investigated in several studies, the biosynthetic pathway leading to pigment formation, as well as the pigment's chemical nature, remain challenging and unsolved points. In the present paper, genomic and transcriptomic data of 4 P. fluorescens strains (2 blue-pigmenting strains and 2 non-pigmenting strains) were analyzed to evaluate the presence and the expression of blue strain-specific genes. In particular, the pangenome analysis showed the presence in the blue-pigmenting strains of two copies of genes involved in the tryptophan biosynthesis pathway (including trpABCDF). The global expression profiling of blue-pigmenting strains versus non-pigmenting strains showed a general up-regulation of genes involved in iron uptake and a down-regulation of genes involved in primary metabolism. Chromogenic reaction of the blue-pigmenting bacterial cells with Kovacs' reagent indicated an indole derivative as the precursor of the blue pigment. Finally, solubility tests and MALDI-TOF mass spectrometry analysis of the isolated pigment suggested that its molecular structure is very probably a hydrophobic indigo analog.
- MeSH
- Pigments, Biological genetics MeSH
- Down-Regulation MeSH
- Energy Metabolism genetics MeSH
- Phenotype MeSH
- Phylogeny MeSH
- Genomics MeSH
- Dairy Products microbiology MeSH
- Oxidoreductases genetics MeSH
- Food Microbiology * MeSH
- Pseudomonas fluorescens genetics metabolism MeSH
- Oxygen Consumption genetics MeSH
- Gene Expression Profiling MeSH
- Transcriptome genetics MeSH
- Tryptophan biosynthesis MeSH
- Up-Regulation MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Two Gram-stain-positive, coagulase-negative staphylococcal strains were isolated from abiotic sources comprising stone fragments and sandy soil on James Ross Island, Antarctica. Here, we describe properties of a novel species of the genus Staphylococcus that has a 16S rRNA gene sequence nearly identical to that of Staphylococcus saprophyticus. However, compared to S. saprophyticus and the next closest relatives, the new species demonstrates considerable phylogenetic distance at the whole-genome level, with an average nucleotide identity of <85% and inferred DNA-DNA hybridization of <30%. It forms a separate branch in the S. saprophyticus phylogenetic clade as confirmed by multilocus sequence analysis of six housekeeping genes, rpoB, hsp60, tuf, dnaJ, gap, and sod. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) and key biochemical characteristics allowed these bacteria to be distinguished from their nearest phylogenetic neighbors. In contrast to S. saprophyticus subsp. saprophyticus, the novel strains are pyrrolidonyl arylamidase and β-glucuronidase positive and β-galactosidase negative, nitrate is reduced, and acid is produced aerobically from d-mannose. Whole-genome sequencing of the 2.69-Mb large chromosome revealed the presence of a number of mobile genetic elements, including the 27-kb pseudo-staphylococcus cassette chromosome mec of strain P5085T (ψSCCmecP5085), harboring the mecC gene, two composite phage-inducible chromosomal islands probably essential to adaptation to extreme environments, and one complete and one defective prophage. Both strains are resistant to penicillin G, ampicillin, ceftazidime, methicillin, cefoxitin, and fosfomycin. We hypothesize that antibiotic resistance might represent an evolutionary advantage against beta-lactam producers, which are common in a polar environment. Based on these results, a novel species of the genus Staphylococcus is described and named Staphylococcus edaphicus sp. nov.
The type strain is P5085T (= CCM 8730T = DSM 104441T). IMPORTANCE: The description of Staphylococcus edaphicus sp. nov. enables the comparison of multidrug-resistant staphylococci from human and veterinary sources evolved in the globalized world to their geographically distant relative from the extreme Antarctic environment. Although this new species was not exposed to the pressure of antibiotic treatment in human or veterinary practice, mobile genetic elements carrying antimicrobial resistance genes were found in the genome. The genomic characteristics presented here elucidate the evolutionary relationships in the Staphylococcus genus with a special focus on antimicrobial resistance, pathogenicity, and survival traits. Genes encoded on mobile genetic elements were arranged in unique combinations but retained conserved locations for the integration of mobile genetic elements. These findings point to enormous plasticity of the staphylococcal pangenome, shaped by horizontal gene transfer. Thus, S. edaphicus can act not only as a reservoir of antibiotic resistance in a natural environment but also as a mediator for the spread and evolution of resistance genes.
- MeSH
- Genes, Bacterial physiology MeSH
- Adaptation, Biological genetics MeSH
- Extreme Cold Weather * MeSH
- Extreme Environments * MeSH
- Genomic Islands physiology MeSH
- Staphylococcus classification genetics physiology MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Antarctic Regions MeSH
Ever since the introduction of high-throughput sequencing following the human genome project, assembling short reads into a reference of sufficient quality has posed a significant problem, as a large portion of the human genome (an estimated 50-69%) is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters that determine whether a read is multi-mapping are the read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from "telomere to telomere". Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding in the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, which impair both the accuracy and the speed of aligners and assemblers. Here, I review the proposed and implemented solutions to the repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect the shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
- MeSH
- Centromere chemistry MeSH
- Genome Size MeSH
- Genome, Human * MeSH
- Humans MeSH
- Chromosome Mapping methods MeSH
- DNA Methylation MeSH
- Microsatellite Repeats * MeSH
- Sex Chromosomes chemistry MeSH
- Telomere chemistry MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
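The review abstract above names read length and genome complexity as the two parameters that decide whether a read is multi-mapping. The following minimal sketch illustrates that dependence on a toy sequence. The function name `multimapping_fraction` and the toy genome are invented for illustration, and the sketch makes strong simplifying assumptions: error-free reads, exact matching, and the forward strand only. Real aligners handle mismatches, both strands and mapping-quality scores.

```python
from collections import Counter


def multimapping_fraction(genome: str, read_length: int) -> float:
    """Fraction of all error-free reads of a given length that occur
    at more than one position in the genome (forward strand only)."""
    if read_length <= 0 or read_length > len(genome):
        raise ValueError("read_length must be between 1 and len(genome)")
    # Enumerate every possible read start position.
    reads = [genome[i:i + read_length]
             for i in range(len(genome) - read_length + 1)]
    counts = Counter(reads)
    # A read is multi-mapping if its sequence is not unique in the genome.
    ambiguous = sum(1 for r in reads if counts[r] > 1)
    return ambiguous / len(reads)


# Toy genome (hypothetical): a 7-bp unit repeated twice, then unique sequence.
toy_genome = "GATTACA" * 2 + "CCGT"
for k in (3, 7, 8):
    print(k, multimapping_fraction(toy_genome, k))
```

In this toy case, reads longer than the 7-bp repeat unit always extend into distinguishing flanking sequence, so none of them is multi-mapping, whereas most 3-bp reads are.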
BACKGROUND: The prognostic impact of segmental chromosome alterations (SCAs) in children older than 1 year, diagnosed with localised unresectable neuroblastoma (NB) without MYCN amplification and enrolled in the European Unresectable Neuroblastoma (EUNB) protocol, is still to be clarified, while for other groups of patients the presence of SCAs is associated with poor prognosis. METHODS: To understand the role of SCAs, we performed multilocus/pangenomic analysis of 98 tumour samples from patients enrolled in the EUNB protocol. RESULTS: Age at diagnosis was categorised into two groups using 18 months as the age cutoff. A significant difference in the presence of SCAs was seen between tumours of patients aged 12 to 18 months and those of patients over 18 months of age at diagnosis (P=0.04). A significant correlation (P=0.03) was observed between number of SCAs per tumour and age. Event-free (EFS) and overall survival (OS) were calculated in both age groups, according to both the presence and number of SCAs. In older patients, a poorer survival was associated with the presence of SCAs (EFS=46% vs 75%, P=0.023; OS=66.8% vs 100%, P=0.003). Moreover, OS of older patients inversely correlated with number of SCAs (P=0.002). Finally, SCAs provided additional prognostic information beyond histoprognosis, as their presence was associated with poorer OS in patients over 18 months with unfavourable International Neuroblastoma Pathology Classification (INPC) histopathology (P=0.018). CONCLUSIONS: The presence of SCAs is a negative prognostic marker that impairs outcome of patients over the age of 18 months with localised unresectable NB without MYCN amplification, especially when more than one SCA is present. Moreover, in older patients with unfavourable INPC tumour histoprognosis, the presence of SCAs significantly affects OS.
- MeSH
- Gene Amplification MeSH
- Chromosome Aberrations MeSH
- Nuclear Proteins genetics MeSH
- Kaplan-Meier Estimate MeSH
- Infant MeSH
- Humans MeSH
- Peripheral Nervous System Neoplasms diagnosis genetics mortality MeSH
- Neuroblastoma diagnosis genetics mortality MeSH
- Oncogene Proteins genetics MeSH
- Disease-Free Survival MeSH
- Prognosis MeSH
- Comparative Genomic Hybridization MeSH
- Check Tag
- Infant MeSH
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
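The record above is indexed under the Kaplan-Meier estimate, the standard way of deriving survival percentages such as the reported EFS and OS from follow-up data with censoring. As a minimal sketch of that estimator, the example below computes the survival curve for a handful of invented observations. The function name `kaplan_meier` and all follow-up times are hypothetical and bear no relation to the study's patient data.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    # Step through the distinct times at which an event occurred.
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        observed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


# Toy follow-up data (hypothetical): times in months, 0 marks censoring.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored subjects still count as at risk up to their censoring time, which is why the estimate differs from a simple proportion of surviving patients.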