integrative clustering
Markov random walks (MRW) have proven to be an effective way to understand spectral clustering and embedding. However, because it lacks a global structural measure, a conventional MRW (e.g., the Gaussian kernel MRW) cannot handle data points drawn from a mixture of subspaces. In this paper, we introduce a regularized MRW learning model that uses a low-rank penalty to constrain the global subspace structure, for subspace clustering and estimation. In our framework, both the local pairwise similarity and the global subspace structure can be learnt from the transition probabilities of the MRW. We prove that, under suitable conditions, our proposed local/global criteria can exactly capture the multiple subspace structure and learn a low-dimensional embedding for the data that gives the true segmentation of subspaces. To improve robustness in real situations, we also propose an extension of the MRW learning model that integrates transition matrix learning and error correction in a unified framework. Experimental results on both synthetic data and real applications demonstrate that our proposed MRW learning model and its robust extension outperform the state-of-the-art subspace clustering methods.
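For orientation, the conventional Gaussian kernel MRW that the abstract names as its baseline can be sketched in a few lines. This is only the standard textbook construction (affinities from a Gaussian kernel, row-normalised into transition probabilities), not the low-rank regularized model the paper proposes; the function names `gaussian_mrw` and `walk`, the bandwidth `sigma`, and the toy points are all illustrative assumptions.

```python
import math


def gaussian_mrw(points, sigma=1.0):
    """Row-stochastic transition matrix P = D^-1 W from Gaussian affinities.

    Illustrative baseline only; `sigma` is an assumed kernel bandwidth.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-transitions in this sketch
                d2 = math.dist(points[i], points[j]) ** 2
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    P = []
    for row in W:
        degree = sum(row)  # node degree d_i
        P.append([w / degree for w in row])
    return P


def walk(P, t):
    """t-step transition probabilities, i.e. the matrix power P^t."""
    n = len(P)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = [
            [sum(result[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return result


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
P = gaussian_mrw(points, sigma=0.5)
P3 = walk(P, 3)

# Probability that a 3-step walk started at point 0 is still in its own group.
stay = sum(P3[0][:3])
```

With purely local affinities like these, the walk almost never leaves a well-separated group, which is why the transition matrix is informative for clustering. The same locality is also why the abstract argues that this baseline alone does not capture the global structure of a mixture of subspaces.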
- Keywords
- Dimensionality reduction, Markov random walks, Spectral clustering, Subspace clustering and estimation, Transition probability learning
- MeSH
- Algorithms MeSH
- Emotions physiology MeSH
- Humans MeSH
- Limbic System physiology MeSH
- Models, Neurological MeSH
- Neural Networks, Computer * MeSH
- Pattern Recognition, Automated methods MeSH
- Cluster Analysis MeSH
- Models, Theoretical MeSH
- Learning MeSH
- Artificial Intelligence MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
Two new genera (Streptosarcina and Streptofilum) and three new species (Streptosarcina arenaria, S. costaricana and Streptofilum capillatum) of streptophyte algae were detected in cultures isolated from terrestrial habitats of Europe and Central America and described using an integrative approach. Additionally, a strain isolated from soil in North America was identified as Hormidiella parvula and proposed as an epitype of this species. The molecular phylogeny based on 18S rRNA and rbcL genes, secondary structure of ITS-2, as well as the morphology of vegetative and reproductive stages, cell ultrastructure, ecology and distribution of the investigated strains were assessed. The new genus Streptosarcina forms a sister lineage to the genus Hormidiella (Klebsormidiophyceae). Streptosarcina is characterized by packet-like (sarcinoid) and filamentous thalli with true branching and a cell organization typical for Klebsormidiophyceae. Streptofilum forms a separate lineage within Streptophyta. This genus represents an easily disintegrating filamentous alga which exhibits a cell coverage of unique structure: layers of submicroscopic scales of piliform shape that cover the plasmalemma and exfoliate inside the mucilage envelope surrounding the cells. The implications of the discovery of the new taxa for understanding evolutionary tendencies in the Streptophyta, a group of great evolutionary interest, are discussed.
- Keywords
- Hormidiella, Streptofilum, Streptophyta, Streptosarcina, integrative approach, ultrastructure
- MeSH
- DNA, Plant chemistry genetics MeSH
- Ecosystem * MeSH
- Phylogeny * MeSH
- Nucleic Acid Conformation MeSH
- DNA, Ribosomal Spacer chemistry genetics MeSH
- Microscopy MeSH
- Soil Microbiology MeSH
- DNA, Ribosomal chemistry genetics MeSH
- Ribulose-Bisphosphate Carboxylase genetics MeSH
- RNA, Ribosomal, 18S genetics MeSH
- Sequence Analysis, DNA MeSH
- Cluster Analysis MeSH
- Streptophyta classification genetics ultrastructure MeSH
- Microscopy, Electron, Transmission MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Europe MeSH
- North America MeSH
- Central America MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Ribosomal Spacer MeSH
- RbcL protein, plastid MeSH
- DNA, Ribosomal MeSH
- Ribulose-Bisphosphate Carboxylase MeSH
- RNA, Ribosomal, 18S MeSH
By merging advanced dimensionality reduction (DR) and clustering algorithm (CA) techniques, our study advances the sampling procedure for predicting NMR chemical shifts (CS) in intrinsically disordered proteins (IDPs), making a significant leap forward in the field of protein analysis/modeling. We enhance NMR CS sampling by generating clustered ensembles that accurately reflect the different properties and phenomena encapsulated by the IDP trajectories. This investigation critically assessed different rapid CS predictors, both neural network (e.g., Sparta+ and ShiftX2) and database-driven (ProCS-15), and highlighted the need for more advanced quantum calculations and the subsequent need for more tractable-sized conformational ensembles. Although neural network CS predictors outperformed ProCS-15 for all atoms, all tools showed poor agreement with HN CSs, and the neural network CS predictors were unable to capture the influence of phosphorylated residues, highly relevant for IDPs. This study also addressed the limitations of using direct clustering with collective variables, such as the widespread implementation of the GROMOS algorithm. Clustered ensembles (CEs) produced by this algorithm showed poor performance with chemical shifts compared to sequential ensembles (SEs) of similar size. Instead, we implement a multiscale DR and CA approach and explore the challenges and limitations of applying these algorithms to obtain more robust and tractable CEs. The novel feature of this investigation is the use of solvent-accessible surface area (SASA) as one of the fingerprints for DR alongside previously investigated α carbon distance/angles or ϕ/ψ dihedral angles. The ensembles produced with SASA tSNE DR produced CEs better aligned with the experimental CS of between 0.17 and 0.36 r² (0.18-0.26 ppm) depending on the system and replicate. Furthermore, this technique produced CEs with better agreement than traditional SEs in 85.7% of all ensemble sizes.
This study investigates the quality of ensembles produced based on different input features, comparing latent spaces produced by linear vs nonlinear DR techniques and a novel integrated silhouette score scanning protocol for tSNE DR.
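The abstract does not spell out its integrated silhouette score scanning protocol, so the following is only a generic sketch of the underlying idea: cluster an already-embedded point set for several candidate cluster counts and keep the count with the highest mean silhouette score. The 2-D points stand in for a low-dimensional embedding such as tSNE output and are hypothetical toy data; k-means with farthest-first initialisation is an assumed stand-in for whichever clustering algorithm the study used, and the names `kmeans` and `silhouette` are illustrative.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means; farthest-first initialisation keeps it deterministic."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave the centre of an empty cluster unchanged
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette score; needs at least two non-empty clusters."""
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        own = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not own:
            continue  # singleton clusters contribute 0 by convention
        a = sum(own) / len(own)  # mean distance within the point's cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical 2-D embedding with three compact groups.
embedded = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (4.0, 4.0), (4.2, 4.1), (4.1, 3.8),
            (8.0, 0.0), (8.2, 0.2), (7.9, 0.1)]

# Scan candidate cluster counts and keep the best-scoring one.
scores = {k: silhouette(embedded, kmeans(embedded, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

In a real workflow the same scan would typically also cover the embedding's own hyperparameters, since the silhouette score depends on the latent space as much as on the number of clusters.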
Large-scale next-generation sequencing (NGS) studies revealed extensive genetic heterogeneity, driving a highly variable clinical course of chronic lymphocytic leukaemia (CLL). The evolution of subclonal populations contributes to diverse therapy responses and disease refractoriness. Besides, the dynamics and impact of subpopulations before therapy initiation are not well understood. We examined changes in genomic defects in serial samples of 100 untreated CLL patients, spanning from indolent to aggressive disease. A comprehensive NGS panel LYNX, which provides targeted mutational analysis and genome-wide chromosomal defect assessment, was employed. We observed dynamic changes in the composition and/or proportion of genomic aberrations in most patients (62%). Clonal evolution of gene variants prevailed over the chromosomal alterations. Unsupervised clustering based on aberration dynamics revealed four groups of patients with different clinical behaviour. An adverse cluster was associated with fast progression and early therapy need, characterized by the expansion of TP53 defects, ATM mutations, and 18p- alongside dynamic SF3B1 mutations. Our results show that clonal evolution is active even without therapy pressure and that repeated genetic testing can be clinically relevant during long-term patient monitoring. Moreover, integrative NGS testing contributes to the consolidated evaluation of results and accurate assessment of individual patient prognosis.
- Keywords
- chronic lymphocytic leukaemia, clonal evolution, genomic aberration, integrative NGS testing, prognosis
- MeSH
- Leukemia, Lymphocytic, Chronic, B-Cell * genetics drug therapy MeSH
- Genomics MeSH
- Humans MeSH
- Mutation MeSH
- Prognosis MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: The speckled-pelage brush-furred rats (Lophuromys flavopunctatus group) have been difficult to define given conflicting genetic, morphological, and distributional records that combine to obscure meaningful accounts of its taxonomic diversity and evolution. In this study, we inferred the systematics, phylogeography, and evolutionary history of the L. flavopunctatus group using maximum likelihood and Bayesian phylogenetic inference, divergence times, historical biogeographic reconstruction, and morphometric discriminant tests. We compiled comprehensive datasets of three loci (two mitochondrial [mtDNA] and one nuclear) and two morphometric datasets (linear and geometric) from across the known range of the genus Lophuromys. RESULTS: The mtDNA phylogeny supported the division of the genus Lophuromys into three primary groups with nearly equidistant pairwise differentiation: one group corresponding to the subgenus Kivumys (Kivumys group) and two groups corresponding to the subgenus Lophuromys (L. sikapusi group and L. flavopunctatus group). The L. flavopunctatus group comprised the speckled-pelage brush-furred Lophuromys endemic to Ethiopia (Ethiopian L. flavopunctatus members [ETHFLAVO]) and the non-Ethiopian ones (non-Ethiopian L. flavopunctatus members [NONETHFLAVO]) in deeply nested relationships. There were distinctly geographically structured mtDNA clades among the NONETHFLAVO, which were incongruous with the nuclear tree where several clades were unresolved. The morphometric datasets did not systematically assign samples to meaningful taxonomic units or agree with the mtDNA clades. The divergence dating and ancestral range reconstructions showed the NONETHFLAVO colonized the current ranges over two independent dispersal events out of Ethiopia in the early Pleistocene. CONCLUSION: The phylogenetic associations and divergence times of the L. flavopunctatus group support the hypothesis that paleoclimatic impacts and ecosystem refugia during the Pleistocene impacted the evolutionary radiation of these rodents. The overlap in craniodental variation between distinct mtDNA clades among the NONETHFLAVO suggests unraveling underlying ecomorphological drivers is key to reconciling taxonomically informative morphological characters. The genus Lophuromys requires a taxonomic reassessment based on extensive genomic evidence to elucidate the patterns and impacts of genetic isolation at clade contact zones.
- Keywords
- Biogeography, East Africa, Integrative systematics, Kivumys, Lophuromys, Lophuromys flavopunctatus group
- MeSH
- Bayes Theorem MeSH
- Ecosystem * MeSH
- Phylogeny MeSH
- Rats MeSH
- DNA, Mitochondrial * genetics MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Ethiopia MeSH
- Substance names
- DNA, Mitochondrial * MeSH
Our understanding of fungal diversity is far from complete. Species descriptions generally focus on morphological features, but this approach may underestimate true diversity. Using the morphological species concept, Hesperomyces virescens (Ascomycota, Laboulbeniales) is a single species with global distribution and wide host range. Since its description 120 years ago, this fungal parasite has been reported from 30 species of ladybird hosts on all continents except Antarctica. These host usage patterns suggest that H. virescens could be made up of many different species, each adapted to individual host species. Using sequence data from three gene regions, we found evidence for distinct clades within Hesperomyces virescens, each clade corresponding to isolates from a single host species. We propose that these lineages represent separate species, driven by adaptation to different ladybird hosts. Our combined morphometric, molecular phylogenetic and ecological data provide support for a unified species concept and an integrative taxonomy approach.
- MeSH
- Principal Component Analysis MeSH
- Ascomycota classification genetics isolation & purification physiology MeSH
- Coleoptera parasitology MeSH
- DNA, Fungal chemistry genetics metabolism MeSH
- Phylogeny MeSH
- DNA, Ribosomal Spacer chemistry genetics metabolism MeSH
- Sequence Analysis, DNA MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Fungal MeSH
- DNA, Ribosomal Spacer MeSH
Nuclear ribosomal RNA (rRNA) genes represent the oldest repetitive fraction universal to all eukaryotic genomes. Their deeply anchored universality and omnipresence during eukaryotic evolution are reflected in multiple roles and functions reaching far beyond ribosomal synthesis. Even the mere copy number of non-transcribed rRNA genes is involved in mechanisms governing, e.g., maintenance of genome integrity and control of cellular aging. Their copy number can vary in response to environmental cues, in cellular stress sensing, and in the development of cancer and other diseases. While they reach hundreds of copies in humans, there are records of up to 20,000 copies in fish and frogs and even 400,000 copies in ciliates, thus forming a literal subgenome or an rDNAome within the genome. From the compositional and evolutionary dynamics viewpoint, the precursor 45S rDNA represents universally GC-enriched, highly recombining and homogenized regions. Hence, it is not accidental that both the rDNA sequence and the corresponding rRNA secondary structure belong to established phylogenetic markers broadly used to infer phylogeny on multiple taxonomic levels, including species delimitation. However, these multiple roles of rDNAs have been treated and discussed as being separate and independent from each other. Here, I aim to address nuclear rDNAs in an integrative approach to better assess the complexity of rDNA importance in the evolutionary context.
- Keywords
- GC-content, nuclear rDNA, nucleolus, rRNA, secondary structure
- MeSH
- Cell Nucleolus genetics MeSH
- Eukaryota genetics MeSH
- Phylogeny MeSH
- Genome MeSH
- Humans MeSH
- Evolution, Molecular MeSH
- DNA, Ribosomal genetics MeSH
- RNA, Ribosomal genetics MeSH
- DNA Copy Number Variations MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- DNA, Ribosomal MeSH
- RNA, Ribosomal MeSH
The freshwaters of Iraq harbour a high diversity of endemic and phylogenetically unique species. One of the most diversified fish groups in this region is cyprinoids, and although their distribution is relatively well known, their monogenean parasites have only rarely been investigated. Herein, we applied an integrative approach, combining morphology with molecular data, to assess the diversity and phylogeny of cyprinoid-associated monogenean parasites. A total of 33 monogenean species were collected and identified from 13 endemic cyprinoid species. The highest species diversity was recorded for Dactylogyrus (Dactylogyridae, 16 species) and Gyrodactylus (Gyrodactylidae, 12 species). Four species of Dactylogyrus and 12 species of Gyrodactylus were identified as new to science and described. Two other genera, Dogielius (Dactylogyridae) and Paradiplozoon (Diplozoidae), were represented only by 4 and 1 species, respectively. Phylogenetic analyses of the Dactylogyrus and Gyrodactylus species revealed that the local congeners do not form a monophyletic group and are phylogenetically closely related to species from other regions (i.e. Europe, North Africa and Eastern Asia). These findings support the assumption that the Middle East served as an important historical crossroads for the interchange of fauna between these 3 geographic regions.
- Keywords
- Cyprinoidei, Dactylogyrus, Dogielius, Gyrodactylus, Middle East, Paradiplozoon, phylogeny, species diversity
- MeSH
- Phylogeny MeSH
- Fishes MeSH
- Trematoda * genetics MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Iraq epidemiology MeSH
- Africa, Northern MeSH
- Middle East MeSH
While molecular subgrouping has revolutionized medulloblastoma classification, the extent of heterogeneity within subgroups is unknown. Similarity network fusion (SNF) applied to genome-wide DNA methylation and gene expression data across 763 primary samples identifies very homogeneous clusters of patients, supporting the presence of medulloblastoma subtypes. After integration of somatic copy-number alterations, and clinical features specific to each cluster, we identify 12 different subtypes of medulloblastoma. Integrative analysis using SNF further delineates group 3 from group 4 medulloblastoma, which is not as readily apparent through analyses of individual data types. Two clear subtypes of infants with Sonic Hedgehog medulloblastoma with disparate outcomes and biology are identified. Medulloblastoma subtypes identified through integrative clustering have important implications for stratification of future clinical trials.
- Keywords
- copy number, gene expression, integrative clustering, medulloblastoma, methylation, subgroups
- MeSH
- Genomics MeSH
- Precision Medicine * MeSH
- Cohort Studies MeSH
- Humans MeSH
- Medulloblastoma classification genetics therapy MeSH
- DNA Methylation MeSH
- Cluster Analysis MeSH
- Gene Expression Profiling MeSH
- DNA Copy Number Variations MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
BACKGROUND: Ethiopia is affected by human leishmaniasis caused by several Leishmania species and transmitted by a variety of sand fly vectors of the genus Phlebotomus. The sand fly fauna in Ethiopia is highly diverse and some species are closely related and similar in morphology, resulting in difficulties with species identification that requires deployment of molecular techniques. DNA barcoding entails high costs, requires time and lacks reference sequences for many Ethiopian species. Yet, proper species identification is pivotal for epidemiological surveillance as species differ in their actual involvement in transmission cycles. Recently, protein profiling using MALDI-TOF mass spectrometry has been introduced as a promising technique for sand fly identification. METHODS: In our study, we used an integrative taxonomic approach to identify most of the important sand fly vectors of leishmaniasis in Ethiopia, applying three complementary methods: morphological assessment, sequencing analysis of two genetic markers, and MALDI-TOF MS protein profiling. RESULTS: Although morphological assessment resulted in some inconclusive identifications, both DNA- and protein-based techniques performed well, providing a similar hierarchical clustering pattern for the analyzed species. Both methods generated species-specific sequences or protein patterns for all species except for Phlebotomus pedifer and P. longipes, the two presumed vectors of Leishmania aethiopica, suggesting that they may represent a single species, P. longipes Parrot & Martin. All three approaches also revealed that the collected specimens of Adlerius sp. differ from P. (Adlerius) arabicus, the only species of Adlerius currently reported in Ethiopia, and molecular comparisons indicate that it may represent a yet undescribed new species. CONCLUSIONS: Our study uses three complementary taxonomic methods for species identification of taxonomically challenging and yet medically important Ethiopian sand flies.
The generated MALDI-TOF MS protein profiles resulted in unambiguous identifications, hence showing suitability of this technique for sand fly species identification. Furthermore, our results contribute to the still inadequate knowledge of the sand fly fauna of Ethiopia, a country severely burdened with human leishmaniasis.
- Keywords
- DNA barcoding, Ethiopia, MALDI-TOF mass spectrometry, Morphology, Phlebotomus, Protein profiling, Sand flies
- MeSH
- Species Specificity MeSH
- Phylogeny MeSH
- Insect Vectors classification MeSH
- Leishmaniasis MeSH
- Psychodidae classification MeSH
- Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization methods MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Ethiopia epidemiology MeSH