large-scale data set
Several molecular clonality assays have been developed to assess canine B cell proliferations. These assays were based on different sequence data, utilized different assay designs and employed different testing strategies. This has resulted in a complex body of literature and complicates evidence-based selection of primer sets. In addition, further refinement of primer sets is difficult because it is unknown how well current primer sets cover the expressed sequence repertoire. The objectives of this study were 1) to provide an overview of published IGH clonality assays that highlights key differences in assay design and testing strategy and 2) to propose a novel method for optimizing primer sets that leverages large-scale sequencing data. A review of previously published assays highlighted confounding factors that hamper a direct comparison of performance metrics between studies. These findings illustrate the need for a multi-institutional effort to harmonize veterinary clonality testing. A novel in silico analysis of primer sequences using a large dataset of expressed sequences identified shortfalls of existing primer sets and was used to guide primer optimization. Three optimized primer sets were tested and yielded qualitative sensitivity values between 80% and 90%. The quantitative sensitivity ranged from 1% to over 50% and was dependent on the size of the neoplastic clone and the sample DNA used. These findings illustrate that high-throughput sequencing data can usefully guide primer design and optimization. This strategy could be applied to other antigen receptor loci or species to further improve veterinary clonality assays.
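The abstract does not describe how the in silico primer analysis was carried out. As a rough illustration of the general idea only, the sketch below estimates what fraction of a set of expressed sequences is matched by at least one primer within a mismatch tolerance. The function names, the mismatch threshold and the toy sequences are all assumptions made for this example; a realistic analysis would also need to handle primer orientation, reverse complements and 3'-end stability.

```python
def mismatches(primer: str, window: str) -> int:
    """Count positions where the primer and an equal-length window differ."""
    return sum(1 for p, w in zip(primer, window) if p != w)


def primer_binds(primer: str, sequence: str, max_mismatches: int = 1) -> bool:
    """True if any window of the sequence matches the primer within the tolerance."""
    n = len(primer)
    return any(
        mismatches(primer, sequence[i:i + n]) <= max_mismatches
        for i in range(len(sequence) - n + 1)
    )


def coverage(primers: list[str], sequences: list[str], max_mismatches: int = 1) -> float:
    """Fraction of sequences matched by at least one primer in the set."""
    if not sequences:
        return 0.0
    hits = sum(
        any(primer_binds(p, s, max_mismatches) for p in primers)
        for s in sequences
    )
    return hits / len(sequences)


# Toy data, made up for illustration (not real IGH sequences).
toy_sequences = ["ACGTACGTGG", "ACGAACGTGG", "TTTTTTTTTT", "GGCCACGTAC"]
toy_primers = ["ACGTACG"]

print(coverage(toy_primers, toy_sequences, max_mismatches=1))  # 0.5
print(coverage(toy_primers, toy_sequences, max_mismatches=0))  # 0.25
```

Comparing the coverage of an existing primer set with that of a candidate set on the same sequences is one simple way to see where a set falls short.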
- MeSH
- B-Lymphocytes cytology MeSH
- Clone Cells * MeSH
- DNA Primers * MeSH
- Dogs genetics immunology MeSH
- Immunoglobulin Heavy Chains genetics MeSH
- Animals MeSH
- Check Tag
- Dogs genetics immunology MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
MOTIVATION: Satellite DNA makes up a significant portion of many eukaryotic genomes, yet it is relatively poorly characterized even in extensively sequenced species. This is, in part, due to methodological limitations of traditional methods of satellite repeat analysis, which are based on multiple alignments of monomer sequences. Therefore, we employed an alternative, alignment-free approach utilizing k-mer frequency statistics, which is in principle more suitable for analyzing large sets of satellite repeat data, including sequence reads from next-generation sequencing technologies. RESULTS: k-mer frequency spectra were determined for two sets of rice centromeric satellite CentO sequences, including 454 reads from ChIP-sequencing of CENH3-bound DNA (7.6 Mb) and the whole genome Sanger sequencing reads (5.8 Mb). k-mer frequencies were used to identify the most conserved sequence regions and to reconstruct consensus sequences of complete monomers. Reconstructed consensus sequences as well as the assessment of overall divergence of k-mer spectra revealed high similarity of the two datasets, suggesting that CentO sequences associated with functional centromeres (CENH3-bound) do not significantly differ from the total population of CentO, which includes both centromeric and pericentromeric repeat arrays. On the other hand, considerable differences were revealed when these methods were used for comparison of CentO populations between individual chromosomes of the rice genome assembly, demonstrating preferential sequence homogenization of the clusters within the same chromosome. k-mer frequencies were also successfully used to identify and characterize smRNAs derived from CentO repeats.
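To make the alignment-free idea concrete, here is a minimal sketch of computing k-mer frequency spectra for two read sets and comparing them. The abstract does not say which divergence measure the authors used; the total variation distance below is a stand-in chosen for simplicity, and the function names and toy reads are hypothetical.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Relative frequency of every k-mer observed in a collection of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def spectrum_distance(a, b):
    """Total variation distance between two spectra (0 = identical, 1 = disjoint)."""
    keys = set(a) | set(b)
    return 0.5 * sum(abs(a.get(x, 0.0) - b.get(x, 0.0)) for x in keys)


# Toy reads, made up for illustration.
set_a = ["ACGTACGT"]
set_b = ["ACGTACGT", "ACGTACGT"]
set_c = ["TTTTTTTT"]

spec_a, spec_b, spec_c = (kmer_spectrum(s, 3) for s in (set_a, set_b, set_c))
print(spectrum_distance(spec_a, spec_b))  # same composition, so no divergence
print(spectrum_distance(spec_a, spec_c))  # no shared 3-mers, so maximal divergence
```

Because frequencies are normalized, read sets of different sizes (such as the 7.6 Mb and 5.8 Mb sets mentioned above) can be compared directly. The most frequent k-mers in a spectrum also point to the most conserved regions of the repeat.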
- MeSH
- Centromere genetics MeSH
- Chromosomes, Plant genetics MeSH
- DNA, Plant genetics MeSH
- Conserved Sequence genetics MeSH
- Molecular Sequence Data MeSH
- Oryza genetics MeSH
- DNA, Satellite genetics MeSH
- Base Sequence MeSH
- Sequence Analysis, DNA methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
BACKGROUND: By 2030, over 50% of individuals living with bipolar disorder (BD) are expected to be aged ≥50 years. However, older age bipolar disorder (OABD) remains understudied. There are limited large-scale prospectively collected data organized in key dimensions capable of addressing several fundamental questions about BD affecting this subgroup of patients. METHODS: We developed initial recommendations for the essential dimensions for OABD data collection, based on (1) a systematic review of measures used in OABD studies, (2) a Delphi consensus of international OABD experts, (3) experience with harmonizing OABD data in the Global Aging & Geriatric Experiments in Bipolar Disorder Database (GAGE-BD, n ≥ 4500 participants), and (4) critical feedback from 34 global experts in geriatric mental health. RESULTS: We identified 15 key dimensions and variables within each that are relevant for the investigation of OABD: (1) demographics, (2) core symptoms of depression and (3) mania, (4) cognition screening and subjective cognitive function, (5) elements for BD diagnosis, (6) descriptors of course of illness, (7) treatment, (8) suicidality, (9) current medication, (10) psychiatric comorbidity, (11) psychotic symptoms, (12) general medical comorbidities, (13) functioning, (14) family history, and (15) other. We also recommend particular instruments for capturing some of the dimensions and variables. CONCLUSION: The essential data dimensions we present should be of use to guide future international data collection in OABD and clinical practice. In the longer term, we aim to establish a prospective consortium using this core set of dimensions and associated variables to answer research questions relevant to OABD.
- MeSH
- Bipolar Disorder * diagnosis epidemiology therapy MeSH
- Cognition MeSH
- Humans MeSH
- Prospective Studies MeSH
- Data Collection MeSH
- Aged MeSH
- Aging psychology MeSH
- Check Tag
- Humans MeSH
- Aged MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Practice Guideline MeSH
- Systematic Review MeSH
Abandoning daylight saving time in Europe raises the topical issue of proper setting of yearlong social time, which needs mapping of various socio-demographic factors, including chronotype, in specific geographic regions. This study represents the first detailed large-scale chronotyping in the Czech Republic based on data collected in the complex panel socio-demographic survey in households (total 8760 respondents) and the socio-physiological survey, in which chronotyped participants also provided blood samples (n = 1107). Chronotype assessment based on sleep phase (MCTQ questions and/or time-use diary) correlated with a self-assessed interval of best alertness. The mean chronotype of the Czech population, defined as mid sleep phase (MSFsc), was 3.13 ± 0.02 h. Chronotype exhibited significant east-to-westward, north-to-southward, and settlement size-dependent gradients and was associated with age, sex, partnership, and time spent outdoors, as previously demonstrated. Moreover, for subjects younger than 40 years, childcare was highly associated with earlier chronotype, while dog care was associated with later chronotype. Body mass index correlated with later chronotype in women, whose extreme chronotype was also associated with lower plasma levels of protective HDL cholesterol. Based on the chronotype prevalence, the results favour yearlong Standard Time as the best choice for this geographic region.
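The abstract reports MSFsc without spelling out how it is derived. The sketch below follows the commonly used MCTQ scoring convention (mid-sleep on free days, corrected for sleep debt accumulated on workdays). Whether the study applied exactly this variant is an assumption, and the function and parameter names are illustrative.

```python
def average_weekly_sleep_duration(sd_work: float, sd_free: float, work_days: int = 5) -> float:
    """Sleep duration averaged over a 7-day week (hours)."""
    free_days = 7 - work_days
    return (sd_work * work_days + sd_free * free_days) / 7


def msf_sc(sleep_onset_free: float, sd_work: float, sd_free: float, work_days: int = 5) -> float:
    """Sleep-corrected mid-sleep on free days, in hours after local midnight.

    sleep_onset_free: sleep onset on free days (hours after midnight)
    sd_work, sd_free: sleep duration on workdays and on free days (hours)
    """
    msf = sleep_onset_free + sd_free / 2  # uncorrected mid-sleep on free days
    if sd_free <= sd_work:
        return msf  # no oversleep on free days, so nothing to correct
    sd_week = average_weekly_sleep_duration(sd_work, sd_free, work_days)
    return msf - (sd_free - sd_week) / 2


# Hypothetical respondent: falls asleep at 00:30 on free days,
# sleeps 7 h on workdays and 9 h on free days.
print(round(msf_sc(0.5, 7.0, 9.0), 2))  # 4.29
```

In this made-up case the uncorrected mid-sleep is 5.0 h, and the correction shifts it earlier because part of the long free-day sleep is treated as compensation for workday sleep debt.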
- MeSH
- Time Factors MeSH
- Chronobiology Discipline statistics & numerical data MeSH
- Circadian Clocks physiology MeSH
- Demography statistics & numerical data MeSH
- Child MeSH
- Adult MeSH
- Photoperiod * MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Surveys and Questionnaires statistics & numerical data MeSH
- Sex Factors MeSH
- Sleep physiology MeSH
- Age Factors MeSH
- Check Tag
- Child MeSH
- Adult MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Czech Republic MeSH
BACKGROUND AND AIMS: Wastewater-based epidemiology is an additional indicator of drug use that is gaining reliability to complement the current established panel of indicators. The aims of this study were to: (i) assess spatial and temporal trends of population-normalized mass loads of benzoylecgonine, amphetamine, methamphetamine and 3,4-methylenedioxymethamphetamine (MDMA) in raw wastewater over 7 years (2011-17); (ii) address overall drug use by estimating the average number of combined doses consumed per day in each city; and (iii) compare these with existing prevalence and seizure data. DESIGN: Analysis of daily raw wastewater composite samples collected over 1 week per year from 2011 to 2017. SETTING AND PARTICIPANTS: Catchment areas of 143 wastewater treatment plants in 120 cities in 37 countries. MEASUREMENTS: Parent substances (amphetamine, methamphetamine and MDMA) and the metabolites of cocaine (benzoylecgonine) and of Δ9-tetrahydrocannabinol (11-nor-9-carboxy-Δ9-tetrahydrocannabinol) were measured in wastewater using liquid chromatography-tandem mass spectrometry. Daily mass loads (mg/day) were normalized to catchment population (mg/1000 people/day) and converted to the number of combined doses consumed per day. Spatial differences were assessed world-wide, and temporal trends were discerned at European level by comparing 2011-13 drug loads versus 2014-17 loads. FINDINGS: Benzoylecgonine was the stimulant metabolite detected at higher loads in southern and western Europe, and amphetamine, MDMA and methamphetamine in East and North-Central Europe. In other continents, methamphetamine showed the highest levels in the United States and Australia and benzoylecgonine in South America. During the reporting period, benzoylecgonine loads increased in general across Europe, amphetamine and methamphetamine levels fluctuated and MDMA underwent an intermittent upsurge.
CONCLUSIONS: The analysis of wastewater to quantify drug loads provides near real-time drug use estimates that globally correspond to prevalence and seizure data.
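The normalization step named in MEASUREMENTS is simple arithmetic. The sketch below assumes, as is usual in wastewater-based epidemiology, that the daily mass load is the measured concentration multiplied by the daily flow; that step and all numbers are illustrative assumptions, not values from the study. The conversion to doses is left out because it needs substance-specific correction factors and dose sizes that the abstract does not give.

```python
def daily_mass_load_mg(concentration_ng_per_l: float, flow_l_per_day: float) -> float:
    """Mass load in mg/day from a concentration (ng/L) and a daily flow (L/day)."""
    return concentration_ng_per_l * flow_l_per_day / 1e6  # 1 mg = 1e6 ng


def population_normalized_load(load_mg_per_day: float, population: int) -> float:
    """Mass load expressed as mg per 1000 people per day."""
    if population <= 0:
        raise ValueError("population must be positive")
    return load_mg_per_day / population * 1000.0


# Hypothetical catchment: 500 ng/L of a target residue, 50 million L/day, 200,000 people.
load = daily_mass_load_mg(500.0, 50_000_000.0)
print(load)                                       # 25000.0 mg/day
print(population_normalized_load(load, 200_000))  # 125.0 mg/1000 people/day
```

Expressing loads per 1000 people is what makes catchments of very different size comparable across cities and years.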
- MeSH
- Amphetamine analysis MeSH
- Spatio-Temporal Analysis * MeSH
- Chromatography, Liquid MeSH
- Internationality MeSH
- Cocaine analogs & derivatives analysis MeSH
- Humans MeSH
- Methamphetamine analysis MeSH
- Environmental Monitoring methods MeSH
- N-Methyl-3,4-methylenedioxyamphetamine analysis MeSH
- Substance Abuse Detection methods MeSH
- Wastewater chemistry MeSH
- Tandem Mass Spectrometry MeSH
- Illicit Drugs * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Seroprevalence studies are an important tool for determining what proportion of the population has already been infected (e.g., with SARS-CoV-2). Without this information, it is very difficult for state institutions to adopt appropriate anti-epidemic measures. In the spring of 2020, a medium-sized seroprevalence study was conducted in South Bohemia, in the Strakonice and Písek districts. A total of 2011 people were examined, comprising volunteers from the general public and from selected professional groups. IgA and IgG antibodies against the coronavirus were determined using the Euroimmun ELISA. We found that in May 2020, 2.9% of the inhabitants of the Strakonice district and 1.9% of the inhabitants of the Písek district had antibodies against the coronavirus. For each person with a positive PCR test, there were therefore at least 50 others with antibodies against the coronavirus. Three basic problems must be addressed when conducting seroprevalence studies. The study must be planned well, with thought given to which variants of results can be expected. High-quality tests with known parameters must be selected, so that the proportion of false-positive and false-negative results can be estimated competently. Last but not least, the data must be evaluated sensibly, and correct conclusions and generalizations must be drawn from them. Using the specific case of the seroprevalence study from the South Bohemian Region, we also show how to deal properly with complications that may arise during such extensive testing.
Seroprevalence studies are an important tool for finding out what fraction of the population has already been exposed to a pathogen such as the new coronavirus SARS-CoV-2. Without these data, it is almost impossible for state authorities to manage the epidemic and adopt rational measures. This article presents the results of a medium-sized seroprevalence study carried out in the spring of 2020 in South Bohemia. In the Strakonice and Písek regions, the ELISA method was used to test for IgA and IgG antibodies in a total of 2011 subjects: volunteers from the general public and from selected professions working in areas with a higher exposure to the infection. The study showed that as early as May 2020, 2.9% of the inhabitants of the Strakonice region and 1.9% of the inhabitants of the Písek region had antibodies against the coronavirus. These numbers imply that for each PCR-positive person, there were at least fifty others who had probably already had the infection. The article points out three types of problems that might occur in such a study. First, the study must be planned correctly, and possible outcomes must be assessed in advance. Second, an appropriate test with known parameters must be selected, which makes it possible to estimate correctly the share of false-positive and false-negative results. Third, the data must be evaluated in a reasonable way and correct inference must be performed. We offer a set of recommendations on how to manage these issues and how to solve the problems that inevitably arise in such large-scale testing.
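The second point above, accounting for false-positive and false-negative results, is often handled with the Rogan-Gladen estimator, which corrects an observed positive rate using the test's sensitivity and specificity. The abstract does not say which correction the authors applied, so the sketch below is a generic illustration; the sensitivity and specificity values are placeholders, not the parameters of the ELISA kit used in the study.

```python
def rogan_gladen(apparent_prevalence: float, sensitivity: float, specificity: float) -> float:
    """Estimate true prevalence from the observed positive rate of an imperfect test."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("test must perform better than chance (sensitivity + specificity > 1)")
    estimate = (apparent_prevalence + specificity - 1.0) / denominator
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Placeholder test parameters (assumed for illustration only).
sens, spec = 0.90, 0.99

print(rogan_gladen(0.029, sens, spec))  # observed 2.9% positives
print(rogan_gladen(0.005, sens, spec))  # fewer positives than expected false positives
```

The second call shows why specificity matters so much at low prevalence: when the observed rate is below the expected false-positive rate (here 1%), the corrected estimate collapses to zero.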
- MeSH
- COVID-19 * MeSH
- Enzyme-Linked Immunosorbent Assay MeSH
- Immunoglobulin A blood MeSH
- Immunoglobulin G blood MeSH
- Humans MeSH
- Polymerase Chain Reaction MeSH
- Antibodies, Viral blood MeSH
- SARS-CoV-2 MeSH
- Sensitivity and Specificity MeSH
- Seroepidemiologic Studies * MeSH
- COVID-19 Serological Testing MeSH
- Check Tag
- Humans MeSH
- Publication type
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Czech Republic MeSH
Considering the extensive data sets and statistical techniques, animal breeding embodies a branch of machine learning that has a constantly increasing impact on breeding. In our study, information regarding the potential of machine learning and data mining within a large set of horses and breeds is presented. The individual assignment methods and factors influencing the success rate of the procedure are compared at the Czech population scale. The fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation amounted to 8.9% among the breeds. The highest genetic divergence (FST = 0.378) was established between the Friesian and Equus przewalskii; the highest degree of gene migration was obtained between the Czech and Bavarian Warmblood (Nm = 14,302); and the overall global heterozygote deficit across the populations was 10.4%. The eight standard methods (Bayesian, frequency, and distance) using GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared by utilizing 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. The breed genomic prediction accuracy reached the highest value in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the breed number and genetic divergence. These statistical tools could be used to assess breed traceability systems, and they exhibit the potential to assist managers in decision-making as regards breeding and registration.
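The abstract reports both FST and Nm but not how one was derived from the other. A common textbook link is Wright's island-model approximation, FST ≈ 1 / (4Nm + 1). The sketch below applies that relation purely as an illustration; it is an assumption that the study used it, and the function names are hypothetical.

```python
def nm_from_fst(fst: float) -> float:
    """Effective migrants per generation under Wright's island model."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


def fst_from_nm(nm: float) -> float:
    """Inverse relation: expected FST for a given number of migrants."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


# FST = 0.2 corresponds to exactly one migrant per generation.
print(nm_from_fst(0.2))
# The highest pairwise divergence quoted above (FST = 0.378) implies well under one migrant.
print(round(nm_from_fst(0.378), 3))
```

Under this approximation, strongly diverged pairs of breeds map to very low gene flow, while closely related breeds map to many migrants per generation.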
- MeSH
- Alleles MeSH
- Algorithms MeSH
- Breeding * MeSH
- Species Specificity MeSH
- Gene Frequency MeSH
- Genetic Variation MeSH
- Genomics * MeSH
- Genotype MeSH
- Heterozygote MeSH
- Horses classification genetics MeSH
- Microsatellite Repeats genetics MeSH
- Software MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
Multiway array decomposition methods have been shown to be promising statistical tools for identifying neural activity in the EEG spectrum. They blindly decompose the EEG spectrum into spatial-temporal-spectral patterns by taking into account inherent relationships among signals acquired at different frequencies and sensors. Our study evaluates the stability of spatial-temporal-spectral patterns derived by one particular method, parallel factor analysis (PARAFAC). We focused on patterns' stability over time and in population and divided the complete data set containing data from 50 healthy subjects into several subsets. Our results suggest that the patterns are highly stable in time, as well as among different subgroups of subjects. Further, we show with simultaneously acquired fMRI data that power fluctuations of some patterns have stable correspondence to hemodynamic fluctuations in large-scale brain networks. We did not find such correspondence for power fluctuations in standard frequency bands, the common way of dealing with EEG data. Altogether, our results suggest that PARAFAC is a suitable method for research in the field of large-scale brain networks and their manifestation in EEG signal.
- MeSH
- Acoustic Stimulation MeSH
- Adult MeSH
- Electroencephalography * MeSH
- Factor Analysis, Statistical MeSH
- Oxygen blood MeSH
- Humans MeSH
- Magnetic Resonance Imaging MeSH
- Brain Mapping MeSH
- Young Adult MeSH
- Brain diagnostic imaging physiology MeSH
- Brain Waves physiology MeSH
- Neural Pathways diagnostic imaging physiology MeSH
- Image Processing, Computer-Assisted * MeSH
- Photic Stimulation MeSH
- Animals MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The crucial steps in biological invasions, related to the shaping of genetic architecture and the current evolution of adaptations to a novel environment, usually occur in small populations during the phases of introduction and establishment. However, these processes are difficult to track in nature due to invasion lag, large geographic and temporal scales compared with human observation capabilities, the frequent depletion of genetic variance, admixture and other phenomena. In this study, we compared genetic and historical evidence related to the invasion of the West European hedgehog to New Zealand to infer details about the introduction and establishment. Historical information indicates that the species was initially established on the South Island. A molecular assay of populations from Great Britain and New Zealand using mitochondrial sequences and nuclear microsatellite loci was performed based on a set of analyses including approximate Bayesian computation, a powerful approach for disentangling complex population demographies. According to these analyses, the population of the North Island was most similar to that of the native area and showed greatest reduction in genetic variation caused by founder demography and/or drift. This evidence indicated the location of the establishment phase. The hypothesis was corroborated by data on climate and urbanization. We discuss the contrasting results obtained by the molecular and historical approaches in the light of their different explanatory power and the possible biases influencing the description of particular aspects of invasions, and we advocate the integration of the two types of approaches in invasion biology.
- MeSH
- Bayes Theorem MeSH
- Population Density MeSH
- Hedgehogs genetics physiology MeSH
- Humans MeSH
- Microsatellite Repeats genetics MeSH
- DNA, Mitochondrial genetics MeSH
- Molecular Sequence Data MeSH
- Genetics, Population * MeSH
- Introduced Species MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- New Zealand MeSH
Tunnels in enzymes with buried active sites are key structural features allowing the entry of substrates and the release of products, thus contributing to the catalytic efficiency. Targeting the bottlenecks of protein tunnels is also a powerful protein engineering strategy. However, the identification of functional tunnels in multiple protein structures is a non-trivial task that can only be addressed computationally. We present a pipeline integrating automated structural analysis with an in-house machine-learning predictor for the annotation of protein pockets, followed by the calculation of the energetics of ligand transport via biochemically relevant tunnels. A thorough validation using eight distinct molecular systems revealed that CaverDock analysis of ligand un/binding is on par with time-consuming molecular dynamics simulations, but much faster. The optimized and validated pipeline was applied to annotate more than 17,000 cognate enzyme-ligand complexes. Analysis of ligand un/binding energetics indicates that the top priority tunnel has the most favourable energies in 75% of cases. Moreover, energy profiles of cognate ligands revealed that a simple geometry analysis can correctly identify tunnel bottlenecks only in 50% of cases. Our study provides essential information for the interpretation of results from tunnel calculation and energy profiling in mechanistic enzymology and protein engineering. We formulated several simple rules allowing identification of biochemically relevant tunnels based on the binding pockets, tunnel geometry, and ligand transport energy profiles. SCIENTIFIC CONTRIBUTIONS: The pipeline introduced in this work allows for the detailed analysis of a large set of protein-ligand complexes, focusing on transport pathways. We are introducing a novel predictor for determining the relevance of binding pockets for tunnel calculation.
For the first time in the field, we present a high-throughput energetic analysis of ligand binding and unbinding, showing that approximate methods for these simulations can identify additional mutagenesis hotspots in enzymes compared to purely geometrical methods. The predictor is included in the supplementary material and can also be accessed at https://github.com/Faranehhad/Large-Scale-Pocket-Tunnel-Annotation.git . The tunnel data calculated in this study has been made publicly available as part of the ChannelsDB 2.0 database, accessible at https://channelsdb2.biodata.ceitec.cz/ .
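The contrast between geometric and energetic bottlenecks can be pictured with a toy tunnel profile. In the sketch below, the geometric bottleneck is taken as the narrowest point along the tunnel and the energetic bottleneck as the position of the highest energy along the ligand's path. Both definitions, the function names and the numbers are simplifying assumptions for illustration and are not taken from the published pipeline or from CaverDock output.

```python
def geometric_bottleneck(radii: list[float]) -> int:
    """Index of the narrowest point along the tunnel."""
    return min(range(len(radii)), key=radii.__getitem__)


def energetic_bottleneck(energies: list[float]) -> int:
    """Index of the highest energy along the transport path."""
    return max(range(len(energies)), key=energies.__getitem__)


# Toy profile sampled at five positions from the active site to the surface.
radii = [2.0, 1.4, 0.9, 1.6, 2.2]        # tunnel radius (angstrom)
energies = [-1.0, 0.5, 1.2, 3.4, 0.3]    # ligand energy (kcal/mol)

geo = geometric_bottleneck(radii)
ene = energetic_bottleneck(energies)
print(geo, ene, geo == ene)  # 2 3 False
```

In this made-up profile the narrowest point and the highest barrier fall at different positions, which is the kind of disagreement behind the 50% figure reported in the abstract.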
- Publication type
- Journal Article MeSH