PURPOSE: Stratifying patients with cancer according to risk of relapse can personalize their care. In this work, we address the following research question: how can machine learning be used to estimate the probability of relapse in patients with early-stage non-small-cell lung cancer (NSCLC)? MATERIALS AND METHODS: To predict relapse in 1,387 patients with early-stage (I-II) NSCLC from the Spanish Lung Cancer Group data (average age 65.7 years, female 24.8%, male 75.2%), we train tabular and graph machine learning models. We generate automatic explanations for the predictions of such models. For models trained on tabular data, we adopt SHapley Additive exPlanations local explanations to gauge how each patient feature contributes to the predicted outcome. We explain graph machine learning predictions with an example-based method that highlights influential past patients. RESULTS: Among the machine learning models trained on tabular data, the random forest model reaches 76% accuracy at predicting relapse, evaluated with 10-fold cross-validation (the model was trained 10 times with different independent sets of patients in the test, train, and validation sets, and the reported metrics are averaged over these 10 test sets). Graph machine learning reaches 68% accuracy over a held-out test set of 200 patients, calibrated on a held-out set of 100 patients. CONCLUSION: Our results show that machine learning models trained on tabular and graph data can enable objective, personalized, and reproducible prediction of relapse and, therefore, disease outcome in patients with early-stage NSCLC. With further prospective and multisite validation, and additional radiological and molecular data, this prognostic model could potentially serve as a predictive decision support tool for deciding the use of adjuvant treatments in early-stage lung cancer.
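The 10-fold cross-validation described in the abstract above can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the majority-class predictor stands in for the random forest, the labels are invented toy data, and all function names are hypothetical. It only shows how each fold serves once as the test set and how the per-fold accuracies are averaged.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Toy stand-in for a real model: always predict the most common label."""
    return max(set(labels), key=labels.count)


def cross_validated_accuracy(labels, k=10, seed=0):
    """Average test accuracy over k folds, each fold used once as the test set."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for test_fold in folds:
        test_set = set(test_fold)
        # "Train" only on the samples outside the current test fold.
        train_labels = [y for i, y in enumerate(labels) if i not in test_set]
        prediction = majority_class(train_labels)
        correct = sum(1 for i in test_fold if labels[i] == prediction)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / k


# Hypothetical labels (1 = relapse, 0 = no relapse); not the study data.
toy_labels = [1] * 30 + [0] * 70
mean_accuracy = cross_validated_accuracy(toy_labels, k=10)
```

Replacing `majority_class` with a real classifier and adding a validation split inside each training portion would bring the sketch closer to the protocol the abstract describes.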
- MeSH
- Humans MeSH
- Neoplasm Recurrence, Local diagnosis MeSH
- Lung Neoplasms * diagnosis therapy MeSH
- Carcinoma, Non-Small-Cell Lung * diagnosis therapy MeSH
- Prognosis MeSH
- Aged MeSH
- Machine Learning MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The diagnosis of solid tumors of epithelial origin (carcinomas) represents a major part of the workload in clinical histopathology. Carcinomas consist of malignant epithelial cells arranged in more or less cohesive clusters of variable size and shape, together with stromal cells, extracellular matrix, and blood vessels. Distinguishing stroma from epithelium is a critical component of artificial intelligence (AI) methods developed to detect and analyze carcinomas. In this paper, we propose a novel automated workflow that enables large-scale guidance of AI methods to identify the epithelial component. The workflow is based on re-staining existing hematoxylin and eosin (H&E) formalin-fixed paraffin-embedded sections by immunohistochemistry for cytokeratins, cytoskeletal components specific to epithelial cells. Unlike existing methods, the workflow reuses clinically available H&E sections and needs no additional material, such as consecutive slides. We developed a simple and reliable method for automatic alignment to generate masks denoting cytokeratin-rich regions, using cell nuclei positions that are visible in both the original and the re-stained slide. Compared with state-of-the-art methods for alignment of consecutive slides, the registration method is simpler, yet it provides similar accuracy and is more robust. We also demonstrate how the automatically generated masks can be used to train modern AI image segmentation based on U-Net, resulting in reliable detection of epithelial regions in previously unseen H&E slides. Because it is trained on real-world material available in clinical laboratories, this approach has widespread applications toward achieving AI-assisted tumor assessment directly from scanned H&E sections. In addition, the re-staining method will facilitate additional automated quantitative studies of tumor cell and stromal cell phenotypes.
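The abstract above does not spell out how the nuclei-based alignment is computed. As one hedged illustration of the general idea, the sketch below fits a least-squares rigid transform (rotation plus translation) between two sets of 2D points. It assumes the nuclei in the original and the re-stained slide have already been matched pairwise, which the actual method may handle very differently; the function names and coordinates are hypothetical.

```python
import math


def estimate_rigid_transform(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Both arguments are equal-length lists of (x, y) points that are assumed
    to be matched pairwise (source[i] corresponds to target[i]).
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy  # source point relative to its centroid
        bx, by = qx - tx, qy - ty  # target point relative to its centroid
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    angle = math.atan2(cross, dot)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Translation that carries the rotated source centroid onto the target centroid.
    shift = (tx - (cos_a * sx - sin_a * sy), ty - (sin_a * sx + cos_a * sy))
    return angle, shift


def apply_transform(points, angle, shift):
    """Rotate points by angle (radians) about the origin, then translate by shift."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y + shift[0], sin_a * x + cos_a * y + shift[1])
            for x, y in points]


# Hypothetical nuclei centres in the H&E scan and in the re-stained scan.
he_nuclei = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0), (7.0, 3.0)]
restained_nuclei = apply_transform(he_nuclei, 0.1, (2.0, -1.0))
estimated_angle, estimated_shift = estimate_rigid_transform(he_nuclei, restained_nuclei)
```

Once such a transform is known, a cytokeratin mask derived from the re-stained slide can be mapped into the coordinate frame of the original H&E image.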
- MeSH
- Staining and Labeling MeSH
- Deep Learning * MeSH
- Eosin MeSH
- Epithelial Cells MeSH
- Hematoxylin MeSH
- Keratins * MeSH
- Humans MeSH
- Artificial Intelligence MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The scarcity of high-quality annotations in many application scenarios has recently led to an increasing interest in devising learning techniques that combine unlabeled data with labeled data in a network. In this work, we focus on the label propagation problem in multilayer networks. Our approach is inspired by the heat diffusion model, which has proven useful in machine learning problems such as classification and dimensionality reduction. We propose a novel boundary-based heat diffusion algorithm that guarantees a closed-form solution with an efficient implementation. We experimentally validated our method on synthetic networks and five real-world multilayer network datasets representing scientific coauthorship, the spread of drug adoption among physicians, two bibliographic networks, and a movie network. The results demonstrate the benefits of the proposed algorithm: our boundary-based heat diffusion outperforms the state-of-the-art methods.
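To make the heat diffusion idea concrete, here is a minimal sketch of generic label propagation on a single-layer graph, in which labeled nodes act as fixed-temperature boundaries and every unlabeled node repeatedly takes the average of its neighbours. This is a simple iterative scheme chosen for illustration; it is not the closed-form, multilayer algorithm the abstract proposes, and the graph and names are hypothetical.

```python
def diffuse_labels(adjacency, boundary, iterations=200):
    """Propagate labels by repeated neighbour averaging.

    adjacency: dict mapping each node to a list of its neighbours.
    boundary:  dict mapping labeled nodes to a fixed value in [0, 1].
    Returns a dict with the final "temperature" of every node.
    """
    # Unlabeled nodes start at a neutral value; labeled nodes keep their label.
    heat = {node: boundary.get(node, 0.5) for node in adjacency}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            if node in boundary or not neighbours:
                updated[node] = heat[node]  # boundary condition stays fixed
            else:
                updated[node] = sum(heat[m] for m in neighbours) / len(neighbours)
        heat = updated
    return heat


# Hypothetical path graph a - b - c - d - e with the two end nodes labeled.
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
toy_boundary = {"a": 1.0, "e": 0.0}
toy_heat = diffuse_labels(toy_graph, toy_boundary)
```

On this toy path the values settle into a linear gradient between the two labeled ends, so thresholding at 0.5 assigns each unlabeled node to the nearer label.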
- MeSH
- Algorithms MeSH
- Supervised Machine Learning * MeSH
- Machine Learning MeSH
- Hot Temperature * MeSH
- Publication Type
- Journal Article MeSH
Early-stage lung cancer is clinically crucial due to its insidious nature and rapid progression. Most of the prediction models designed to predict tumour recurrence in the early stage of lung cancer rely on the clinical or medical history of the patient. However, their performance could likely be improved if the input patient data contained genomic information. Unfortunately, such data are not always collected. This is the main motivation of our work, in which we have imputed and integrated a specific type of genomic data with clinical data to increase the accuracy of machine learning models for prediction of relapse in early-stage, non-small cell lung cancer patients. Aneuploidy scores from a publicly available TCGA lung adenocarcinoma cohort of 501 patients were imputed into similar records in the Spanish Lung Cancer Group (SLCG) data, more specifically a cohort of 1348 early-stage patients. First, the tumor recurrence in those patients was predicted without the imputed aneuploidy scores. Then, the SLCG data were enriched with the aneuploidy scores imputed from TCGA. This integrative approach improved the prediction of the relapse risk, achieving an area under the precision-recall curve (PR-AUC) score of 0.74 and an area under the ROC curve (ROC-AUC) score of 0.79. Using the prediction explanation model SHAP (SHapley Additive exPlanations), we further explained the predictions performed by the machine learning model. We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk, while also improving the predictive power by incorporating proxy genomic data not available for the actual specific patients.
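For readers unfamiliar with the ROC-AUC metric reported above, the following sketch computes it from its pairwise definition: the probability that a randomly chosen relapsing patient receives a higher risk score than a randomly chosen non-relapsing patient, with ties counted as one half. The labels and scores are invented toy values, not results from the study, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: list of 0/1 outcomes (1 = relapse).
    scores: list of predicted risk scores, one per label.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and model scores for six patients.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The quadratic pairwise loop is chosen for clarity; rank-based formulas give the same value more efficiently on large cohorts.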
- MeSH
- Genomics MeSH
- Humans MeSH
- Neoplasm Recurrence, Local MeSH
- Lung Neoplasms * MeSH
- Carcinoma, Non-Small-Cell Lung * genetics MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
At the time of the COVID-19 pandemic, providing access to data (properly optimised regarding personal data protection) plays a crucial role in providing the general public and media with up-to-date information. Open datasets also represent one of the means for evaluation of the pandemic on a global level. The primary aim of this paper is to describe the methodological and technical framework for publishing datasets describing characteristics related to the COVID-19 epidemic in the Czech Republic (epidemiology, hospital-based care, vaccination), including the use of these datasets in practice. Practical aspects and experience with data sharing are discussed. As a reaction to the epidemic situation, a new portal COVID-19: Current Situation in the Czech Republic (https://onemocneni-aktualne.mzcr.cz/covid-19) was developed and launched in March 2020 to provide a fully-fledged and trustworthy source of information for the public and media. The portal also contains a section for the publication of (i) public open datasets available for download in CSV and JSON formats and (ii) an authorised-access-only section where authorised persons can (through an online generated token) safely visualise or download regional datasets with aggregated data at the level of the individual municipalities and regions. The data are also provided to the local open data catalogue (covering only open data on healthcare, provided by the Ministry of Health) and to the National Catalogue of Open Data (covering all open data sets, provided by various authorities/publishers, and harvesting all data from local catalogues). The datasets have been published in various authentication regimes and widely used by the general public, scientists, public authorities and decision-makers. The total number of API calls from the launch in March 2020 to 15 December 2020 exceeded 13 million.
The datasets have been adopted as an official and guaranteed source for outputs of third parties, including public authorities, non-governmental organisations, scientists and online news portals. Datasets currently published as open data meet the 3-star open data requirements, which makes them machine-readable and facilitates their further usage without restrictions. This is essential for making the data more easily understandable and usable for data consumers. In line with the data-opening strategy of the Ministry of Health, additional datasets meeting the already implemented standards will also be released, on topics both related and unrelated to COVID-19.
- MeSH
- COVID-19 * epidemiology MeSH
- Humans MeSH
- Pandemics prevention & control MeSH
- SARS-CoV-2 MeSH
- Information Dissemination MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic Names
- Czech Republic MeSH
The degree of response to subthalamic nucleus deep brain stimulation (STN-DBS) varies between individuals and is difficult to predict. We hypothesized that DBS-related changes in cortical network organization are related to the clinical effect. Network analysis based on graph theory was used to evaluate the high-density electroencephalography (HDEEG) recorded during a visual three-stimuli paradigm in 32 Parkinson's disease (PD) patients treated with STN-DBS in stimulation "off" and "on" states. Preprocessed scalp data were reconstructed into the source space and correlated with the behavioral parameters. In the majority of patients (n = 26), STN-DBS did not lead to changes in global network organization in large-scale brain networks. In a subgroup of suboptimal responders (n = 6), identified according to reaction times (RT) and clinical parameters (lower Unified Parkinson's Disease Rating Scale [UPDRS] score improvement after DBS and worse performance in memory tests), decreased global connectivity in the 1-8 Hz frequency range and decreased regional node strength in frontal areas were detected. The important role of the supplementary motor area for the optimal DBS response was demonstrated by the increased node strength and eigenvector centrality in good responders. This response was missing in the suboptimal responders. Cortical topologic architecture is modified by the response to STN-DBS, leading to a dysfunction of the large-scale networks in suboptimal responders.
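The two graph measures named in the abstract above, node strength and eigenvector centrality, can be sketched for a small weighted, undirected network as follows. The connectivity matrix is an invented four-node example, not EEG data, and the function names are hypothetical. The power iteration adds an identity shift, an implementation choice made here so that the iteration also converges on bipartite graphs; it does not change the resulting eigenvector.

```python
import math


def node_strength(weights):
    """Sum of the edge weights attached to each node (row sums of the matrix)."""
    return [sum(row) for row in weights]


def eigenvector_centrality(weights, iterations=500, tolerance=1e-12):
    """Leading eigenvector of a symmetric, non-negative weight matrix.

    Uses power iteration on (W + I); the identity shift keeps the same
    eigenvectors while avoiding oscillation on bipartite graphs.
    """
    n = len(weights)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [vector[i] + sum(weights[i][j] * vector[j] for j in range(n))
                   for i in range(n)]
        norm = math.sqrt(sum(v * v for v in product))
        updated = [v / norm for v in product]
        change = max(abs(a - b) for a, b in zip(updated, vector))
        vector = updated
        if change < tolerance:
            break
    return vector


# Hypothetical symmetric connectivity matrix for four cortical regions.
toy_weights = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.5, 0.0],
    [0.6, 0.5, 0.0, 0.2],
    [0.1, 0.0, 0.2, 0.0],
]
toy_strength = node_strength(toy_weights)
toy_centrality = eigenvector_centrality(toy_weights)
```

Node strength only counts a region's direct connections, whereas eigenvector centrality also rewards being connected to other well-connected regions, which is why the two measures are often reported together.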
- MeSH
- Electroencephalography MeSH
- Deep Brain Stimulation * MeSH
- Outcome Assessment, Health Care MeSH
- Middle Aged MeSH
- Humans MeSH
- Cerebral Cortex physiopathology MeSH
- Nerve Net physiopathology MeSH
- Subthalamic Nucleus physiopathology MeSH
- Parkinson Disease physiopathology therapy MeSH
- Psychomotor Performance physiology MeSH
- Aged MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Enterotoxigenic Escherichia coli (ETEC) and Shiga toxin-producing E. coli (STEC) strains are the causative agents of severe foodborne diseases in both humans and animals. In this study, porcine pathogenic E. coli strains (n = 277) as well as porcine commensal strains (n = 188) were tested for their susceptibilities to 34 bacteriocin monoproducers to identify the most suitable bacteriocin types inhibiting porcine pathogens. Under in vitro conditions, the set of pathogenic E. coli strains was found to be significantly more susceptible to the majority of tested bacteriocins than commensal E. coli. Based on the production of bacteriocins with specific activity against pathogens, three potentially probiotic commensal E. coli strains of human origin were selected. These strains were found to be able to outcompete ETEC strains expressing F4 or F18 fimbriae in liquid culture and also decreased the severity and duration of diarrhea in piglets during experimental ETEC infection as well as pathogen numbers on the last day of in vivo experimentation. While the extents of the probiotic effect were different for each strain, the cocktail of all three strains showed the most pronounced beneficial effects, suggesting synergy between the tested E. coli strains. IMPORTANCE Increasing levels of antibiotic resistance among bacteria also increase the need for alternatives to conventional antibiotic treatment. Pathogenic Escherichia coli represents a major diarrheic infectious agent of piglets in their postweaning period; however, available measures to control these infections are limited. This study describes three novel E. coli strains producing antimicrobial compounds (bacteriocins) that actively inhibit a majority of toxigenic E. coli strains. The beneficial effect of three potentially probiotic E. coli strains was demonstrated under both in vitro and in vivo conditions. 
The novel probiotic candidates may be used as prophylaxis during piglets' postweaning period to overcome common infections caused by E. coli.
- MeSH
- Bacterial Toxins * metabolism MeSH
- Bacteriocins metabolism therapeutic use MeSH
- Escherichia coli * drug effects genetics metabolism MeSH
- Virulence Factors genetics MeSH
- Feces microbiology MeSH
- Escherichia coli Infections microbiology prevention & control veterinary MeSH
- Swine Diseases microbiology prevention & control MeSH
- Swine MeSH
- Probiotics therapeutic use MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Randomized Controlled Trial, Veterinary MeSH
SARS-CoV-2 is an intensively investigated virus from the order Nidovirales (Coronaviridae family) that causes COVID-19 disease in humans. Through enormous scientific effort, thousands of viral strains have been sequenced to date, thereby creating a strong background for deep bioinformatics studies of the SARS-CoV-2 genome. In this study, we inspected high-frequency mutations of SARS-CoV-2 and carried out systematic analyses of their overlay with inverted repeat (IR) loci and CpG islands. The main conclusion of our study is that SARS-CoV-2 hot-spot mutations are significantly enriched within both IRs and CpG island loci. This points to their role in genomic instability and may predict further mutational drive of the SARS-CoV-2 genome. Moreover, CpG islands are strongly enriched upstream from viral ORFs and thus could play important roles in transcription and the viral life cycle. We hypothesize that hypermethylation of these loci will decrease the transcription of viral ORFs and could therefore limit the progression of the disease.
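The abstract above does not state which criteria were used to call CpG islands. As a hedged illustration, the sketch below computes the two quantities used in the widely cited Gardiner-Garden and Frommer definition: GC content and the observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). Under that definition a region is usually called an island when GC content exceeds 50% and the ratio exceeds 0.6 over at least 200 bp. The sequence is an invented toy string and the function names are hypothetical.

```python
def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG dinucleotide is possible without both bases
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)


# Hypothetical 10-base window, far shorter than a real CpG island.
toy_window = "ACGCGTTCGA"
toy_gc = gc_content(toy_window)
toy_ratio = cpg_observed_expected(toy_window)
```

In practice these two functions would be evaluated over sliding windows along the genome, and the resulting island coordinates compared with the positions of high-frequency mutations.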
- MeSH
- COVID-19 virology MeSH
- CpG Islands * MeSH
- Genome, Viral MeSH
- Humans MeSH
- DNA Methylation MeSH
- Mutation * MeSH
- SARS-CoV-2 genetics MeSH
- Protein Binding MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Early detection and mitigation of disease recurrence in non-small cell lung cancer (NSCLC) patients is a nontrivial problem that is typically addressed by rather generic follow-up screening guidelines, self-reporting, simple nomograms, or models that predict relapse risk in individual patients using statistical analysis of retrospective data. We posit that machine learning models trained on patient data can provide an alternative approach that allows for more efficient development of many complementary models at once, superior accuracy, less dependency on the data collection protocols and increased support for explainability of the predictions. In this preliminary study, we describe an experimental suite of various machine learning models applied to a cohort of 2442 patients with early-stage NSCLC. We discuss the promising results achieved, as well as the lessons we learned while developing this baseline for further, more advanced studies in this area.
- MeSH
- Humans MeSH
- Lung Neoplasms * diagnosis MeSH
- Carcinoma, Non-Small-Cell Lung * diagnosis pathology MeSH
- Nomograms MeSH
- Prognosis MeSH
- Retrospective Studies MeSH
- Neoplasm Staging MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Air pollution has been linked to increased mortality and morbidity. Program 4 of the Healthy Aging in Industrial Environment study investigates whether the health and wellbeing benefits of physical activity (PA) can be fully realized in individuals living in highly polluted environments. Herein, we introduce the behavioral, psychological and neuroimaging protocol of the study. METHODS: This is a prospective cohort study of N = 1500 individuals aged 18-65 years comparing: (1) individuals living in the highly polluted, industrial region surrounding the city of Ostrava (n = 750), and (2) controls from the comparison region with relatively low pollution levels in Southern Bohemia (n = 750). Quota sampling is used to obtain samples balanced on age, gender, and PA status (60% active runners vs. 40% insufficiently active). Participants are screened and complete baseline assessments through online questionnaires and in-person lab-based assessments of physiological, biomechanical, neuroimaging and cognitive function parameters. Prospective 12-month intensive monitoring of air pollution and behavioral parameters (PA, inactivity, and sleep) follows, with a focus on PA-related injuries and psychological factors through fitness trackers, smartphones, and mobile apps. Subsequently, there will be a 5-year follow-up of the study cohort. DISCUSSION: The design of the study will allow for (1) the assessment of both short-term variation and long-term change in behavioral parameters, (2) evaluation of the incidence of musculoskeletal injuries and psychological factors impacting behavior and injury recovery, and (3) assessment of the impact that air pollution status (and change) has on behavior, psychological resilience, and injury recovery.
Furthermore, the integration of MRI techniques and cognitive assessment in combination with data on behavioral, biological and environmental variables will provide an opportunity to examine brain structure and cognitive function in relation to health behavior and air pollution, as well as other factors affecting resilience against and vulnerability to adverse changes in brain structure and cognitive aging. This study will help inform individuals about personal risk factors and decision-makers about the impact of environmental factors on negative health outcomes and potential underlying biological, behavioral and psychological mechanisms. Challenges and opportunities stemming from the timing of the study that coincided with the COVID-19 pandemic are also discussed.
- MeSH
- COVID-19 MeSH
- Exercise * MeSH
- Adult MeSH
- Cognition physiology MeSH
- Air Pollutants analysis MeSH
- Middle Aged MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Brain diagnostic imaging physiology MeSH
- Neuroimaging MeSH
- Prospective Studies MeSH
- Surveys and Questionnaires MeSH
- Resilience, Psychological MeSH
- Pyrimidines chemistry MeSH
- Aged MeSH
- Research Design MeSH
- Health Behavior MeSH
- Healthy Aging MeSH
- Air Pollution adverse effects MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH