BACKGROUND: The Big Multiple Sclerosis Data (BMSD) network (https://bigmsdata.org) was initiated in 2014 and includes the national multiple sclerosis (MS) registries of the Czech Republic, Denmark, France, Italy, and Sweden as well as the international MSBase registry. BMSD has addressed the ethical, legal, technical, and governance-related challenges for data sharing and has so far published three scientific papers on pooled datasets as proof of concept for its collaborative design. DATA COLLECTION: Although BMSD registries operate independently on different platforms, similarities in variables, definitions and data structure allow joint analysis of data. Certain coordinated modifications in how the registries collect adverse event data have been implemented after BMSD consensus decisions, showing the ability to develop together. DATA MANAGEMENT: Scientific projects can be proposed by external sponsors via the coordinating centre, and each registry decides independently on participation, respecting its governance structure. Research datasets are established on a project-by-project basis, and a project-specific data model is developed, based on a unifying core data model. To overcome challenges in data sharing, BMSD has developed procedures for federated data analysis. FUTURE PERSPECTIVES: Presently, BMSD is seeking a qualification opinion from the European Medicines Agency (EMA) to conduct post-authorization safety studies (PASS) and aims to pursue a qualification opinion also for post-authorization effectiveness studies (PAES). BMSD aspires to promote the advancement of real-world evidence research in the MS field.
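The abstract does not describe how BMSD's federated analysis procedures work. As a rough illustration of the general idea only, the sketch below shows one way registries could share aggregate statistics instead of patient rows and still obtain the same pooled mean and standard deviation as a centralized analysis. The names `SiteSummary`, `summarize`, and `pool` are hypothetical and are not taken from BMSD.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SiteSummary:
    """Aggregate statistics a registry shares instead of patient-level rows."""
    n: int        # number of patients at the site
    mean: float   # site-level mean of the variable
    m2: float     # sum of squared deviations from the site mean


def summarize(values):
    """Run locally at each registry; only the summary leaves the site."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return SiteSummary(n, mean, m2)


def pool(summaries):
    """Combine site summaries into the overall n, mean, and sample SD."""
    n_total = sum(s.n for s in summaries)
    mean_total = sum(s.n * s.mean for s in summaries) / n_total
    # Total sum of squares = within-site part + between-site part.
    m2_total = sum(s.m2 + s.n * (s.mean - mean_total) ** 2 for s in summaries)
    return n_total, mean_total, sqrt(m2_total / (n_total - 1))


# Hypothetical values for two registries.
site_a = summarize([1.0, 2.0, 3.0])
site_b = summarize([4.0, 5.0, 6.0, 7.0])
n_total, pooled_mean, pooled_sd = pool([site_a, site_b])
```

The pooled result equals what a direct computation on the seven combined values would give, which is the property that makes this kind of summary exchange attractive when patient-level data cannot leave a registry.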
- Keywords
- Multiple sclerosis, Patient data network, Patient registries, Real-world evidence
- MeSH
- big data MeSH
- humans MeSH
- international cooperation MeSH
- registries * MeSH
- multiple sclerosis * epidemiology therapy MeSH
- information dissemination MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Research on animal venoms and their components spans multiple disciplines, including biology, biochemistry, bioinformatics, pharmacology, medicine, and more. Manipulating and analyzing the diverse array of data required for venom research can be challenging, and relevant tools and resources are often dispersed across different online platforms, making them less accessible to nonexperts. In this article, we address the multifaceted needs of the scientific community involved in venom and toxin-related research by identifying and discussing web resources, databases, and tools commonly used in this field. We have compiled these resources into a comprehensive table available on the VenomZone website (https://venomzone.expasy.org/10897). Furthermore, we highlight the challenges currently faced by researchers in accessing and using these resources and emphasize the importance of community-driven interdisciplinary approaches. We conclude by underscoring the significance of enhancing standards, promoting interoperability, and encouraging data and method sharing within the venom research community.
- Keywords
- antivenom, drug discovery, genomics, machine learning, peptidomics, proteomics, toxin databases, transcriptomics, venom resources
- MeSH
- big data * MeSH
- databases, factual MeSH
- internet * MeSH
- computational biology * methods MeSH
- venoms * MeSH
- animals MeSH
- Check Tag
- animals MeSH
- Publication type
- journal article MeSH
- Substance names
- venoms * MeSH
Large-scale genotype and phenotype data have been increasingly generated to identify genetic markers, understand gene function and evolution and facilitate genomic selection. These datasets hold immense value for both current and future studies, as they are vital for crop breeding, yield improvement and overall agricultural sustainability. However, integrating these datasets from heterogeneous sources presents significant challenges and hinders their effective utilization. We established the Genotype-Phenotype Working Group in November 2021 as a part of the AgBioData Consortium (https://www.agbiodata.org) to review current data types and resources that support archiving, analysis and visualization of genotype and phenotype data to understand the needs and challenges of the plant genomic research community. For 2021-22, we identified different types of datasets and examined metadata annotations related to experimental design, methods, sample collection, etc. Furthermore, we thoroughly reviewed publicly funded repositories for raw and processed data as well as secondary databases and knowledgebases that enable the integration of heterogeneous data in the context of the genome browser, pathway networks and tissue-specific gene expression. Based on our survey, we identify a need for (i) additional infrastructural support for archiving many new data types, (ii) development of community standards for data annotation and formatting, (iii) resources for biocuration and (iv) analysis and visualization tools to connect genotype data with phenotype data to enhance knowledge synthesis and to foster translational research. Although this paper only covers the data and resources relevant to the plant research community, we expect that similar issues and needs are shared by researchers working on animals. Database URL: https://www.agbiodata.org.
BACKGROUND: Recent advances in data-driven computational approaches have been helpful in devising tools to objectively diagnose psychiatric disorders. However, current machine learning studies are limited to small homogeneous samples and use different methodologies and imaging collection protocols, which limits the ability to directly compare and generalize their results. Here we aimed to classify individuals with PTSD versus controls and assess the generalizability using a large heterogeneous brain dataset from the ENIGMA-PGC PTSD Working Group. METHODS: We analyzed brain MRI data from 3,477 structural-MRI; 2,495 resting state-fMRI; and 1,952 diffusion-MRI. First, we identified the brain features that best distinguish individuals with PTSD from controls using traditional machine learning methods. Second, we assessed the utility of the denoising variational autoencoder (DVAE) and evaluated its classification performance. Third, we assessed the generalizability and reproducibility of both models using a leave-one-site-out cross-validation procedure for each modality. RESULTS: We found lower performance in classifying PTSD vs. controls with data from over 20 sites (60 % test AUC for s-MRI, 59 % for rs-fMRI and 56 % for d-MRI), as compared to other studies run on single-site data. The performance increased when classifying PTSD from healthy controls (HC) without trauma history in each modality (75 % AUC). The classification performance remained intact when applying the DVAE framework, which reduced the number of features. Finally, we found that the DVAE framework achieved better generalization to unseen datasets compared with the traditional machine learning frameworks, although performance was only slightly above chance. CONCLUSION: These results have the potential to provide a baseline classification performance for PTSD when using large scale neuroimaging datasets. Our findings show that the control group used can heavily affect classification performance. The DVAE framework provided better generalizability for the multi-site data. This may be more significant in clinical practice since the neuroimaging-based diagnostic DVAE classification models are much less site-specific, rendering them more generalizable.
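The leave-one-site-out procedure named in the methods can be sketched as follows. This is a minimal illustration of the splitting logic only, not the authors' pipeline; the function name `leave_one_site_out` and the site labels are hypothetical.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) once per distinct site.

    `sites` holds one site label per sample, in sample order. Every sample
    from the held-out site goes to the test fold, so the model is always
    evaluated on a site it never saw during training.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical site labels for six scans from three sites.
folds = list(leave_one_site_out(["A", "A", "B", "C", "C", "B"]))
```

With three sites this yields three folds. scikit-learn's `LeaveOneGroupOut` splitter produces the same kind of partition when the site labels are passed as `groups`.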
- Keywords
- Classification, Deep learning, Machine learning, Multimodal MRI, Posttraumatic stress disorder
- MeSH
- big data MeSH
- humans MeSH
- magnetic resonance imaging methods MeSH
- brain diagnostic imaging MeSH
- neuroimaging MeSH
- stress disorders, post-traumatic * diagnostic imaging MeSH
- reproducibility of results MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Lipidomics, as a branch of metabolomics, provides unique information on the complex lipid profile in biological materials. In clinically focused studies, hundreds of lipids together with available clinical information have proved to be an effective tool in the discovery of biomarkers and understanding of pathobiochemistry. However, despite the introduction of lipidomics nearly twenty years ago, only dozens of big data studies using clinical lipidomics have been published to date. In this review, we discuss the lipidomics workflow, statistical tools, and the challenges of standardisation. The subsequent summary, divided into the major clinical areas of cardiovascular disease, cancer, diabetes mellitus, and neurodegenerative and liver diseases, demonstrates the importance of clinical lipidomics. These publications show the potential of lipidomics for prediction, diagnosis or finding new targets for the treatment of selected diseases. The first of these results have already been implemented in clinical practice in the field of cardiovascular diseases, while in other areas we can expect the application of the results summarized in this review in the near future.
- Keywords
- big data, clinical lipidomics, cohorts, large-scale
- MeSH
- big data MeSH
- biomarkers metabolism MeSH
- humans MeSH
- lipidomics * MeSH
- lipid metabolism MeSH
- metabolomics methods MeSH
- neoplasms * diagnosis MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- review MeSH
- Substance names
- biomarkers MeSH
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. The era of personalized therapies and precision medicine has now arrived, with huge data sets used not only for experimental and research approaches, but also in the "real world". Analysis of real world data requires development of legal, procedural and technical infrastructure. The integration of all clinical data sets for any given patient is important and necessary in order to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines to make data findable, accessible, interoperable and reusable to the research community. Federated learning, standards and ontologies are useful to improve robustness of artificial intelligence algorithms working on big data and to increase trust in these algorithms. When dealing with big data, the univariate statistical approach gives way to multivariate statistical methods, significantly shifting the potential of big data. Combining multiple omics gives previously unsuspected information and provides understanding of scientific questions, an approach which is also called the systems biology approach. Big data and artificial intelligence also offer opportunities for laboratories and the In Vitro Diagnostic industry to optimize the productivity of the laboratory, the quality of laboratory results and ultimately patient outcomes, through tools such as predictive maintenance and "moving average" based on the aggregate of patient results.
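The "moving average" mentioned at the end of the abstract can be illustrated with a short sketch: a rolling mean of consecutive patient results is compared against control limits, and a drift in the analytical system shows up as a mean outside those limits. The function name, window size, limits, and result values below are all hypothetical and chosen only for illustration; real procedures set them per analyte.

```python
from collections import deque


def moving_average_flags(results, window, lower, upper):
    """Return (index, mean) pairs where the rolling mean leaves [lower, upper].

    The mean is taken over the most recent `window` patient results and is
    only evaluated once a full window is available.
    """
    recent = deque(maxlen=window)  # oldest result drops out automatically
    flags = []
    for i, value in enumerate(results):
        recent.append(value)
        if len(recent) == window:
            mean = sum(recent) / window
            if not lower <= mean <= upper:
                flags.append((i, mean))
    return flags


# Hypothetical results that drift upward from the fifth value onward.
flags = moving_average_flags(
    [140, 141, 139, 140, 146, 147, 148], window=3, lower=138, upper=144
)
```

In this example the rolling mean first leaves the limits at index 5, one result after the shift begins, which shows the usual trade-off: a longer window smooths out noise but delays detection.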
- Keywords
- artificial intelligence, big data, data science, patient outcomes, personalized healthcare, precision medicine
- MeSH
- algorithms MeSH
- big data * MeSH
- precision medicine methods MeSH
- humans MeSH
- delivery of health care MeSH
- artificial intelligence * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Industrial Internet of Things (IIoT)-based systems have become an important part of industry consortium systems because of their rapid growth and wide-ranging application. Various physical objects that are interconnected in the IIoT network communicate with each other and simplify the process of decision-making by observing and analyzing the surrounding environment. While making such intelligent decisions, devices need to transfer and communicate data with each other. However, as devices involved in IIoT networks grow and the methods of connections diversify, the traditional security frameworks face many shortcomings, including vulnerabilities to attack, lags in data, sharing data, and lack of proper authentication. Blockchain technology has the potential to enable safe distribution of the big data generated by the IIoT. Prevailing data-sharing methods in blockchain concentrate only on data interchange among parties, not on the efficiency of sharing and storing. Hence an element-based K-harmonic means clustering algorithm (CA) is proposed for the effective sharing of data among the entities, along with an algorithm named underweight data block (UDB) for overcoming the obstacle of storage space. The performance metrics considered for the evaluation of the proposed framework are the sum of squared error (SSE), time complexity with respect to different m values, and storage complexity with CPU utilization. The experiments were run in the MATLAB 2018a simulation environment. The proposed blockchain-based model achieves better sharing and storing, making it appropriate for the IIoT.
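The abstract does not specify the "element-based" variant or the UDB algorithm, so neither is reproduced here. As a hedged illustration of the two quantities it does name, the sketch below computes the sum of squared error (SSE) and the standard K-harmonic means performance function, in which each point contributes the harmonic mean of its distances (raised to a power p) to all centers. The function names, the default p, and the sample points are assumptions.

```python
import math


def sse(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    return sum(min(math.dist(pt, c) ** 2 for c in centers) for pt in points)


def khm_objective(points, centers, p=2.0, eps=1e-12):
    """Standard K-harmonic means performance function.

    For each point: K / sum_j (1 / d_ij**p), where d_ij is the distance to
    center j. `eps` guards against division by zero when a point coincides
    with a center.
    """
    k = len(centers)
    total = 0.0
    for pt in points:
        inverse_sum = sum(1.0 / max(math.dist(pt, c) ** p, eps) for c in centers)
        total += k / inverse_sum
    return total


# Hypothetical two-cluster data with one center per cluster.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
sse_value = sse(points, centers)
khm_value = khm_objective(points, centers)
```

Because the harmonic mean is dominated by the smallest distance, every center receives some pull from every point, which is why K-harmonic means is generally described as less sensitive to initialization than K-means.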
Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging or infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing.
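The abstract states only that sites share numerical model updates; it does not say how those updates are aggregated. One common rule, assumed here purely for illustration, is federated averaging, in which a coordinator takes the sample-size-weighted mean of the parameter vectors returned by the sites. The function name and the numbers are hypothetical.

```python
def federated_average(updates):
    """Sample-size-weighted mean of per-site parameter vectors.

    `updates` is a list of (n_samples, weights) pairs, one per site. Only
    these numbers travel to the coordinator; the training data stay local.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * w[i] for n, w in updates) / total for i in range(dim)]


# Hypothetical updates from two sites with 100 and 300 cases.
consensus = federated_average([(100, [1.0, 0.0]), (300, [3.0, 4.0])])
```

Weighting by sample size lets larger sites contribute proportionally more to the consensus model, at the cost of smaller sites having less influence on it.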
- MeSH
- big data * MeSH
- glioblastoma * MeSH
- humans MeSH
- information dissemination MeSH
- machine learning MeSH
- rare diseases MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- Research Support, N.I.H., Extramural MeSH
BACKGROUND: The clinical effects of smartphone-based interventions for bipolar disorder (BD) have yet to be established. OBJECTIVES: To examine the efficacy of smartphone-based interventions in BD and how the included studies reported user-engagement indicators. METHODS: We conducted a systematic search on January 24, 2022, in PubMed, Scopus, Embase, APA PsycINFO, and Web of Science. We used random-effects meta-analysis to calculate the standardized difference (Hedges' g) in pre-post change scores between smartphone intervention and control conditions. The study was pre-registered with PROSPERO (CRD42021226668). RESULTS: The literature search identified 6034 studies. Thirteen articles fulfilled the selection criteria. We included seven RCTs and performed meta-analyses comparing the pre-post change in depressive and (hypo)manic symptom severity, functioning, quality of life, and perceived stress between smartphone interventions and control conditions. There was significant heterogeneity among studies and no meta-analysis reached statistical significance. Results were also inconclusive regarding affective relapses and psychiatric readmissions. All studies reported positive user-engagement indicators. CONCLUSION: We found no evidence that smartphone interventions reduce the severity of depressive or manic symptoms in BD. The high heterogeneity of studies supports the need for expert consensus on how studies should ideally be designed and on the use of more sensitive outcomes, such as affective relapses and psychiatric hospitalizations, as well as the quantification of mood instability. The ISBD Big Data Task Force provides preliminary recommendations to reduce the heterogeneity and achieve more valid evidence in the field.
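The two statistical steps named in the methods, Hedges' g and random-effects pooling, can be sketched as below. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator is assumed here as a common default. The function names and all input numbers are hypothetical.

```python
from math import sqrt


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance) for the standardized difference of two group means."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled effect, standard error, tau^2)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    return pooled, sqrt(1 / sum(w_star)), tau2


# Hypothetical change-score means and SDs for one trial, then three trial effects.
g, g_var = hedges_g(10.0, 2.0, 20, 8.0, 2.0, 20)
pooled, se, tau2 = random_effects([0.2, 0.2, 0.2], [0.1, 0.1, 0.1])
```

When the study effects are identical, as in the second example, the estimated between-study variance is zero and the random-effects result coincides with the fixed-effect one. High heterogeneity, as reported in the abstract, would instead inflate tau^2 and widen the pooled confidence interval.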
- Keywords
- bipolar disorder, efficacy, engagement, smartphone interventions, task force
- MeSH
- big data MeSH
- bipolar disorder * psychology MeSH
- smartphone * MeSH
- quality of life MeSH
- humans MeSH
- recurrence MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- meta-analysis MeSH
- research support (grant-funded work) MeSH
- review MeSH
- systematic review MeSH
Understanding animal movement is essential to elucidate how animals interact, survive, and thrive in a changing world. Recent technological advances in data collection and management have transformed our understanding of animal "movement ecology" (the integrated study of organismal movement), creating a big-data discipline that benefits from rapid, cost-effective generation of large amounts of data on movements of animals in the wild. These high-throughput wildlife tracking systems now allow more thorough investigation of variation among individuals and species across space and time, the nature of biological interactions, and behavioral responses to the environment. Movement ecology is rapidly expanding scientific frontiers through large interdisciplinary and collaborative frameworks, providing improved opportunities for conservation and insights into the movements of wild animals, and their causes and consequences.
- MeSH
- big data * MeSH
- spatio-temporal analysis MeSH
- behavior, animal * MeSH
- animals, wild physiology MeSH
- ecology * MeSH
- ecosystem MeSH
- animal migration MeSH
- movement * MeSH
- data collection MeSH
- environment * MeSH
- animals MeSH
- Check Tag
- animals MeSH
- Publication type
- journal article MeSH
- research support (grant-funded work) MeSH
- review MeSH