Statistical challenges of big data analysis in medicine
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. Personalized therapies and precision medicine have now arrived, with huge data sets used not only for experimental and research approaches but also in the "real world". Analysis of real-world data requires the development of legal, procedural, and technical infrastructure. Integrating all clinical data sets for any given patient is necessary to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines for making data findable, accessible, interoperable, and reusable for the research community. Federated learning, standards, and ontologies help improve the robustness of artificial intelligence algorithms working on big data and increase trust in these algorithms. With big data, univariate statistical approaches give way to multivariate methods, significantly expanding what the data can reveal. Combining multiple omics yields previously unsuspected information and deepens the understanding of scientific questions, an approach also known as systems biology. Big data and artificial intelligence also offer laboratories and the In Vitro Diagnostic industry opportunities to optimize laboratory productivity, the quality of laboratory results, and ultimately patient outcomes, through tools such as predictive maintenance and the "moving average" of aggregated patient results.
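The patient-based "moving average" mentioned above lends itself to a compact illustration. Below is a minimal sketch, not the validated procedure of any particular laboratory: it assumes a fixed block size, a target mean estimated from a stable baseline, and an alarm limit expressed in standard errors, all hypothetical parameters that a real deployment would tune per analyte.

```python
import numpy as np

def moving_average_qc(results, window=50, target=None, limit=3.0):
    """Flag analyzer drift from a moving average of patient results.

    results : 1-D array of consecutive patient results for one analyte
    window  : number of results per moving-average block (hypothetical default)
    target  : expected mean under stable operation; estimated from the
              first block if not supplied
    limit   : alarm threshold in standard errors of the block mean
    """
    results = np.asarray(results, dtype=float)
    if target is None:
        target = results[:window].mean()
    se = results[:window].std(ddof=1) / np.sqrt(window)

    alarms = []
    for start in range(0, len(results) - window + 1):
        block_mean = results[start:start + window].mean()
        if abs(block_mean - target) > limit * se:
            # Record the index of the last result in the offending block.
            alarms.append((start + window, block_mean))
    return alarms

# Example: a simulated assay that drifts upward after result 500.
rng = np.random.default_rng(0)
stable = rng.normal(5.0, 0.5, 500)
drifted = rng.normal(5.4, 0.5, 500)
print(moving_average_qc(np.concatenate([stable, drifted]))[:3])
```

Because the block mean is compared against a baseline derived from patient results rather than control material, such a scheme can detect analyzer drift between scheduled quality-control runs, which is the point of the approach the abstract alludes to.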
- Keywords
- artificial intelligence, big data, data science, patient outcomes, personalized healthcare, precision medicine
- MeSH
- Algorithms MeSH
- Big Data * MeSH
- Precision Medicine methods MeSH
- Humans MeSH
- Delivery of Health Care MeSH
- Artificial Intelligence * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
The amount of data available for clinical decision support is growing not only rapidly but also much faster than our ability to analyze and interpret it. Thus, the potential of the data to contribute to determining the diagnosis, therapy, and prognosis of an individual patient is not appropriately exploited. Any hope of obtaining benefit from the data for an individual patient must be accompanied by reliable and diligent biostatistical analysis, which faces serious challenges that are not always clear to non-statisticians. The aim of this paper is to discuss principles of the statistical analysis of big data in research and routine applications in clinical medicine, focusing on particular aspects of psychiatry. The paper argues that biostatistical analysis of data in a specialty field requires approaches and experience different from those of other clinical fields. This is illustrated by a description of common complications in the analysis of psychiatric data. Challenges of big data analysis in both psychiatric research and routine practice are explained; such analysis is far from a routine service activity exploiting standard methods of multivariate statistics and/or machine learning. Research questions central to current psychiatric research are presented and discussed from the biostatistical point of view.
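The abstract does not enumerate the complications it discusses, but one challenge that any big-data analysis in psychiatry must face when screening many symptom-outcome associations at once is multiplicity. As an illustrative aside (not taken from the paper), the sketch below applies the standard Benjamini-Hochberg false-discovery-rate procedure to a vector of p-values.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of p-values significant at FDR level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Largest k with p_(k) <= (k/m) * alpha; reject hypotheses ranked 1..k.
    below = ranked <= (np.arange(1, m + 1) / m) * alpha
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Example: 95 null tests mixed with 5 genuine effects.
rng = np.random.default_rng(1)
pvals = np.concatenate([rng.uniform(0, 1, 95), rng.uniform(0, 1e-4, 5)])
print(benjamini_hochberg(pvals).sum(), "rejections at FDR 0.05")
```

This is of course only one of many issues; the paper's broader argument is that such off-the-shelf corrections do not exhaust the statistical care that specialty-specific data demand.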
- Keywords
- big data, biostatistics, psychiatry, decision support
- MeSH
- Data Analysis * MeSH
- Humans MeSH
- Prognosis MeSH
- Psychiatry * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
Achieving reliable and accurate biomedical image segmentation is a long-standing problem. In order to train or adapt segmentation methods or to measure their performance, reference segmentation masks are required. Gold-standard annotations, i.e., human-origin reference annotations, are usually used as the reference, although they are very hard to obtain. The increasing size of the acquired image data, high dimensionality such as 3D or 3D + time, limited human expert time, and annotator variability typically result in sparsely annotated gold-standard datasets. Reliable silver-standard annotations, i.e., computer-origin reference annotations, are needed to provide dense segmentation annotations by fusing multiple computer-origin segmentation results. The produced dense silver-standard annotations can then either be used as reference annotations directly or be converted into gold-standard ones with much lighter manual curation, which saves experts' time significantly. We propose a novel full-resolution multi-rater fusion convolutional neural network (CNN) architecture for fusing biomedical image segmentation masks, called DeepFuse, which contains no down-sampling layers. Operating at full resolution throughout enables DeepFuse to fully benefit from the enormous feature extraction capabilities of CNNs. DeepFuse outperforms the popular and commonly used fusion methods STAPLE, SIMPLE, and other majority-voting-based approaches with statistical significance on a wide range of benchmark datasets, as demonstrated on the challenging task of 2D and 3D cell and cell-nucleus instance segmentation across a wide range of microscopy modalities, magnifications, cell shapes, and densities. A remarkable feature of the proposed method is that it can apply specialized post-processing to the segmentation masks of each rater separately and recover under-segmented object parts during the refinement phase even when the majority of inputs vote otherwise. Thus, DeepFuse takes a big step towards fast and reliable computer-origin segmentation annotations for biomedical images.
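The abstract describes DeepFuse only at the level of its full-resolution design, so the sketch below does not reimplement the proposed CNN; it shows the majority-voting baseline the method is compared against, with ties broken toward background as an arbitrary convention of this sketch.

```python
import numpy as np

def majority_vote(masks):
    """Fuse binary segmentation masks from several raters by majority vote.

    masks : array of shape (n_raters, H, W) with values in {0, 1};
            a strict majority of foreground votes yields foreground,
            so ties fall back to background.
    """
    masks = np.asarray(masks)
    votes = masks.sum(axis=0)            # per-pixel count of foreground votes
    return (votes > masks.shape[0] / 2).astype(np.uint8)

# Example: three raters disagree on two pixels of a 2x2 image.
r1 = np.array([[1, 0], [1, 1]])
r2 = np.array([[1, 0], [0, 1]])
r3 = np.array([[1, 1], [0, 1]])
print(majority_vote(np.stack([r1, r2, r3])))
# [[1 0]
#  [0 1]]
```

A per-pixel vote like this cannot, by construction, recover an object part that most raters missed, which is exactly the limitation the abstract says DeepFuse overcomes during its refinement phase.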
- MeSH
- Humans MeSH
- Neural Networks, Computer * MeSH
- Image Processing, Computer-Assisted * methods MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH