Dataset
Neurodegenerative pathologies such as Parkinson's Disease (PD) produce marked distortions in speech, affecting fluency, prosody, articulation and phonation. Classically, measurements based on articulation gestures altering formant positions, such as the Vocal Space Area (VSA) or the Formant Centralization Ratio (FCR), have been proposed to measure speech distortion, but these markers are based mainly on static positions of sustained vowels. The present study introduces a measurement based on the mutual information distance among probability density functions of kinematic correlates derived from formant dynamics. An absolute kinematic velocity associated with the position of the jaw and tongue articulation gestures is estimated and modeled statistically. The distribution of this feature may differentiate PD patients from normative speakers during sustained vowel emission. The study is based on a limited database of 53 male PD patients, contrasted with a carefully selected and stable set of eight normative speakers. In this sense, distances based on Kullback-Leibler divergence seem to be sensitive to PD articulation instability. Correlation studies show statistically relevant relationships between information contents based on articulation instability and certain motor and nonmotor clinical scores, such as freezing of gait or sleep disorders. Remarkably, one of the statistically relevant correlations points to the time elapsed since the first diagnosis. These results stress the need to define scoring scales specifically designed for speech disability estimation and monitoring methodologies in degenerative diseases of neuromotor origin.
- MeSH
- Biomechanical Phenomena physiology MeSH
- Jaw physiopathology MeSH
- Datasets as Topic MeSH
- Dysarthria etiology physiopathology MeSH
- Tongue physiopathology MeSH
- Middle Aged MeSH
- Humans MeSH
- Parkinson Disease complications diagnosis MeSH
- Articulation Disorders etiology physiopathology MeSH
- Aged MeSH
- Severity of Illness Index MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Publication type
- Journal Article MeSH
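The Parkinson's Disease record above relies on distances based on Kullback-Leibler divergence between distributions of a kinematic feature. The following is a minimal sketch of the discrete form of that divergence, assuming the two distributions have already been binned into histograms over the same bins. The function name, the `eps` floor, and the example histograms are illustrative assumptions and are not taken from the study.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete Kullback-Leibler divergence D(p || q), in nats.

    p and q are non-negative histograms over the same bins; they are
    normalized here so that each sums to 1. Bins where p is zero
    contribute nothing. Bins where q is zero are floored at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("histograms must have positive total mass")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / p_sum, q_i / q_sum
        if p_i > 0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical velocity histograms (counts per bin), for illustration only.
normative = [40, 30, 20, 10]
patient = [10, 20, 30, 40]
print(kl_divergence(patient, normative))
```

The divergence is zero for identical distributions and is not symmetric, so D(p || q) and D(q || p) generally differ.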
PURPOSE: Despite the improvement of therapeutic regimens, several patients with multiple myeloma (MM) still experience early relapse (ER). This subset of patients currently represents an unmet medical need. EXPERIMENTAL DESIGN: We pooled data from seven European multicenter phase II/III clinical trials enrolling 2,190 patients with newly diagnosed MM from 2003 to 2017. Baseline patient evaluation included 14 clinically relevant features. Patients with complete data (n = 1,218) were split into training (n = 844) and validation sets (n = 374). In the training set, a univariate analysis and a multivariate logistic regression model on ER within 18 months (ER18) were performed. The most accurate model was selected on the validation set. We also developed a dynamic version of the score by including response to treatment. RESULTS: The Simplified Early Relapse in Multiple Myeloma (S-ERMM) score was modeled on six features weighted by a score: 5 points for high lactate dehydrogenase or t(4;14); 3 for del17p, abnormal albumin, or bone marrow plasma cells >60%; and 2 for λ free light chain. The S-ERMM identified three patient groups with different risks of ER18: Intermediate (Int) versus Low (OR = 2.39, P < 0.001) and High versus Low (OR = 5.59, P < 0.001). S-ERMM High/Int patients had significantly shorter overall survival (High vs. Low: HR = 3.24, P < 0.001; Int vs. Low: HR = 1.86, P < 0.001) and progression-free survival-2 (High vs. Low: HR = 2.89, P < 0.001; Int vs. Low: HR = 1.76, P < 0.001) than S-ERMM Low. The Dynamic S-ERMM (DS-ERMM) modulated the prognostic power of the S-ERMM. CONCLUSIONS: On the basis of simple, widely available baseline features, the S-ERMM and DS-ERMM properly identified patients with different risks of ER and survival outcomes.
- MeSH
- Time Factors MeSH
- Datasets as Topic MeSH
- Middle Aged MeSH
- Humans MeSH
- Survival Rate MeSH
- Multiple Myeloma mortality therapy MeSH
- Prognosis MeSH
- Recurrence MeSH
- Aged MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Aged MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Validation Study MeSH
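The point weights listed in the multiple myeloma record above can be summed as in the sketch below. It assumes that each of the six features contributes its points independently when present. The dictionary keys and the function name are hypothetical labels chosen for this illustration. The abstract does not state the score cutoffs that separate the Low, Intermediate, and High groups, so the sketch returns only the point total and does not assign a risk group.

```python
# Point weights as stated in the abstract; key names are hypothetical.
S_ERMM_POINTS = {
    "high_ldh": 5,          # high lactate dehydrogenase
    "t_4_14": 5,            # t(4;14)
    "del17p": 3,            # del17p
    "abnormal_albumin": 3,  # abnormal albumin
    "bmpc_over_60": 3,      # bone marrow plasma cells > 60%
    "lambda_flc": 2,        # lambda free light chain
}


def s_ermm_points(features):
    """Sum the S-ERMM points for the features flagged True.

    features maps feature names (keys of S_ERMM_POINTS) to booleans;
    missing features count as absent. Unknown names raise ValueError.
    """
    unknown = set(features) - set(S_ERMM_POINTS)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return sum(
        points
        for name, points in S_ERMM_POINTS.items()
        if features.get(name, False)
    )


# Hypothetical patient with high LDH and lambda free light chain: 5 + 2 points.
print(s_ermm_points({"high_ldh": True, "lambda_flc": True}))  # prints 7
```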
Retinopathy of prematurity (ROP) represents a vasoproliferative disease, especially in newborns and infants, which can potentially affect and damage the vision. Despite recent advances in neonatal care and medical guidelines, ROP still remains one of the leading causes of worldwide childhood blindness. The paper presents a unique dataset of 6,004 retinal images of 188 newborns, most of whom are premature infants. The dataset is accompanied by the anonymized patients' information from the ROP screening acquired at the University Hospital Ostrava, Czech Republic. Three digital retinal imaging camera systems are used in the study: Clarity RetCam 3, Natus RetCam Envision, and Phoenix ICON. The study is enriched by the software tool ReLeSeT which is aimed at automatic retinal lesion segmentation and extraction from retinal images. Consequently, this tool enables computing geometric and intensity features of retinal lesions. Also, we publish a set of pre-processing tools for feature boosting of retinal lesions and retinal blood vessels for building classification and segmentation models in ROP analysis.
- MeSH
- Humans MeSH
- Infant, Premature * MeSH
- Infant, Newborn MeSH
- Image Processing, Computer-Assisted MeSH
- Retina * diagnostic imaging MeSH
- Retinopathy of Prematurity * diagnostic imaging MeSH
- Check Tag
- Humans MeSH
- Infant, Newborn MeSH
- Publication type
- Journal Article MeSH
- Dataset MeSH
- Geographic names
- Czech Republic MeSH
We performed a search to identify available wearable sensor systems that can collect patient health data and have data sharing capabilities. The findings are available in "Wearable sensors with possibilities for data exchange: Analyzing status and needs of different actors in mobile health monitoring systems" [1]. We performed an initial search of the Vandrico wearable database and supplemented the resulting device list with an internet search. In addition to relevant metadata for each device (such as name, description, manufacturer, and web link), we also collected data on 13 attributes related to data exchange: device type, communication interface, data transfer protocol, smartphone and/or PC integration, direct integration with an open health platform, 3rd platform integration with an open health platform, support for health care system/middleware connection, recorded health data types, integrated sensors, medical device certification, whether or not the user can access collected data, device developer access, and device availability on the market. In addition, we grouped the devices into three groups of actors for which they are relevant: electronic health record providers, software developers, and patients. The collected data can be used as an overview of available devices for future researchers with an interest in the mobile health (mHealth) area.
- Publication type
- Journal Article MeSH
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting structure features and functional annotations at residue-level from standard publication text. Our approach involves the integration of data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, and LitSuggest and HuggingFace models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we utilized to train a starting PubmedBert model from HuggingFace. Using a human-in-the-loop annotation system, we iteratively developed the best model with commendable performance metrics of 0.90 for precision, 0.92 for recall, and 0.91 for F1-measure. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
- MeSH
- Databases, Protein MeSH
- Humans MeSH
- Proteins * chemistry MeSH
- Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Dataset MeSH
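The F1-measure reported in the protein annotation record above is the harmonic mean of precision and recall, so the three reported figures can be checked against one another. The short sketch below does this with the rounded values from the abstract; the function name is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract: precision 0.90, recall 0.92.
print(round(f1_score(0.90, 0.92), 2))  # prints 0.91
```

With the reported precision and recall, the harmonic mean rounds to 0.91, which agrees with the F1-measure given in the abstract.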
Background: Activities of Daily Living (ADLs) are essential tasks performed at home and used in healthcare to monitor sedentary behavior, track rehabilitation therapy, and monitor chronic obstructive pulmonary disease. The Barthel Index, used by healthcare professionals, has limitations due to its subjectivity. Human activity recognition (HAR), which uses Information and Communication Technologies (ICTs), offers a more accurate way to assess ADLs. This work aims to create a single, adaptable, and heterogeneous ADL dataset that integrates information from various sources, ensuring a rich representation of different individuals and environments. Methods: A literature review was conducted in Scopus, the University of California Irvine (UCI) Machine Learning Repository, Google Dataset Search, and the University of Cauca Repository to obtain datasets related to ADLs. Inclusion criteria were defined, and a list of dataset characteristics was compiled to support the integration of multiple datasets. Twenty-nine datasets were identified, including data from various accelerometers, gyroscopes, inclinometers, and heart rate monitors. The datasets identified in the review were classified and analyzed. Tasks such as dataset selection, categorization, analysis, cleaning, normalization, and data integration were performed. Results: The resulting unified dataset contained 238,990 samples, 56 activities, and 52 columns. The integrated dataset features a wealth of information from diverse individuals and environments, improving its adaptability for various applications. Conclusions: The dataset can be used in various data science projects related to ADL and HAR, and due to the integration of diverse data sources, it is potentially useful in addressing bias in and improving the generalizability of machine learning models.
- Publication type
- Journal Article MeSH
Registration of laser scans, or point clouds in general, is a crucial step of localization and mapping with mobile robots or in object modeling pipelines. A coarse alignment of the point clouds is generally needed before applying local methods such as the Iterative Closest Point (ICP) algorithm. We propose a feature-based approach to point cloud registration and evaluate the proposed method and its individual components on challenging real-world datasets. For a moderate overlap between the laser scans, the method provides a superior registration accuracy compared to state-of-the-art methods including Generalized ICP, 3D Normal-Distribution Transform, Fast Point-Feature Histograms, and 4-Points Congruent Sets. Compared to the surface normals, the points as the underlying features yield higher performance in both keypoint detection and establishing local reference frames. Moreover, sign disambiguation of the basis vectors proves to be an important aspect in creating repeatable local reference frames. A novel method for sign disambiguation is proposed which yields highly repeatable reference frames.
- MeSH
- Algorithms MeSH
- Datasets as Topic * MeSH
- Robotics MeSH
- Publication type
- Journal Article MeSH
Alterations in DNA methylation profiles are among the important mechanisms in cancer development, and their assessment can be utilized for rapid and precise diagnostics. Therefore, establishing datasets of methylation profiles can improve and deepen our understanding of the role of epigenetic changes in cancer development as well as improve our diagnostic capabilities. For this dataset, we generated NGS data for 189 samples of pediatric CNS, soft tissue, and bone tumors. The sequencing libraries were prepared using methyl capture bisulfite sequencing, an effective compromise between whole-genome bisulfite sequencing and array-based methods with a more limited scope of target regions. The larger part of the cohort was processed with the Agilent SureSelectXT Human Methyl-Seq kit (149 samples) and the rest with the Illumina TruSeq Methyl Capture EPIC Library Prep Kit (40 samples). The data presented in this article may help other researchers further elucidate the importance of methylation in diagnosing pediatric CNS, soft tissue, and bone tumors.
- Publication type
- Journal Article MeSH
Parkinson's disease dysgraphia (PDYS), one of the earliest signs of Parkinson's disease (PD), has been researched as a promising biomarker of PD and as the target of a noninvasive and inexpensive approach to monitoring the progress of the disease. However, although several approaches to supportive PDYS diagnosis have been proposed (mainly based on handcrafted features (HF) extracted from online handwriting or the utilization of deep neural networks), it remains unclear which approach provides the highest discrimination power and how these approaches can be transferred between different datasets and languages. This study aims to compare classification performance based on two types of features: features automatically extracted by a pretrained convolutional neural network (CNN) and HF designed by human experts. Both approaches are evaluated on a multilingual dataset collected from 143 PD patients and 151 healthy controls in the Czech Republic, United States, Colombia, and Hungary. The subjects performed the spiral drawing task (SDT; a language-independent task) and the sentence writing task (SWT; a language-dependent task). Models based on logistic regression and gradient boosting were trained in several scenarios, specifically single language (SL), leave one language out (LOLO), and all languages combined (ALC). We found that the HF slightly outperformed the CNN-extracted features in all considered evaluation scenarios for the SWT. In detail, the following balanced accuracy (BACC) scores were achieved: SL: 0.65 (HF), 0.58 (CNN); LOLO: 0.65 (HF), 0.57 (CNN); and ALC: 0.69 (HF), 0.66 (CNN). However, in the case of the SDT, features extracted by a CNN provided competitive results: SL: 0.66 (HF), 0.62 (CNN); LOLO: 0.56 (HF), 0.54 (CNN); and ALC: 0.60 (HF), 0.60 (CNN). In summary, regarding the SWT, the HF outperformed the CNN-extracted features by over 6% (mean BACC of 0.66 for HF, and 0.60 for CNN). In the case of the SDT, both feature sets provided almost identical classification performance (mean BACC of 0.60 for HF, and 0.58 for CNN).
- Publication type
- Journal Article MeSH
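The dysgraphia record above reports balanced accuracy (BACC), which for a two-class problem is the mean of sensitivity and specificity. A minimal sketch of that definition follows, assuming counts from a binary confusion matrix with PD patients as the positive class. The function name and the example counts are hypothetical and do not come from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Balanced accuracy for binary classification.

    tp, fn: positives classified correctly / incorrectly.
    tn, fp: negatives classified correctly / incorrectly.
    Returns the mean of sensitivity and specificity.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one sample")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2


# Hypothetical counts, for illustration only.
print(balanced_accuracy(tp=50, fn=50, tn=80, fp=20))
```

Unlike plain accuracy, this measure weights both classes equally, which is useful when group sizes differ, as with the 143 patients and 151 controls in the record above.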