Dataset optimization
Differential Evolution (DE) is a widely used bio-inspired optimization algorithm developed by Storn and Price, popular for its simplicity and robustness. It was designed primarily for real-valued problems and continuous functions, but several modified versions for integer and discrete-valued problems have been developed. Discrete-coded DE has mostly been used for combinatorial problems in a set of enumerative variants. However, DE also has great potential in spatial data analysis and pattern recognition. This paper formulates the problem as a search for a combination of distinct vertices that meet specified conditions. It proposes a novel approach, Multidimensional Discrete Differential Evolution (MDDE), which applies the principle of discrete-coded DE to discrete point clouds (PCs). The paper examines the local search abilities of MDDE and its convergence to the global optimum in PCs. Multidimensional discrete vertices cannot simply be ordered to obtain a convenient course of the discrete data, which is crucial for good convergence of a population. A novel mutation operator that uses a linear ordering of spatial data based on space-filling curves is introduced. The algorithm is tested on several spatial datasets and optimization problems. The experiments show that MDDE is an efficient and fast method for discrete optimization in multidimensional point clouds.
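The idea of mutating along a linear ordering of the point cloud can be illustrated with a minimal sketch. It is not the authors' MDDE operator: the abstract does not say which space-filling curve is used, so a Z-order (Morton) curve is assumed here, each individual is simplified to a single vertex rather than a combination of vertices, and all function names and parameters (`morton_key`, `mutate_rank`, `discrete_de`, `f`, `pop_size`) are hypothetical.

```python
import random


def morton_key(point, bits=10):
    """Z-order key: interleave the bits of non-negative integer coordinates."""
    key = 0
    dims = len(point)
    for bit in range(bits):
        for d, coord in enumerate(point):
            key |= ((coord >> bit) & 1) << (bit * dims + d)
    return key


def order_cloud(points, bits=10):
    """Sort the point cloud along the (assumed) Z-order curve."""
    return sorted(points, key=lambda p: morton_key(p, bits))


def mutate_rank(r1, r2, r3, n, f=0.5):
    """DE/rand/1 applied to ranks in the linear ordering, rounded and clamped."""
    v = round(r1 + f * (r2 - r3))
    return max(0, min(n - 1, v))


def discrete_de(points, cost, pop_size=8, generations=60, f=0.5, seed=0):
    """Toy discrete DE: individuals are ranks of vertices in the ordered cloud."""
    rng = random.Random(seed)
    ordered = order_cloud(points)
    n = len(ordered)
    pop = [rng.randrange(n) for _ in range(pop_size)]
    for _ in range(generations):
        for i in range(pop_size):
            others = [r for j, r in enumerate(pop) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            trial = mutate_rank(r1, r2, r3, n, f)
            # Greedy selection: keep the trial only if it is not worse.
            if cost(ordered[trial]) <= cost(ordered[pop[i]]):
                pop[i] = trial
    best = min(pop, key=lambda r: cost(ordered[r]))
    return ordered[best]


if __name__ == "__main__":
    cloud = [(x, y) for x in range(8) for y in range(8)]
    target = (5, 2)  # hypothetical optimum for the demo
    print(discrete_de(cloud, lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2))
```

Because every mutant is a rank that is clamped to the valid range, the search never leaves the point cloud; how well neighbouring ranks correspond to neighbouring points depends on the chosen curve.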
A variety of models are available for the estimation of parameters of the human growth curve. Several have been widely and successfully used with longitudinal data that are reasonably complete. On the other hand, the modeling of data for a limited number of observation points is problematic and requires the interpolation of the interval between points and often an extrapolation of the growth trajectory beyond the range of empirical limits (prediction). This study tested a new approach for fitting a relatively limited number of longitudinal data using the normal variation of human empirical growth curves. First, functional principal components analysis was done for curve phase and amplitude using complete and dense data sets for a reference sample (Brno Growth Study). Subsequently, artificial curves were generated with a combination of 12 of the principal components and fitted to the newly analyzed data with the Levenberg-Marquardt optimization algorithm. The approach was tested on seven 5-points/year longitudinal data samples of adolescents extracted from the reference sample. The samples differed in their distance from the mean age at peak velocity for the sample and were tested by a permutation leave-one-out approach. The results indicated the potential of this method for growth modeling as a user-friendly tool for practical applications in pediatrics, auxology and youth sport.
- Publication type
- Journal Article MeSH
Researchers are actively applying deep learning and machine learning (ML) to identify patterns in large medical datasets and reliably diagnose cardiovascular disease (CVD). Training on large datasets while achieving highly accurate validation results is difficult. Furthermore, the increasing global prevalence of CVD makes early and precise diagnosis necessary. However, the growing complexity of healthcare datasets makes it challenging to detect relationships between features and produce precise predictions. To address these issues, the Intelligent Cardiovascular Disease Diagnosis based on Ant Colony Optimisation with Enhanced Deep Learning (ICVD-ACOEDL) model was developed. This model employs feature selection (FS) and hyperparameter optimization to diagnose CVD. Medical data are first normalized with a min-max scaler. The key feature that sets ICVD-ACOEDL apart is the use of Ant Colony Optimisation (ACO) to select an optimal feature subset, which in turn improves the performance of the ensuing deep learning enhanced neural network (DLENN) classifier. The model tunes the hyperparameters of the DLENN for CVD classification using Bayesian optimization. Comprehensive evaluations on benchmark medical datasets show that ICVD-ACOEDL outperforms existing techniques, indicating that it could have a significant impact on CVD diagnosis. By combining ACO for feature selection, min-max scaling for data pre-processing, and Bayesian optimization for hyperparameter tuning, the model offers a workable way to increase CVD classification efficiency and accuracy in real-world medical settings.
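The min-max scaling step mentioned in the abstract can be sketched as follows. This is a generic illustration, not the ICVD-ACOEDL code: the target range [0, 1] is the common convention and is assumed here, the handling of constant columns is a design choice of this sketch, and the sample values are invented.

```python
def min_max_scale(column):
    """Rescale one feature column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros (assumption).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def scale_columns(rows):
    """Apply min-max scaling to every column of a row-major table."""
    columns = list(zip(*rows))
    scaled = [min_max_scale(col) for col in columns]
    return [list(row) for row in zip(*scaled)]


if __name__ == "__main__":
    # Hypothetical records: [systolic pressure, cholesterol]
    records = [[120, 180], [140, 240], [160, 300]]
    print(scale_columns(records))
```

In practice the minimum and maximum would be computed on the training split only and reused for validation data, so that no information leaks from the evaluation set.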
- MeSH
- Bayes Theorem MeSH
- Deep Learning * MeSH
- Diagnosis, Computer-Assisted methods MeSH
- Formicidae MeSH
- Cardiovascular Diseases * diagnosis MeSH
- Humans MeSH
- Neural Networks, Computer * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Objective. The current practices of designing neural networks rely heavily on subjective judgment and heuristic steps, often dictated by the level of expertise possessed by architecture designers. To alleviate these challenges and streamline the design process, we propose an automatic method, a novel approach to enhance the optimization of neural network architectures for processing intracranial electroencephalogram (iEEG) data. Approach. We present a genetic algorithm, which optimizes neural network architecture and signal pre-processing parameters for iEEG classification. Main results. Our method improved the macro F1 score of the state-of-the-art model in two independent datasets, from St. Anne's University Hospital (Brno, Czech Republic) and Mayo Clinic (Rochester, MN, USA), from 0.9076 to 0.9673 and from 0.9222 to 0.9400 respectively. Significance. By incorporating principles of evolutionary optimization, our approach reduces the reliance on human intuition and empirical guesswork in architecture design, thus promoting more efficient and effective neural network models. The proposed method achieved significantly improved results when compared to the state-of-the-art benchmark model (McNemar's test, p ≪ 0.01). The results indicate that neural network architectures designed through machine-based optimization outperform those crafted using the subjective heuristic approach of a human expert. Furthermore, we show that well-designed data preprocessing significantly affects the models' performance.
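A genetic algorithm that searches jointly over architecture and pre-processing settings could look roughly like the sketch below. It is a generic illustration under stated assumptions, not the published method: the search space, the parameter names (`n_layers`, `kernel_size`, `lowpass_hz`), the operators, and `toy_fitness` are all hypothetical. In a real setting the fitness would be the validation macro F1 of a network trained with the candidate configuration.

```python
import random

# Hypothetical search space; the abstract does not list the actual parameters.
SEARCH_SPACE = {
    "n_layers": [1, 2, 3, 4],
    "kernel_size": [3, 5, 7, 9],
    "lowpass_hz": [100, 300, 600, 900],
}


def random_genome(rng):
    """Draw one random configuration from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in SEARCH_SPACE}


def mutate(genome, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        name: (rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value)
        for name, value in genome.items()
    }


def evolve(fitness, pop_size=12, generations=20, seed=0):
    """Elitist GA: the better half survives, the rest is bred from it."""
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        pop = parents + children
    return max(pop, key=fitness)


def toy_fitness(genome):
    """Stand-in for validation macro F1; peaks at an arbitrary configuration."""
    return (
        -abs(genome["n_layers"] - 3)
        - abs(genome["kernel_size"] - 5) / 2
        - abs(genome["lowpass_hz"] - 600) / 300
    )


if __name__ == "__main__":
    print(evolve(toy_fitness))
```

Keeping the best half unchanged (elitism) means the best fitness in the population never decreases between generations, which matters when each evaluation is an expensive training run.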
- MeSH
- Electroencephalography methods MeSH
- Electrocorticography * MeSH
- Humans MeSH
- Neural Networks, Computer * MeSH
- Signal Processing, Computer-Assisted MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Several molecular clonality assays have been developed to assess canine B cell proliferations. These assays were based on different sequence data, utilized different assay designs and employed different testing strategies. This has resulted in a complex body of literature and complicates evidence-based selection of primer sets. In addition, further refinement of primer sets is difficult because it is unknown how well current primer sets cover the expressed sequence repertoire. The objectives of this study were 1) to provide an overview of published IGH clonality assays that highlights key differences in assay design and testing strategy and 2) to propose a novel method for optimizing primer sets that leverages large-scale sequencing data. A review of previously published assays highlighted confounding factors that hamper a direct comparison of performance metrics between studies. These findings illustrate the need for a multi-institutional effort to harmonize veterinary clonality testing. A novel in silico analysis of primer sequences using a large dataset of expressed sequences identified shortfalls of existing primer sets and was used to guide primer optimization. Three optimized primer sets were tested and yielded qualitative sensitivity values between 80% and 90%. The quantitative sensitivity ranged from 1% to over 50% and was dependent on the size of the neoplastic clone and the sample DNA used. These findings illustrate that high-throughput sequencing data can be a useful tool to guide primer design and optimization. This strategy could be applied to other antigen receptor loci or species to further improve veterinary clonality assays.
- MeSH
- B-Lymphocytes cytology MeSH
- Clone Cells * MeSH
- DNA Primers * MeSH
- Dogs genetics immunology MeSH
- Immunoglobulin Heavy Chains genetics MeSH
- Animals MeSH
- Check Tag
- Dogs genetics immunology MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
The performance of diagnostic tests in intervention trials of Helicobacter pylori (H. pylori) eradication is crucial, since even minor inaccuracies can have a major impact. To determine the cut-off point for the 13C-urea breath test (13C-UBT) and to assess whether it can be further optimized by serologic testing, mathematical modeling, histopathology and serologic validation were applied. A finite mixture model (FMM) was developed in 21,857 subjects, and an independent validation by modified Giemsa staining was conducted in 300 selected subjects. H. pylori status was determined using the recomLine H. pylori assay in 2,113 subjects with borderline 13C-UBT results. A delta over baseline (DOB) value of 3.8 was the optimal cut-off point according to the FMM in the modelling dataset, and was further validated as the most appropriate cut-off point by Giemsa staining (sensitivity = 94.53%, specificity = 92.93%). In the borderline population, 1,468 subjects (69.5%) were determined to be H. pylori positive by recomLine. A significant correlation between the number of positive H. pylori serum responses and the DOB value was found (rs = 0.217, P < 0.001). A mathematical approach such as FMM might be an alternative measure for optimizing the cut-off point for 13C-UBT in community-based studies, and a second method to determine H. pylori status for subjects with borderline 13C-UBT values is necessary and recommended.
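How a cut-off is checked against a reference standard can be shown with a short sketch. Only the cut-off value of 3.8 comes from the abstract. The DOB readings and reference results below are invented, the function name is hypothetical, and treating a value equal to the cut-off as positive is an assumption.

```python
def sensitivity_specificity(dob_values, reference_positive, cutoff=3.8):
    """Compare cut-off based calls with a reference standard (e.g. histology)."""
    tp = fp = tn = fn = 0
    for dob, truth in zip(dob_values, reference_positive):
        predicted = dob >= cutoff  # assumption: values at the cut-off count as positive
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented example data, not taken from the study.
    dob = [0.5, 2.0, 3.9, 10.0, 1.0, 25.0]
    reference = [False, False, False, True, True, True]
    print(sensitivity_specificity(dob, reference))
```

The same arithmetic confirms the proportion reported for the borderline group: 1,468 of 2,113 subjects is about 69.5%.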
- MeSH
- Algorithms * MeSH
- Breath Tests methods MeSH
- Molecular Diagnostic Techniques standards MeSH
- Adult MeSH
- Helicobacter Infections diagnosis MeSH
- Carbon Isotopes MeSH
- Clinical Trials as Topic MeSH
- Middle Aged MeSH
- Humans MeSH
- Limit of Detection MeSH
- Urea MeSH
- Stomach Neoplasms diagnosis microbiology MeSH
- Models, Theoretical MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Validation Study MeSH
PURPOSE: The aims of this work are (1) to explore deep learning (DL) architectures, spectroscopic input types, and learning designs toward optimal quantification in MR spectroscopy of simulated pathological spectra; and (2) to demonstrate accuracy and precision of DL predictions in view of inherent bias toward the training distribution. METHODS: Simulated 1D spectra and 2D spectrograms that mimic an extensive range of pathological in vivo conditions are used to train and test 24 different DL architectures. Active learning through altered training and testing data distributions is probed to optimize quantification performance. Ensembles of networks are explored to improve DL robustness and reduce the variance of estimates. A set of scores compares performances of DL predictions and traditional model fitting (MF). RESULTS: Ensembles of heterogeneous networks that combine 1D frequency-domain and 2D time-frequency domain spectrograms as input perform best. Dataset augmentation with active learning can improve performance, but gains are limited. MF is more accurate, although DL appears to be more precise at low SNR. However, this overall improved precision originates from a strong bias for cases with high uncertainty toward the dataset the network has been trained with, tending toward its average value. CONCLUSION: MF mostly performs better compared to the faster DL approach. Potential intrinsic biases on training sets are dangerous in a clinical context that requires the algorithm to be unbiased to outliers (i.e., pathological data). Active learning and ensembles of networks are good strategies to improve prediction performance. However, data quality (sufficient SNR) has proven to be a bottleneck for adequate unbiased performance, as in the case of MF.
INTRODUCTION: In 2020, Elekta Instrument AB, Stockholm, released a new dose optimizer for Leksell GammaPlan, which includes the possibility of inverse planning. This study aimed to compare the new software with the previous manual version of treatment planning for stereotactic radiosurgery and evaluate its performance. MATERIALS AND METHODS: Four types of diagnoses - vestibular schwannomas, pituitary adenomas, meningiomas, and single brain metastasis - along with 80 clinically approved challenging cases, were selected for testing the new software. Key parameters, including coverage, selectivity, target volume, and doses to critical structures, were collected and statistically analysed using a t test. These parameters were compared based on the Leksell Gamma Knife (LGK) Society standardization document for stereotactic radiosurgery, both for each diagnosis and for the entire dataset. RESULTS: The new software showed a clear advantage, particularly in sparing critical structures while maintaining or improving treatment plan conformity. Doses to critical structures such as the optic nerve, brainstem, cochlea, and pituitary gland decreased by an average of 13% (0.76 Gy), 7% (0.52 Gy), 7% (0.2 Gy), and 14% (1.04 Gy), respectively, reducing toxicity. Other plan parameters also showed significant improvements, except for the gradient index. Selectivity improved by 11% (0.03), the Shaw Conformity Index improved by 10% (0.1), and coverage improved by 0.01. Additionally, treatment time was reduced by 10% enhancing patient comfort. CONCLUSION: Overall, LGK Lightning is faster and produces treatment plans with superior parameters compared to manual planning.
- MeSH
- Radiotherapy Dosage MeSH
- Humans MeSH
- Meningeal Neoplasms radiotherapy MeSH
- Meningioma radiotherapy surgery MeSH
- Pituitary Neoplasms radiotherapy surgery MeSH
- Brain Neoplasms * radiotherapy secondary surgery MeSH
- Radiotherapy Planning, Computer-Assisted * methods MeSH
- Radiosurgery * methods MeSH
- Software * MeSH
- Neuroma, Acoustic radiotherapy surgery MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Comparative Study MeSH
The accuracy of human leukocyte antigen (HLA)-matching algorithms is a prerequisite for the correct and efficient identification of optimal unrelated donors for patients requiring hematopoietic stem cell transplantation. The goal of this World Marrow Donor Association study was to validate established matching algorithms from different international donor registries by challenging them with simulated input data and subsequently comparing the output. This experiment addressed three specific aspects of HLA matching using different data sets for tasks of increasing complexity. The first two tasks targeted the traditional matching approach, which identifies discrepancies between patient and donor HLA genotypes by counting antigen and allele differences. Contemporary matching procedures, which predict the probability of HLA identity using haplotype frequencies, were addressed by the third task. In each task, the identified disparities between the results of the participating computer programs were analyzed, classified and quantified. This study led to a deep understanding of the participating algorithms and finally produced virtually identical results. The unresolved discrepancies total less than 1%, 4% and 2% for the three tasks and are mostly due to individual decisions in the design of the programs. Based on these findings, reference results for the three input data sets were compiled that can be used to validate future matching algorithms and thus improve the quality of the global donor search process.
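The traditional approach of counting allele differences per locus can be sketched in a few lines. This is a deliberately simplified illustration and not any registry's algorithm: it assumes unambiguous, fully typed genotypes with exactly two alleles per locus, ignores mismatch direction, null alleles and typing ambiguity, and the function names and sample genotypes are hypothetical.

```python
def locus_mismatches(patient, donor):
    """Count mismatches at one locus, taking the better of the two pairings."""
    (p1, p2), (d1, d2) = patient, donor
    straight = (p1 != d1) + (p2 != d2)
    crossed = (p1 != d2) + (p2 != d1)
    return min(straight, crossed)


def count_mismatches(patient, donor):
    """Sum allele mismatches over all loci typed for the patient."""
    return sum(locus_mismatches(patient[locus], donor[locus]) for locus in patient)


if __name__ == "__main__":
    # Illustrative genotypes only.
    patient = {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*08:01")}
    donor = {"A": ("A*02:01", "A*01:01"), "B": ("B*07:02", "B*44:02")}
    print(count_mismatches(patient, donor))
```

Taking the minimum over both pairings makes the count independent of the order in which the two alleles of a locus are reported. Choices of this kind are an example of the individual design decisions that can make programs disagree.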
- MeSH
- Alleles * MeSH
- Algorithms * MeSH
- Datasets as Topic MeSH
- Gene Frequency MeSH
- Haplotypes MeSH
- HLA Antigens classification genetics immunology MeSH
- Transplantation, Homologous MeSH
- Humans MeSH
- Unrelated Donors MeSH
- Transplant Recipients MeSH
- Registries * MeSH
- Histocompatibility Testing MeSH
- Hematopoietic Stem Cell Transplantation * MeSH
- Cord Blood Stem Cell Transplantation * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Comparative Study MeSH
Siderophores represent important microbial virulence factors and infection biomarkers. Their monitoring in fermentation broths, bodily fluids, and tissues should be reproducible. Similar isolation, characterization, and quantitation studies can often have conflicting results, and without proper documentation of sample collection, data processing, and analysis methods, it is difficult to reexamine the data and reconcile these differences. In this Springer Nature Protocol, we present the procedure optimized for ferricrocin/triacetylfusarinine C extraction from biological material as well as for tissue fixation and cryosectioning for optical microscopy and for both elemental and molecular mass spectrometry imaging. Special attention is paid to siderophore data mining from conventional and product ion mass spectra, liquid chromatography, and mass spectrometry imaging datasets, performed here by our free software called CycloBranch.
- MeSH
- Aspergillus fumigatus metabolism MeSH
- Biomarkers analysis MeSH
- Chromatography, Liquid methods MeSH
- Data Mining methods MeSH
- Datasets as Topic MeSH
- Ferrichrome analogs & derivatives isolation & purification metabolism MeSH
- Tissue Fixation methods MeSH
- Mass Spectrometry methods MeSH
- Invasive Pulmonary Aspergillosis diagnosis microbiology MeSH
- Cryoultramicrotomy methods MeSH
- Rats MeSH
- Hydroxamic Acids isolation & purification metabolism MeSH
- Humans MeSH
- Disease Models, Animal MeSH
- Siderophores isolation & purification metabolism MeSH
- Software MeSH
- Ferric Compounds isolation & purification metabolism MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH