Query: data mining
BACKGROUND: Online weight loss information is commonly sought by internet users, and it may impact their health decisions and behaviors. Previous studies examined a limited number of Google search queries and relied on manual approaches to retrieve online weight loss websites. OBJECTIVE: This study aimed to identify and describe the characteristics of the top weight loss websites on Google. METHODS: This study gathered 432 Google search queries collected from Google autocomplete suggestions, "People Also Ask" featured questions, and Google Trends data. A data-mining software tool was developed to retrieve the search results automatically, setting English and the United States as the default criteria for language and location, respectively. Domain classification and evaluation technologies were used to categorize the websites according to their content and determine their risk of cyberattack. In addition, the top 5 most frequent websites in nonadvertising (ie, nonsponsored) search results were inspected for quality. RESULTS: The results revealed that the top 5 nonadvertising websites were healthline.com, webmd.com, verywellfit.com, mayoclinic.org, and womenshealthmag.com. All provided accuracy statements and author credentials. The domain categorization taxonomy yielded a total of 101 unique categories. After grouping the websites that appeared fewer than 5 times, the most frequent categories involved "Health" (104/623, 16.69%), "Personal Pages and Blogs" (91/623, 14.61%), "Nutrition and Diet" (48/623, 7.7%), and "Exercise" (34/623, 5.46%). The risk of being a victim of a cyberattack was low. CONCLUSIONS: The findings suggested that while quality information is accessible, users may still encounter less reliable content among various online resources. Therefore, better tools and methods are needed to guide users toward trustworthy weight loss information.
- Keywords
- Google, consumer health informatics, cyberattack risk, data mining, digital health, information seeking, internet search, online health information, website analysis, weight loss
- MeSH
- data mining * methods MeSH
- weight loss * MeSH
- internet * MeSH
- humans MeSH
- search engine MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- Geographic names
- United States MeSH
A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks, and 4) pitfalls and challenges to maximize model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency while detecting low-frequency events and evaluating statistical significance even in a multidimensional space, and it presents the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine learning based drug response prediction in precision medicine.
- Keywords
- Association Rule Mining, Data mining, Drug Response Prediction, Machine Learning, Precision Medicine
- MeSH
- data mining * MeSH
- humans MeSH
- neoplasms drug therapy MeSH
- computer simulation MeSH
- machine learning * MeSH
- treatment outcome MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- review MeSH
- Research Support, N.I.H., Extramural MeSH
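The record above describes a screening process based on Association Rule Mining that reports its findings as interpretable rules. The sketch below is not the authors' pipeline; it only illustrates the standard support, confidence, and lift measures for a single rule of the form "alteration implies response". The sample data, the item names (`GENE_A_mut`, `sensitive`, and so on), and the function names are all hypothetical.

```python
# Hypothetical toy data: each sample is a set of items
# (gene alterations plus an observed drug-response label).
samples = [
    {"GENE_A_mut", "GENE_B_mut", "sensitive"},
    {"GENE_A_mut", "sensitive"},
    {"GENE_B_mut", "resistant"},
    {"GENE_A_mut", "GENE_B_mut", "sensitive"},
    {"resistant"},
    {"GENE_B_mut", "resistant"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_ante = support(antecedent, transactions)
    s_cons = support(consequent, transactions)
    s_both = support(set(antecedent) | set(consequent), transactions)
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Evaluate the illustrative rule {GENE_A_mut} -> {sensitive}.
metrics = rule_metrics({"GENE_A_mut"}, {"sensitive"}, samples)
print(metrics)
```

A lift above 1 indicates that the antecedent and the consequent co-occur more often than expected if they were independent. A real screen would also need a significance test and a correction for multiple comparisons, which this sketch omits.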
Water molecules represent an integral part of proteins and a key determinant of protein structure, dynamics and function. WatAA is a newly developed, web-based atlas of amino-acid hydration in proteins. The atlas provides information about the ordered first hydration shell of the most populated amino-acid conformers in proteins. The data presented in the atlas are drawn from two sources: experimental data and ab initio quantum-mechanics calculations. The experimental part is based on a data-mining study of a large set of high-resolution protein crystal structures. The crystal-derived data include 3D maps of water distribution around amino-acids and probability of occurrence of each of the identified hydration sites. The quantum mechanics calculations validate and extend this primary description by optimizing the water position for each hydration site, by providing hydrogen atom positions and by quantifying the interaction energy that stabilizes the water molecule at the particular hydration site position. The calculations show that the majority of experimentally derived hydration sites are positioned near local energy minima for water, and the calculated interaction energies help to assess the preference of water for the individual hydration sites. We propose that the atlas can be used to validate water placement in electron density maps in crystallographic refinement, to locate water molecules mediating protein-ligand interactions in drug design, and to prepare and evaluate molecular dynamics simulations. WatAA: Atlas of Protein Hydration is freely available without login at .
- MeSH
- data mining MeSH
- internet MeSH
- quantum theory MeSH
- proteins chemistry MeSH
- molecular dynamics simulation MeSH
- user-computer interface * MeSH
- water chemistry MeSH
- Publication type
- journal article MeSH
- Substance names
- proteins MeSH
- water MeSH
As the amount of genome information increases rapidly, there is a correspondingly greater need for methods that provide accurate and automated annotation of gene function. For example, many high-throughput technologies (e.g., next-generation sequencing) are being used today to generate lists of genes associated with specific conditions. However, their functional interpretation remains a challenge, and many tools exist that try to characterize the function of gene lists. Such systems typically rely on enrichment analysis and aim to give quick insight into the underlying biology by presenting it in the form of a summary report. While the load of annotation may be alleviated by such computational approaches, the main challenge in modern annotation remains to develop a systems form of analysis in which a pipeline can effectively analyze gene lists quickly and identify aggregated annotations through computerized resources. In this article we survey some of the many such tools and methods that have been developed to automatically interpret the biological functions underlying gene lists. We overview current functional annotation aspects from the perspective of their epistemology (i.e., the underlying theories used to organize information about gene function into a body of verified and documented knowledge) and find that most of the currently used functional annotation methods fall broadly into one of two categories: they are based either on 'known' formally structured ontology annotations created by 'experts' (e.g., the GO terms used to describe the function of Entrez Gene entries) or, perhaps more adventurously, on annotations inferred from literature (e.g., many text-mining methods use computer-aided reasoning to acquire knowledge represented in natural languages). Overall, however, deriving detailed and accurate insight from such gene lists remains a challenging task, and improved methods are called for.
In particular, future methods need to (1) provide more holistic insight into the underlying molecular systems; (2) provide better follow-up experimental testing and treatment options; and (3) better manage gene lists derived from organisms that are not well-studied. We discuss some promising approaches that may help achieve these advances, especially the use of extended dictionaries of biomedical concepts and molecular mechanisms, as well as greater use of annotation benchmarks.
- Keywords
- Benchmarks, Functional annotation, GO term enrichment, Keyword enhancement, Systems biology, Text mining
- MeSH
- data mining methods trends MeSH
- databases, genetic * trends MeSH
- gene ontology * trends MeSH
- humans MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- review MeSH
To analyze and improve dental age estimation in children and adolescents for forensic purposes, 22 age estimation methods were compared on a sample of 976 orthopantomographs (662 males, 314 females) of healthy Czech children and adolescents aged between 2.7 and 20.5 years. All methods are compared in terms of accuracy and complexity and are based on various data mining methods or on simple mathematical operations. The winning method is presented in detail. The comparison showed that only three methods provide the best accuracy while remaining user-friendly. These methods were used to build a tabular multiple linear regression model, an M5P tree model and a support vector machine model with a first-order polynomial kernel. All of them have a mean absolute error (MAE) under 0.7 years for both males and females. The other well-performing data mining methods (RBF neural network, K-nearest neighbors, Kstar, etc.) have similar or slightly better accuracy, but they are not user-friendly because they require computing equipment and implementation as a computer program. The traditional model based on age averages provides the lowest estimation accuracy (MAE under 0.96 years). The various teeth were found to differ in their relevance for age estimation. This finding also explains the lowest accuracy of the traditional averages-based model. In this paper, a technique for replacing missing data in cases with missing teeth is presented in detail, as is the constrained tabular multiple regression model. We also provide free age prediction software based on this winning model.
- Keywords
- Age estimation, Data mining, Model, Population-specific standards
- MeSH
- data mining MeSH
- dentition, permanent * MeSH
- child MeSH
- humans MeSH
- linear models MeSH
- adolescent MeSH
- young adult MeSH
- neural networks, computer MeSH
- child, preschool MeSH
- radiography, panoramic MeSH
- decision trees MeSH
- software MeSH
- support vector machine MeSH
- age determination by teeth methods MeSH
- tooth growth & development MeSH
- Check Tag
- child MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- male MeSH
- child, preschool MeSH
- female MeSH
- Publication type
- journal article MeSH
- comparative study MeSH
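The record above ranks the age estimation methods by mean absolute error (MAE). As a minimal sketch of how that metric is computed, the example below uses a handful of hypothetical chronological and estimated ages in years; the values and the function name are illustrative only and are not taken from the study.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all individuals, in years."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical ages in years (not the study's data).
chronological = [6.0, 9.5, 12.0, 15.5]
estimated = [6.5, 9.0, 13.0, 15.5]

mae = mean_absolute_error(chronological, estimated)
print(f"MAE = {mae:.2f} years")
```

Because MAE averages absolute deviations, it is expressed in the same unit as the ages and is less sensitive to a few large errors than a squared-error measure.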
Lipidomics and metabolomics communities comprise various informatics tools; however, software programs handling multimodal mass spectrometry (MS) data with structural annotations guided by the Lipidomics Standards Initiative are limited. Here, we provide MS-DIAL 5 for in-depth lipidome structural elucidation through electron-activated dissociation (EAD)-based tandem MS and determining their molecular localization through MS imaging (MSI) data using a species/tissue-specific lipidome database containing the predicted collision-cross section values. With the optimized EAD settings using 14 eV kinetic energy, the program correctly delineated lipid structures for 96.4% of authentic standards, among which 78.0% had the sn-, OH-, and/or C=C positions correctly assigned at concentrations exceeding 1 μM. We showcased our workflow by annotating the sn- and double-bond positions of eye-specific phosphatidylcholines containing very-long-chain polyunsaturated fatty acids (VLC-PUFAs), characterized as PC n-3-VLC-PUFA/FA. Using MSI data from the eye and n-3-VLC-PUFA-supplemented HeLa cells, we identified glycerol 3-phosphate acyltransferase as an enzyme candidate responsible for incorporating n-3 VLC-PUFAs into the sn1 position of phospholipids in mammalian cells, which was confirmed using EAD-MS/MS and recombinant proteins in a cell-free system. Therefore, the MS-DIAL 5 environment, combined with optimized MS data acquisition methods, facilitates a better understanding of lipid structures and their localization, offering insights into lipid biology.
- MeSH
- data mining * methods MeSH
- phosphatidylcholines metabolism chemistry MeSH
- HeLa cells MeSH
- mass spectrometry methods MeSH
- humans MeSH
- lipidomics * methods MeSH
- lipids chemistry analysis MeSH
- metabolomics methods MeSH
- fatty acids, unsaturated metabolism chemistry MeSH
- software MeSH
- tandem mass spectrometry methods MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- Substance names
- phosphatidylcholines MeSH
- lipids MeSH
- fatty acids, unsaturated MeSH
Dental development is frequently used to estimate age in many anthropological specializations. The aim of this study was to extract an accurate predictive age system for the Czech population and to discover any differences in the predictive ability of various tooth types and their ontogenetic stability during infancy and adolescence. A cross-sectional panoramic X-ray study was based on developmental stage assessment of mandibular teeth (Moorrees et al. 1963) using 1393 individuals aged from 3 to 17 years. Data mining methods were used for dental age estimation. These are based on nonlinear relationships between the predicted age and data sets. Compared with other tested predictive models, the GAME method predicted age with the highest accuracy. Age-interval estimations between the 10th and 90th percentiles ranged from -1.06 to +1.01 years in girls and from -1.13 to +1.20 years in boys. Accuracy was expressed by RMS error, which is the average deviation between estimated and chronological age. The predictive value of individual teeth changed during the investigated period from 3 to 17 years. When we evaluated the whole period, the second molars exhibited the best predictive ability. When evaluating partial age periods, we found that the accuracy of biological age prediction declines with increasing age (from 0.52 to 1.20 years in girls and from 0.62 to 1.22 years in boys) and that the predictive importance of tooth types changes, depending on variability and the number of developmental stages in the age interval. GAME is a promising tool for age-interval estimation studies because it can provide reliable predictive models.
- MeSH
- data mining methods MeSH
- child MeSH
- humans MeSH
- adolescent MeSH
- child, preschool MeSH
- cross-sectional studies MeSH
- regression analysis MeSH
- radiography, panoramic MeSH
- models, statistical MeSH
- support vector machine MeSH
- age determination by teeth methods MeSH
- tooth anatomy & histology diagnostic imaging MeSH
- Check Tag
- child MeSH
- humans MeSH
- adolescent MeSH
- male MeSH
- child, preschool MeSH
- female MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
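The record above reports accuracy as an RMS error and gives age-interval estimates between the 10th and 90th percentiles of the estimation error. The sketch below shows one way such figures could be derived from residuals (estimated minus chronological age). It is an illustration under assumed data, not the study's procedure: the ages are hypothetical, and the percentile method is the default of Python's `statistics.quantiles`, which may differ from the one the authors used.

```python
import math
import statistics


def residuals(chronological, estimated):
    """Signed errors in years: estimated age minus chronological age."""
    return [e - c for c, e in zip(chronological, estimated)]


def rmse(res):
    """Root mean square of the residuals, in years."""
    return math.sqrt(sum(r * r for r in res) / len(res))


def percentile_interval(res):
    """10th and 90th percentiles of the residuals (first and last decile cuts)."""
    deciles = statistics.quantiles(res, n=10)  # returns 9 cut points
    return deciles[0], deciles[-1]


# Hypothetical ages in years (not the study's data).
chronological = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 5.0, 9.0, 13.0]
estimated = [5.0, 5.0, 9.0, 9.0, 13.0, 13.0, 17.0, 4.0, 10.0, 12.0]

res = residuals(chronological, estimated)
rmse_value = rmse(res)
low, high = percentile_interval(res)
print(f"RMSE = {rmse_value:.2f} years, 10th to 90th percentile: {low:+.2f} to {high:+.2f}")
```

Note that RMS error weights large deviations more heavily than a plain average of absolute deviations, so the two measures generally differ for the same residuals.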
Recently published studies showed that age assessment methods are population specific. The authors analyse the senescence changes in the pubic symphysis and the sacro-pelvic surface of the pelvic bone using data mining methods. The multi-ethnic data set consists of 956 adult individuals ranging from 19 to 100 years of age, derived from 9 different populations with known age and sex. The results show that accurate and reliable age assessment is possible for three age classes (less than 30, 30-60, 60 and more). The study confirms that population specificity of the methods exists and that the variable "sex" is not important in age classification.
- MeSH
- algorithms MeSH
- data mining methods MeSH
- adult MeSH
- ethnicity * MeSH
- middle aged MeSH
- humans MeSH
- ilium anatomy & histology MeSH
- racial groups MeSH
- ROC curve MeSH
- aged, 80 and over MeSH
- aged MeSH
- forensic anthropology MeSH
- pubic symphysis anatomy & histology MeSH
- age determination by skeleton methods MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- male MeSH
- aged, 80 and over MeSH
- aged MeSH
- female MeSH
- Publication type
- journal article MeSH
Age-at-death estimation of adult skeletal remains is a key part of biological profile estimation, yet it remains problematic for several reasons. One of them may be the subjective nature of the evaluation of age-related changes, or the fact that the human eye is unable to detect all the relevant surface changes. We have several aims: (1) to validate already existing computer models for age estimation; (2) to propose our own expert system based on computational approaches to eliminate the factor of subjectivity and to use the full potential of surface changes on an articulation area; and (3) to determine the age range over which the pubic symphysis is useful for age estimation. A sample of 483 3D representations of the pubic symphyseal surfaces from the ossa coxae of adult individuals was used, coming from four European (two from Portugal, one each from Switzerland and Greece) and one Asian (Thailand) identified skeletal collections. A validation of published algorithms showed very high error in our dataset: the Mean Absolute Error (MAE) ranged from 16.2 to 25.1 years. Two completely new approaches were proposed in this paper: SASS (Simple Automated Symphyseal Surface-based) and AANNESS (Advanced Automated Neural Network-grounded Extended Symphyseal Surface-based), whose MAE values are 11.7 and 10.6 years, respectively. Lastly, it was demonstrated that our models could estimate the age-at-death using the pubic symphysis over the entire adult age range. The proposed models offer objective age estimates with low estimation error (compared to traditional visual methods).
- MeSH
- data mining MeSH
- adult MeSH
- humans MeSH
- forensic anthropology methods MeSH
- pubic symphysis * MeSH
- age determination by skeleton methods MeSH
- imaging, three-dimensional MeSH
- Check Tag
- adult MeSH
- humans MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
INTRODUCTION: Identification of new prognostic factors can help in designing future clinical studies. In the case of advanced non-small cell lung cancer, there might be good candidates: the tumor markers CYFRA 21-1, CEA or NSE [1-8]. The relationship between their expression and prognosis can be evaluated with the data mining technique of recursive partitioning and amalgamation. PATIENTS AND METHODS: We analyzed retrospective data of 162 patients of Oncology clinics in Trnava. All of these patients were admitted between 2008 and 2012 for the administration of first-line chemotherapy according to current recommendations. We evaluated the impact on survival of known pretreatment prognostic markers (performance status, weight loss, smoking, age, sex, stage, histologic subtype, comorbidity and the oncomarkers CYFRA 21-1, CEA or NSE), as well as combinations of these factors. RESULTS: Our analyses showed that there are three subgroups of patients with good, intermediate and unfavorable prognosis. Oncomarkers played an important role in the formation of a subgroup of 49 patients with good prognosis, which included patients with no pretreatment weight loss and low levels of CEA ( 4.1 ng/ml) or NSE ( 11.1 ng/ml). In this subgroup, the median survival time was at least 16 months (not achieved) and the difference in survival compared to the rest of the group was highly statistically significant (risk ratio 5.21, 95% CI 1.41-19.28; p < 0.0001). CONCLUSION: We showed the prognostic significance of low levels of the NSE and CEA oncomarkers in the group of patients with no pretreatment weight loss. Recursive partitioning and amalgamation is a useful data mining method, but the generated hypothesis needs to be confirmed by a further clinical study designed for this purpose.
- MeSH
- antigens, neoplasm metabolism MeSH
- data mining methods MeSH
- phosphopyruvate hydratase metabolism MeSH
- carcinoembryonic antigen metabolism MeSH
- keratin-19 metabolism MeSH
- humans MeSH
- biomarkers, tumor metabolism MeSH
- lung neoplasms drug therapy metabolism mortality pathology MeSH
- carcinoma, non-small-cell lung drug therapy metabolism mortality pathology MeSH
- prognosis MeSH
- retrospective studies MeSH
- weight lifting MeSH
- Check Tag
- humans MeSH
- Publication type
- English abstract MeSH
- journal article MeSH
- Substance names
- antigen CYFRA21.1 MeSH Browser
- antigens, neoplasm MeSH
- phosphopyruvate hydratase MeSH
- carcinoembryonic antigen MeSH
- keratin-19 MeSH
- biomarkers, tumor MeSH