
Query: imputation

86 records in PubMed

Article

Challenge of missing data in observational studies: investigating cross-sectional imputation methods for assessing disease activity in axial spondyloarthritis

... OBJECTIVES: We aimed to compare various methods for imputing disease activity in longitudinally collected ...

Authors
Georgiadis, Stylianos: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark; stylianos.georgiadis@regionh.dk
Pons, Marion: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark
Rasmussen, Simon: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark
Hetland, Merete Lund: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark; Department of Clinical Medicine, University of Copenhagen, Kobenhavn, Denmark
Linde, Louise: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark
di Giuseppe, Daniela: Clinical Epidemiology Division, Department of Medicine Solna, Karolinska Institutet, Solna, Sweden
Michelsen, Brigitte: Copenhagen Center for Arthritis Research (COPECARE), Center for Rheumatology and Spine Diseases, Center of Head and Orthopaedics, Rigshospitalet, Glostrup, Denmark; Center for Treatment of Rheumatic and Musculoskeletal Diseases (REMEDY), Diakonhjemmet Hospital, Oslo, Norway; Research Unit, Sørlandet Hospital, Kristiansand, Norway
Wallman, Johan K: Department of Clinical Sciences Lund, Rheumatology, Skåne University Hospital, Lund University, Lund, Sweden
Olofsson, Tor: Department of Clinical Sciences Lund, Rheumatology, Skåne University Hospital, Lund University, Lund, Sweden
Zavada, Jakub: Institute of Rheumatology, Prague, Czech Republic; Department of Rheumatology, First Faculty of Medicine, Charles University, Praha, Czech Republic

RMD open. 2025 Feb 20 ; 11 (1) : . [epub] 20250220

RMD Open
ISSN 2056-5933
Source

OBJECTIVES: We aimed to compare various methods for imputing disease activity in longitudinally collected observational data of patients with axial spondyloarthritis (axSpA). METHODS: We conducted a simulation study on data from 8583 axSpA patients from ten European registries. Disease activity was assessed by the Axial Spondyloarthritis Disease Activity Score (ASDAS) and the corresponding low disease activity (LDA; ASDAS<2.1) state at baseline, 6 and 12 months. We focused on cross-sectional methods which impute missing values of an individual at a particular time point based on the available information from other individuals at that time point. We applied nine single and five multiple imputation methods, covering mean, regression and hot deck methods. The performance of each imputation method was evaluated via relative bias and coverage of 95% confidence intervals for the mean ASDAS and the derived proportion of patients in LDA. RESULTS: Hot deck imputation methods outperformed mean and regression methods, particularly when assessing LDA. Multiple imputation procedures provided better coverage than the corresponding single imputation ones. However, none of the evaluated methods produced unbiased estimates with adequate coverage across all time points, with performance for missing baseline data being worse than for missing follow-up data. Predictive mean and weighted predictive mean hot deck imputation procedures consistently provided results with low bias. CONCLUSIONS: This study contributes to the available methods for imputing disease activity in observational research. Hot deck imputation using predictive mean matching exhibited the highest robustness and is thus our suggested approach.
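
The predictive mean matching hot deck recommended in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the single predictor `x`, the donor-pool size `k`, the toy values and the function names `fit_line` and `pmm_impute` are all assumptions made for illustration. The idea shown is that a missing value is replaced by an observed value borrowed from a donor whose regression prediction is close to that of the incomplete record, so every imputed value is a value that was actually measured.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, k=3, rng=None):
    """Fill None entries of y by predictive mean matching.

    The regression is fitted on complete cases only. For each missing
    entry, the k observed records with the closest predicted value form
    the donor pool, and one donor's observed y is copied at random.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    observed = [i for i, v in enumerate(y) if v is not None]
    a, b = fit_line([x[i] for i in observed], [y[i] for i in observed])
    predicted = [a + b * xi for xi in x]
    completed = list(y)
    for i, v in enumerate(y):
        if v is None:
            donors = sorted(observed, key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            completed[i] = y[rng.choice(donors)]
    return completed


# Hypothetical data: x is some fully observed covariate, asdas has gaps.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
asdas = [1.2, 1.9, None, 3.1, None, 4.4]
completed = pmm_impute(x, asdas, k=2)

# Share of patients in low disease activity (ASDAS < 2.1, as in the abstract).
lda_share = sum(v < 2.1 for v in completed) / len(completed)
print(completed, lda_share)
```

Because donors contribute real observations, the imputed scores keep the shape of the observed distribution, which is one plausible reason such methods behave better than mean imputation when a threshold such as LDA is derived afterwards.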

Keywords
Axial Spondyloarthritis, Epidemiology, Interleukin-17, Tumour Necrosis Factor Inhibitors
MeSH
Axial Spondyloarthritis * (epidemiology, diagnosis)
Adult
Humans
Observational Studies as Topic *
Cross-Sectional Studies
Registries
Spondylarthritis * (diagnosis)
Severity of Illness Index
Check Tag
Adult
Humans
Male
Female
Publication type
Journal Article
Geographic names
Europe (epidemiology)

Article

A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods

... was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation ...

Heredity. 2015 Dec ; 115 (6) : 547-55. [epub] 20150701

Heredity (Edinb)
ISSN 1365-2540 | 0018-067X
Source

Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3-40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31-0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04-0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated.

Article

Identification of Novel Associations and Localization of Signals in Idiopathic Inflammatory Myopathies Using Genome-Wide Imputation

... We imputed variants from the ImmunoChip array using a large reference panel to fine-map associations ...

Arthritis & rheumatology (Hoboken, N.J.). 2023 Jun ; 75 (6) : 1021-1027. [epub] 20230320

Arthritis Rheumatol
ISSN 2326-5205 | 2326-5191
Source

OBJECTIVE: The idiopathic inflammatory myopathies (IIMs) are heterogeneous diseases thought to be initiated by immune activation in genetically predisposed individuals. We imputed variants from the ImmunoChip array using a large reference panel to fine-map associations and identify novel associations in IIM. METHODS: We analyzed 2,565 Caucasian IIM patient samples collected through the Myositis Genetics Consortium (MYOGEN) and 10,260 ethnically matched control samples. We imputed 1,648,116 variants from the ImmunoChip array using the Haplotype Reference Consortium panel and conducted association analysis on IIM and clinical and serologic subgroups. RESULTS: The HLA locus was consistently the most significantly associated region. Four non-HLA regions reached genome-wide significance, SDK2 and LINC00924 (both novel) and STAT4 in the whole IIM cohort, with evidence of independent variants in STAT4, and NAB1 in the polymyositis (PM) subgroup. We also found suggestive evidence of association with loci previously associated with other autoimmune rheumatic diseases (TEC and LTBR). We identified more significant associations than those previously reported in IIM for STAT4 and DGKQ in the total cohort, for NAB1 and FAM167A-BLK loci in PM, and for CCR5 in inclusion body myositis. We found enrichment of variants among DNase I hypersensitivity sites and histone marks associated with active transcription within blood cells. CONCLUSION: We found novel and strong associations in IIM and PM and localized signals to single genes and immune cell types.

MeSH
Autoimmune Diseases * (genetics)
Genetic Predisposition to Disease
Haplotypes
Humans
Myositis * (genetics)
Polymyositis *
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Intramural

Article

Data processing pipeline for cardiogenic shock prediction using machine learning

... METHODS: We mainly focus on techniques for the imputation of missing data by generating a pipeline for ...

Frontiers in cardiovascular medicine. 2023 ; 10 () : 1132680. [epub] 20230323

Front Cardiovasc Med
ISSN 2297-055X
Source

INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac care unit patients with acute coronary syndrome. The ability to identify high-risk patients could possibly allow taking pre-emptive measures and thus prevent the development of CS. METHODS: We mainly focus on techniques for the imputation of missing data by generating a pipeline for imputation and comparing the performance of various multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)-based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and showcase the performance of the gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction. RESULTS: We achieved good classification performance thanks to data cleaning and imputation (cross-validated mean area under the curve 0.805) without hyperparameter optimization. CONCLUSION: We believe our pre-processing pipeline would prove helpful also for other classification and regression experiments.
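
As a rough illustration of one of the multivariate imputation algorithms named in the abstract above, the following sketch implements a simple k-nearest-neighbours imputer in plain Python. It is an assumption-laden toy, not the authors' pipeline: the function name `knn_impute`, the distance definition (root mean squared difference over features that both rows have observed) and the example matrix are hypothetical choices.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows.

    Distance between two rows is the root mean squared difference over
    the features observed in both rows. Rows sharing no observed feature
    are ignored. A cell stays None if no usable neighbour exists.
    """
    def dist(a, b):
        shared = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((u - v) ** 2 for u, v in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, val in enumerate(row):
            if val is not None:
                continue
            # Candidate neighbours must have the target column observed.
            cands = [
                (dist(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            cands = [c for c in cands if c[0] < math.inf]
            cands.sort(key=lambda c: c[0])
            nearest = cands[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical measurements: three variables, one missing cell.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [0.9, 1.9, 2.8],
]
imputed = knn_impute(data, k=2)
print(imputed[1])
```

In this toy matrix the second row is closest to the first and fourth rows, so its missing third value becomes the mean of their third values rather than being pulled towards the distant third row.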

Keywords
cardiogenic shock, classification, machine learning, missing data imputation, prediction model, processing pipeline
Publication type
Journal Article

Article

Synergy between imputed genetic pathway and clinical information for predicting recurrence in early stage non-small cell lung cancer

... This study aims to impute and integrate specific types of genomic data with clinical data to improve ...

Journal of biomedical informatics. 2023 Aug ; 144 () : 104424. [epub] 20230621

J Biomed Inform
ISSN 1532-0480 | 1532-0464
Source

OBJECTIVE: Lung cancer exhibits unpredictable recurrence in low-stage tumors and variable responses to different therapeutic interventions. Predicting relapse in early-stage lung cancer can facilitate precision medicine and improve patient survivability. While existing machine learning models rely on clinical data, incorporating genomic information could enhance their efficiency. This study aims to impute and integrate specific types of genomic data with clinical data to improve the accuracy of machine learning models for predicting relapse in early-stage, non-small cell lung cancer patients. METHODS: The study utilized a publicly available TCGA lung cancer cohort and imputed genetic pathway scores into the Spanish Lung Cancer Group (SLCG) data, specifically in 1348 early-stage patients. Initially, tumor recurrence was predicted without imputed pathway scores. Subsequently, the SLCG data were augmented with pathway scores imputed from TCGA. The integrative approach aimed to enhance relapse risk prediction performance. RESULTS: The integrative approach achieved improved relapse risk prediction with the following evaluation metrics: an area under the precision-recall curve (PR-AUC) score of 0.75, an area under the ROC (ROC-AUC) score of 0.80, an F1 score of 0.61, and a Precision of 0.80. The prediction explanation model SHAP (SHapley Additive exPlanations) was employed to explain the machine learning model's predictions. CONCLUSION: We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk while also improving the predictive power by incorporating proxy genomic data not available for specific patients.

Keywords
Classification, Explanation, Imputation, Recurrence, Regression, Supervised
MeSH
Humans
Neoplasm Recurrence, Local (genetics)
Small Cell Lung Carcinoma *
Lung Neoplasms * (diagnosis, genetics)
Carcinoma, Non-Small-Cell Lung * (diagnosis, genetics)
Lung
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't

Article

Imputing missing data of function and disease activity in rheumatoid arthritis registers: what is the best technique?

... OBJECTIVE: To compare several methods of missing data imputation for function (Health Assessment Questionnaire ...

RMD open. 2019 ; 5 (2) : e000994. [epub] 20191017

RMD Open
ISSN 2056-5933
Source

OBJECTIVE: To compare several methods of missing data imputation for function (Health Assessment Questionnaire) and for disease activity (Disease Activity Score-28 and Clinical Disease Activity Index) in rheumatoid arthritis (RA) patients. METHODS: One thousand RA patients from observational cohort studies with complete data for function and disease activity at baseline, 6, 12 and 24 months were selected to conduct a simulation study. Values were deleted at random or following a predicted attrition bias. Three types of imputation were performed: (1) methods imputing forward in time (last observation carried forward; linear forward extrapolation); (2) methods considering data both forward and backward in time (nearest available observation-NAO; linear extrapolation; polynomial extrapolation); and (3) methods using multi-individual models (linear mixed effects cubic regression-LME3; multiple imputation by chained equation-MICE). The performance of each estimation method was assessed using the difference between the mean outcome value, the remission and low disease activity rates after imputation of the missing values and the true value. RESULTS: When imputing missing baseline values, all methods underestimated equally the true value, but LME3 and MICE correctly estimated remission and low disease activity rates. When imputing missing follow-up values at 6, 12, or 24 months, NAO provided the least biassed estimate of the mean disease activity and corresponding remission rate. These results were not affected by the presence of attrition bias. CONCLUSION: When imputing function and disease activity in large registers of active RA patients, researchers can consider the use of a simple method such as NAO for missing follow-up data, and the use of mixed-effects regression or multiple imputation for baseline data.
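
Two of the simpler within-patient methods compared in the abstract above, last observation carried forward and nearest available observation (NAO), can be sketched as follows. The function names, the example visit schedule and values, and the tie-breaking rule (the earlier visit wins when two observed visits are equally distant) are assumptions for illustration. The abstract does not describe how the study handled ties.

```python
def locf(values):
    """Last observation carried forward: look backward in time only."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)  # stays None until a first observation exists
    return out


def nearest_available(times, values):
    """Nearest available observation: look both forward and backward.

    A missing visit takes the value of the observed visit closest in
    time. On ties, min() keeps the earlier visit (an assumed rule).
    """
    observed = [(t, v) for t, v in zip(times, values) if v is not None]
    out = []
    for t, v in zip(times, values):
        if v is None and observed:
            v = min(observed, key=lambda tv: abs(tv[0] - t))[1]
        out.append(v)
    return out


# Hypothetical DAS28 values at baseline, 6, 12 and 24 months.
months = [0, 6, 12, 24]
das28 = [None, 3.2, None, 2.1]

print(locf(das28))                       # baseline cannot be filled
print(nearest_available(months, das28))  # baseline borrows from month 6
```

The example shows why a forward-only method cannot help with missing baseline values, whereas NAO can fill them from later visits.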

Keywords
DAS28, disease activity, epidemiology, outcomes research, rheumatoid arthritis
MeSH
Algorithms
Remission Induction
Data Interpretation, Statistical *
Cohort Studies
Humans
Linear Models
Follow-Up Studies
Computer Simulation
Arthritis, Rheumatoid (epidemiology)
Severity of Illness Index
Research Design (statistics & numerical data)
Bias
Check Tag
Humans
Male
Female
Publication type
Journal Article
Observational Study
Research Support, Non-U.S. Gov't
Comparative Study

Article

Handling of missing component information for common composite score outcomes used in axial spondyloarthritis research when complete-case analysis is unbiased

... METHODS: Individual mean imputation (IMI), the modified formula method (MF), overall mean imputation ...

BMC medical research methodology. 2025 Feb 28 ; 25 (1) : 55. [epub] 20250228

BMC Med Res Methodol
ISSN 1471-2288
Source

BACKGROUND: Observational data on composite scores often comes with missing component information. When a complete-case (CC) analysis of composite scores is unbiased, preferable approaches of dealing with missing component information should also be unbiased and provide a more precise estimate. We assessed the performance of several methods compared to CC analysis in estimating the means of common composite scores used in axial spondyloarthritis research. METHODS: Individual mean imputation (IMI), the modified formula method (MF), overall mean imputation (OMI), and multiple imputation of missing component values (MI) were assessed either analytically or by means of simulations from available data collected across Europe. Their performance in estimating the means of the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), the Bath Ankylosing Spondylitis Functional Index (BASFI), and the Ankylosing Spondylitis Disease Activity Score based on C-reactive protein (ASDAS-CRP) in cases where component information was set missing completely at random was compared to the CC approach based on bias, variance, and coverage. RESULTS: Like the MF method, IMI uses a modified formula for observations with missing components resulting in modified composite scores. In the case of an unbiased CC approach, these two methods yielded representative samples of the distribution arising from a mixture of the original and modified composite scores, which, however, could not be considered the same as the distribution of the original score. The IMI and MF method are, thus, intrinsically biased. OMI provided an unbiased mean but displayed a complex dependence structure among observations that, if not accounted for, resulted in severe coverage issues. MI improved precision compared to CC and gave unbiased means and proper coverage as long as the extent of missingness was not too large. 
CONCLUSIONS: MI of missing component values was the only method found successful in retaining CC's unbiasedness and in providing increased precision for estimating the means of BASDAI, BASFI, and ASDAS-CRP. However, since MI is susceptible to incorrect implementation and its performance may become questionable with increasing missingness, we consider the implementation of an error-free CC approach a valid and valuable option. TRIAL REGISTRATION: Not applicable as study uses data from patient registries.
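
Multiple imputation yields one estimate per imputed dataset, and these have to be combined before a mean and its uncertainty can be reported. The abstract above does not say how pooling was done. The sketch below assumes the widely used Rubin's rules, in which the pooled estimate is the average of the per-dataset estimates and the total variance is T = W + (1 + 1/m) B, with W the average within-imputation variance and B the between-imputation variance. The function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset (m >= 2)
    variances: squared standard errors of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance
    return q_bar, t


# Hypothetical mean scores estimated from m = 3 imputed datasets.
est = [2.0, 2.2, 2.4]
var = [0.04, 0.04, 0.04]
pooled, total_var = pool_rubin(est, var)
print(pooled, total_var)
```

The between-imputation term is what a single imputation leaves out: treating imputed values as if they had been observed ignores B and understates the variance, which is one common explanation for poor confidence-interval coverage.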

Keywords
Axial spondyloarthritis, Complete-case analysis, Composite score, Missing components, Multiple imputation
MeSH
Axial Spondyloarthritis * (diagnosis)
C-Reactive Protein (analysis)
Data Interpretation, Statistical
Humans
Severity of Illness Index
Research Design
Bias
Check Tag
Humans
Publication type
Journal Article
Geographic names
Europe
Substance names
C-Reactive Protein

Article

Three new pancreatic cancer susceptibility signals identified on chromosomes 1q32.1, 5p15.33 and 8q24.21

... To identify new susceptibility variants, we performed imputation based on 1000 Genomes (1000G) Project ...

Oncotarget. 2016 Oct 11 ; 7 (41) : 66328-66343.

ISSN 1949-2553
Source

Genome-wide association studies (GWAS) have identified common pancreatic cancer susceptibility variants at 13 chromosomal loci in individuals of European descent. To identify new susceptibility variants, we performed imputation based on 1000 Genomes (1000G) Project data and association analysis using 5,107 case and 8,845 control subjects from 27 cohort and case-control studies that participated in the PanScan I-III GWAS. This analysis, in combination with a two-staged replication in an additional 6,076 case and 7,555 control subjects from the PANcreatic Disease ReseArch (PANDoRA) and Pancreatic Cancer Case-Control (PanC4) Consortia uncovered 3 new pancreatic cancer risk signals marked by single nucleotide polymorphisms (SNPs) rs2816938 at chromosome 1q32.1 (per allele odds ratio (OR) = 1.20, P = 4.88 × 10^-15), rs10094872 at 8q24.21 (OR = 1.15, P = 3.22 × 10^-9) and rs35226131 at 5p15.33 (OR = 0.71, P = 1.70 × 10^-8). These SNPs represent independent risk variants at previously identified pancreatic cancer risk loci on chr1q32.1 (NR5A2), chr8q24.21 (MYC) and chr5p15.33 (CLPTM1L-TERT) as per analyses conditioned on previously reported susceptibility variants. We assessed expression of candidate genes at the three risk loci in histologically normal (n = 10) and tumor (n = 8) derived pancreatic tissue samples and observed a marked reduction of NR5A2 expression (chr1q32.1) in the tumors (fold change -7.6, P = 5.7 × 10^-8). This finding was validated in a second set of paired (n = 20) histologically normal and tumor derived pancreatic tissue samples (average fold change for three NR5A2 isoforms -31.3 to -95.7, P = 7.5 × 10^-4 to 2.0 × 10^-3). Our study has identified new susceptibility variants independently conferring pancreatic cancer risk that merit functional follow-up to identify target genes and explain the underlying biology.

Keywords
GWAS, NR5A2, fine-mapping, imputation, pancreatic cancer
MeSH
Genome-Wide Association Study (methods)
Datasets as Topic
Genetic Predisposition to Disease (genetics)
Genotype
Polymorphism, Single Nucleotide (genetics)
Humans
Chromosomes, Human, Pair 1 (genetics)
Chromosomes, Human, Pair 5 (genetics)
Chromosomes, Human, Pair 8 (genetics)
Pancreatic Neoplasms (genetics)
Check Tag
Humans
Publication type
Journal Article

Article

A machine learning approach to fill gaps in dendrometer data

Trees (Berlin, Germany. 2024 ; 38 (6) : 1557-1567. [epub] 20241015

Trees (Berl West)
ISSN 0931-1890
Source

KEY MESSAGE: The machine learning algorithm extreme gradient boosting can be employed to address the issue of long data gaps in individual trees, without the need for additional tree-growth data or climatic variables. ABSTRACT: The susceptibility of dendrometer devices to technical failures often makes time-series analyses challenging. Resulting data gaps decrease sample size and complicate time-series comparison and integration. Existing methods either focus on bridging smaller gaps, are dependent on data from other trees or rely on climate parameters. In this study, we test eight machine learning (ML) algorithms to fill gaps in dendrometer data of individual trees in urban and non-urban environments. Among these algorithms, extreme gradient boosting (XGB) demonstrates the best skill to bridge artificially created gaps throughout the growing seasons of individual trees. The individual tree models are suited to fill gaps up to 30 consecutive days and perform particularly well at the start and end of the growing season. The method is independent of climate input variables or dendrometer data from neighbouring trees. The varying limitations among existing approaches call for cross-comparison of multiple methods and visual control. Our findings indicate that ML is a valid approach to fill gaps in individual trees, which can be of particular importance in situations of limited inter-tree co-variance, such as in urban environments. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00468-024-02573-y.

Keywords
Acer platanoides, Dendroecology, Imputation, Platanus x hispanica, Tree growth, Urban trees
Publication type
Journal Article

Article

Residential Altitude Associates With Endurance but Not Muscle Power in Young Swiss Men

Frontiers in physiology. 2020 ; 11 () : 860. [epub] 20200723

Front Physiol
ISSN 1664-042X
Source

INTRODUCTION: Physical fitness benefits health. However, there is a research gap on how physical fitness, particularly aerobic endurance capacity and muscle power, is influenced by residential altitude, blood parameters, weight, and other cofactors in a population living at low to moderate altitudes (300-2100 masl). MATERIALS AND METHODS: We explored how endurance and muscle power performance changes with residential altitude, Body Mass Index (BMI), hemoglobin and creatinine levels among 108,677 Swiss men aged 18-22 years (covering >90% of Swiss birth cohorts) conscripted to the Swiss Armed Forces between 2007 and 2012. The test battery included a blood test of about 65%, a physical evaluation of about 85%, and the BMI of all conscripts. RESULTS: Residential altitude was significantly associated with endurance (p < 0.001) but not with muscle power performance (p = 0.858) after adjusting for all available cofactors. Higher BMI showed the greatest negative association with both endurance and muscle power performance. For muscle power performance, the association with creatinine levels was significant. Elevated C-reactive protein (CRP) and hemoglobin levels were stronger contributors in explaining endurance than muscle power performance. CONCLUSION: We found a significant association between low to moderate residential altitude and aerobic endurance capacity even after adjustment for hemoglobin, creatinine, BMI and sociodemographic factors. Non-assessed factors such as vitamin D levels, air pollution, and lifestyle aspects may explain the presented remaining association partially and could also be associated with residential altitude. Monitoring the health and fitness of young people and their determinants is important and of practical concern for disease prevention and public health implications.

Keywords
C-reactive protein, Switzerland, VO2max, general additive models, hemoglobin, multiple imputation
Publication type
Journal Article
