Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data
Language English Country United States Medium print-electronic
Document type journal articles, Research Support, N.I.H., Extramural, grant-supported work
Grant support
P20 HL113452
NHLBI NIH HHS - United States
R01 HL091357
NHLBI NIH HHS - United States
U01 HL072524
NHLBI NIH HHS - United States
U2C ES030158
NIEHS NIH HHS - United States
PubMed
30758187
PubMed Central
PMC9652764
DOI
10.1021/acs.analchem.8b05592
- MeSH
- datasets as topic / standards MeSH
- experimental error / statistics and numerical data MeSH
- lipidomics / standards MeSH
- quality control * MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
- Research Support, N.I.H., Extramural MeSH
Large-scale untargeted lipidomics experiments involve the measurement of hundreds to thousands of samples. Such data sets are usually acquired on one instrument over days or weeks of analysis time. These extended acquisitions introduce a variety of systematic errors, including batch differences, longitudinal drifts, and even instrument-to-instrument variation. The resulting technical variance can obscure the true biological signal and hinder biological discoveries. To combat this issue, we present a novel normalization approach based on quality control (QC) pool samples. The method, systematic error removal using random forest (SERRF), eliminates unwanted systematic variation in large sample sets. We compared SERRF with 15 other commonly used normalization methods using six lipidomics data sets from three large cohort studies (832, 1162, and 2696 samples). SERRF reduced the average technical error for these data sets to 5% relative standard deviation. We conclude that SERRF outperforms the other existing methods and can significantly reduce unwanted systematic variation, revealing the biological variance of interest.
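The abstract reports technical error as relative standard deviation (RSD). As a rough illustration of that metric only, and not of SERRF itself, the sketch below computes a per-lipid RSD across repeated QC injections and averages it over lipids. The lipid names and intensities are hypothetical, and two choices are assumptions rather than details taken from the record: the sample standard deviation (n - 1 denominator) is used, and the per-lipid values are averaged with an arithmetic mean.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD, in percent, of repeated intensity measurements.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero; RSD is undefined")
    return 100.0 * stdev(values) / m


def average_qc_rsd(qc_intensities):
    """Average the per-lipid RSD over all lipids measured in the QC pool."""
    return mean(relative_standard_deviation(v) for v in qc_intensities.values())


# Hypothetical peak intensities of two lipids across four QC injections.
qc_example = {
    "lipid_A": [100.0, 120.0, 80.0, 100.0],
    "lipid_B": [50.0, 55.0, 45.0, 50.0],
}

if __name__ == "__main__":
    print(f"average QC RSD: {average_qc_rsd(qc_example):.2f}%")
```

Because the QC pool is the same material injected repeatedly, any spread in its intensities is technical rather than biological, so a lower average QC RSD after normalization suggests less residual systematic error.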