Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data

2019 Mar 05; 91 (5): 3590-3596. [Epub] 2019 Feb 19

Language: English. Country: United States. Medium: print-electronic

Document type: journal articles; Research Support, N.I.H., Extramural; grant-supported work

Persistent link   https://www.medvik.cz/link/pmid30758187

Grant support
P20 HL113452 NHLBI NIH HHS - United States
R01 HL091357 NHLBI NIH HHS - United States
U01 HL072524 NHLBI NIH HHS - United States
U2C ES030158 NIEHS NIH HHS - United States

Large-scale untargeted lipidomics experiments involve the measurement of hundreds to thousands of samples. Such data sets are usually acquired on one instrument over days or weeks of analysis time. Extended acquisition of this kind introduces a variety of systematic errors, including batch differences, longitudinal drifts, and even instrument-to-instrument variation. This technical variance can obscure the true biological signal and hinder biological discoveries. To combat this issue, we present a novel normalization approach based on quality control (QC) pool samples, called systematic error removal using random forest (SERRF), which eliminates unwanted systematic variation in large sample sets. We compared SERRF with 15 other commonly used normalization methods using six lipidomics data sets from three large cohort studies (832, 1162, and 2696 samples). SERRF reduced the average technical error for these data sets to 5% relative standard deviation. We conclude that SERRF outperforms other existing methods and can significantly reduce unwanted systematic variation, revealing the biological variance of interest.
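The abstract reports technical error as relative standard deviation (RSD) in QC samples. As a minimal sketch of how such a figure can be computed, the example below calculates the RSD of each lipid across repeated QC pool injections and then averages over lipids. The lipid names and intensity values are hypothetical and chosen only for illustration; they are not data from the study, and the sketch does not reproduce SERRF itself.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD (sample standard deviation / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical peak intensities of two lipids across five QC pool injections.
qc_intensities = {
    "lipid_A": [100.0, 110.0, 90.0, 105.0, 95.0],
    "lipid_B": [200.0, 260.0, 150.0, 230.0, 160.0],
}

# RSD per lipid, then the average RSD over all lipids.
rsd_per_lipid = {
    name: relative_standard_deviation(values)
    for name, values in qc_intensities.items()
}
average_rsd = mean(list(rsd_per_lipid.values()))

for name, rsd in rsd_per_lipid.items():
    print(f"{name}: RSD = {rsd:.1f}%")
print(f"average RSD = {average_rsd:.1f}%")
```

Because QC pool samples are aliquots of the same material, any spread in their measured intensities is technical rather than biological, so a lower average QC RSD after normalization indicates that more systematic error has been removed.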

