Comprehensive assessment of the role of spectral data pre-processing in spectroscopy-based liquid biopsy
Language: English. Country: Great Britain, England. Medium: print-electronic
Document type: journal article
PubMed: 40273765
DOI: 10.1016/j.saa.2025.126261
PII: S1386-1425(25)00567-0
- Keywords
- Chiroptical spectroscopy, Classification, Data pre-processing, Diagnostics, Liquid biopsy, Machine learning, Vibrational spectroscopy
- MeSH
- algorithms MeSH
- hepatocellular carcinoma * diagnosis pathology MeSH
- liver cirrhosis diagnosis pathology MeSH
- humans MeSH
- least-squares analysis MeSH
- liver neoplasms * diagnosis pathology MeSH
- Raman spectroscopy * methods MeSH
- infrared spectrophotometry methods MeSH
- liquid biopsy methods MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Spectroscopic data often contain artifacts or noise related to sample characteristics, instrumental variations, or experimental design flaws. Classifying the raw data is therefore not recommended and might lead to biased results. Most of these issues can, however, be addressed through appropriate data pre-processing. Effective pre-processing is particularly crucial in critical applications such as liquid biopsy for disease detection, where even minor performance improvements may impact patient outcomes. Unfortunately, there is no consensus regarding optimal pre-processing, which complicates cross-study comparisons. This study presents a comprehensive evaluation of various pre-processing methods and their combinations to assess their influence on classification results. The goal was to identify whether some pre-processing methods are associated with better classification performance and to find an optimal strategy for the given data. Data from Raman optical activity, infrared spectroscopy, and Raman spectroscopy were processed by applying tens of thousands of possible pre-processing pipelines. The resulting data were classified using three algorithms to distinguish between subjects with liver cirrhosis and those who had developed hepatocellular carcinoma. The results showed that certain pre-processing methods frequently appeared in the top-ranked pipelines, such as the Rolling Ball for correcting the baseline of Raman spectra, or the Doubly Reweighted Penalized Least Squares and the Mixture model in the case of Raman optical activity. On the other hand, the choice of filtering and/or normalization approach usually did not have a significant impact. Nonetheless, the composition of the top-scoring pipelines also depended on the classifier used. The best pipelines yielded an area under the receiver operating characteristic curve (AUROC) of 0.775-0.823, varying with the evaluated spectroscopic data and classifier.
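The evaluation design described in the abstract (enumerating combinations of baseline correction, filtering, and normalization, then scoring each combination) can be sketched as below. This is a minimal illustration in pure Python, not the authors' implementation. Everything in it is an assumption made for the example: the function names, the window sizes, the synthetic spectra, and the use of a single peak intensity as a stand-in for a trained classifier. The `moving_min_max_baseline` function is a crude morphological opening, only loosely related to the Rolling Ball method named in the abstract, and the AUROC is computed in-sample with no cross-validation.

```python
import itertools
import math
import random


def identity(spectrum):
    """No-op step, so that 'skip this stage' is one of the options."""
    return list(spectrum)


def moving_min_max_baseline(spectrum, half_window=5):
    """Subtract a baseline estimated by a moving minimum followed by a
    moving maximum (morphological opening). A simplified stand-in only."""
    n = len(spectrum)
    eroded = [min(spectrum[max(0, i - half_window): i + half_window + 1])
              for i in range(n)]
    opened = [max(eroded[max(0, i - half_window): i + half_window + 1])
              for i in range(n)]
    return [s - b for s, b in zip(spectrum, opened)]


def moving_average(spectrum, half_window=2):
    """Simple smoothing filter; edge windows are truncated."""
    out = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half_window): i + half_window + 1]
        out.append(sum(window) / len(window))
    return out


def vector_normalize(spectrum):
    """Scale the spectrum to unit Euclidean norm."""
    norm = sum(v * v for v in spectrum) ** 0.5
    return [v / norm for v in spectrum] if norm > 0 else list(spectrum)


def auroc(scores, labels):
    """AUROC as the probability that a random positive sample scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical option table: one dict of alternatives per pipeline stage.
STEPS = {
    "baseline": {"none": identity, "opening": moving_min_max_baseline},
    "filter": {"none": identity, "moving_average": moving_average},
    "normalization": {"none": identity, "vector": vector_normalize},
}


def enumerate_pipelines():
    """Yield (names, functions) for every combination of stage options."""
    options = [list(STEPS[stage].items()) for stage in STEPS]
    for combo in itertools.product(*options):
        yield tuple(name for name, _ in combo), [f for _, f in combo]


def apply_pipeline(funcs, spectrum):
    for f in funcs:
        spectrum = f(spectrum)
    return spectrum


def make_spectrum(label, rng, n_points=60):
    """Synthetic spectrum: random sloped baseline, one Gaussian peak whose
    height depends on the class label, plus Gaussian noise."""
    slope = rng.uniform(0.0, 0.05)
    offset = rng.uniform(0.0, 2.0)
    height = 1.0 + 0.5 * label
    return [offset + slope * i
            + height * math.exp(-((i - 30) ** 2) / 18.0)
            + rng.gauss(0.0, 0.05)
            for i in range(n_points)]


def evaluate_all(spectra, labels, peak_index=30):
    """Score every pipeline, using the processed intensity at one channel
    as a toy decision score in place of a trained classifier."""
    results = {}
    for names, funcs in enumerate_pipelines():
        scores = [apply_pipeline(funcs, s)[peak_index] for s in spectra]
        results[names] = auroc(scores, labels)
    return results


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
spectra = [make_spectrum(y, rng) for y in labels]
results = evaluate_all(spectra, labels)
for names, score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(names, round(score, 3))
```

With three stages of two options each, the sketch evaluates eight pipelines. The same `itertools.product` pattern scales to a much larger grid once more methods and parameter settings are added to each stage, which is how a search can reach tens of thousands of combinations.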