Detection of PatIent-Level distances from single cell genomics and pathomics data with Optimal Transport (PILOT)

2024 Feb; 20(2): 57-74. [epub] 20231219

Language English Country Germany Medium print-electronic

Document type journal articles

Persistent link   https://www.medvik.cz/link/pmid38177382

Grant support
KFO 5011 Deutsche Forschungsgemeinschaft (DFG)
IDs 322900939, 454024652, 432698239, 445703531 Deutsche Forschungsgemeinschaft (DFG)
DFG-GE2811/3 Deutsche Forschungsgemeinschaft (DFG)
E:med Consortia Fibromap Bundesministerium für Bildung und Forschung (BMBF)
STOP-FSGS-01GM2202C Bundesministerium für Bildung und Forschung (BMBF)
No 101001791 EC | ERC | HORIZON EUROPE European Research Council (ERC)

Links

PubMed 38177382
PubMed Central PMC10883279
DOI 10.1038/s44320-023-00003-8
PII: 10.1038/s44320-023-00003-8
Knihovny.cz E-resources

Although clinical applications represent the next challenge in single-cell genomics and digital pathology, we still lack computational methods to analyze single-cell or pathomics data to find sample-level trajectories or clusters associated with diseases. This remains challenging as single-cell/pathomics data are multi-scale, i.e., a sample is represented by clusters of cells/structures, and samples cannot be easily compared with each other. Here we propose PatIent Level analysis with Optimal Transport (PILOT). PILOT uses optimal transport to compute the Wasserstein distance between two individual single-cell samples. This allows us to perform unsupervised analysis at the sample level and uncover trajectories or cellular clusters associated with disease progression. We evaluate PILOT and competing approaches in single-cell genomics or pathomics studies involving various human diseases with up to 600 samples/patients and millions of cells or tissue structures. Our results demonstrate that PILOT detects disease-associated samples from large and complex single-cell or pathomics data. Moreover, PILOT provides a statistical approach to find changes in cell populations, gene expression, and tissue structures related to the trajectories or clusters, supporting the interpretation of predictions.
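The idea of comparing two samples with optimal transport can be illustrated with a minimal sketch. It is not the PILOT implementation. It assumes that each sample has already been summarized as a vector of cluster proportions and that a ground cost between clusters is supplied; the function names `wasserstein_distance` and `pairwise_sample_distances`, the toy proportions, and the toy cost matrix are all hypothetical. The transport problem is solved as a plain linear program with SciPy.

```python
import numpy as np
from scipy.optimize import linprog


def wasserstein_distance(p, q, cost):
    """Exact optimal transport cost between two discrete distributions.

    p, q : 1-D weights summing to 1 (assumed per-sample cluster proportions).
    cost : (len(p), len(q)) array of ground costs between clusters.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape

    # Marginal constraints on the flattened (row-major) transport plan T:
    # row i of T must sum to p[i], column j of T must sum to q[j].
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        a_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([p, q])

    # Minimize total transport cost subject to T >= 0 and the marginals.
    result = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None))
    if not result.success:
        raise ValueError(result.message)
    return float(result.fun)


def pairwise_sample_distances(proportions, cost):
    """Sample-by-sample distance matrix (assumes a symmetric ground cost)."""
    k = len(proportions)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = wasserstein_distance(proportions[i], proportions[j], cost)
            dist[i, j] = dist[j, i] = d
    return dist


# Toy data: three samples described by proportions over three clusters.
toy_proportions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
    [0.7, 0.2, 0.1],
]
# Toy ground cost: clusters placed on a line, cost = |i - j|.
toy_cost = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)

toy_distances = pairwise_sample_distances(toy_proportions, toy_cost)
```

A matrix such as `toy_distances` could then be passed to any clustering or embedding method that accepts precomputed distances, which is the kind of sample-level unsupervised analysis the abstract describes.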
