A computational workflow for analysis of missense mutations in precision oncology
Status: PubMed-not-MEDLINE; Language: English; Country: England, United Kingdom; Medium: electronic
Document type: journal articles
Grant support
NU20-03-00240
Ministerstvo Zdravotnictví České Republiky
EXCELES LX22NPO5102
European Union
TEAMING CZ.02.1.01/0.0/0.0/17_043/0009632; ESFRI CZECRIN LM2023049; ESFRI eINFRA LM2018140
Ministerstvo Školství, Mládeže a Tělovýchovy
TREND FW03010208; PERMED TN02000109
Technology Agency of the Czech Republic
TEAMING 857560
Horizon 2020, European Union
PubMed
39075588
PubMed Central
PMC11285293
DOI
10.1186/s13321-024-00876-3
PII: 10.1186/s13321-024-00876-3
- Keywords
- Bioinformatics, Cancer, Function, High-performance computing, Machine learning, Molecular modelling, Oncology, Personalised medicine, Single nucleotide polymorphism, Stability, Treatment
- Publication type
- journal articles
Every year, more than 19 million cancer cases are diagnosed, and this number continues to increase annually. Since standard treatment options have varying success rates for different types of cancer, understanding the biology of an individual's tumour becomes crucial, especially for cases that are difficult to treat. Personalised high-throughput profiling, using next-generation sequencing, allows for a comprehensive examination of biopsy specimens. Furthermore, the widespread use of this technology has generated a wealth of information on cancer-specific gene alterations. However, there exists a significant gap between identified alterations and their proven impact on protein function. Here, we present a bioinformatics pipeline that enables fast analysis of a missense mutation's effect on stability and function in known oncogenic proteins. This pipeline is coupled with a predictor that summarises the outputs of different tools used throughout the pipeline, providing a single probability score, achieving a balanced accuracy above 86%. The pipeline incorporates a virtual screening method to suggest potential FDA/EMA-approved drugs to be considered for treatment. We showcase three case studies to demonstrate the timely utility of this pipeline. To facilitate access and analysis of cancer-related mutations, we have packaged the pipeline as a web server, which is freely available at https://loschmidt.chemi.muni.cz/predictonco/.
Scientific contribution
This work presents a novel bioinformatics pipeline that integrates multiple computational tools to predict the effects of missense mutations on proteins of oncological interest. The pipeline uniquely combines fast protein modelling, stability prediction, and evolutionary analysis with virtual drug screening, while offering actionable insights for precision oncology. This comprehensive approach surpasses existing tools by automating the interpretation of mutations and suggesting potential treatments, thereby striving to bridge the gap between sequencing data and clinical application.
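The abstract reports the predictor's performance as balanced accuracy but does not define the metric. As a minimal sketch, assuming the standard binary definition (the mean of sensitivity and specificity), it could be computed as follows. The function name and the label vectors are hypothetical illustrations, not data or code from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical example: 4 positive and 8 negative mutations (made-up labels).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.4f}, plain accuracy: {plain:.4f}")
```

Because each class contributes equally regardless of its size, this metric is less flattered by class imbalance than plain accuracy, which is presumably why it is reported for a mutation-effect classifier.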
Central European Institute of Technology Masaryk University Brno Czech Republic
Department of Biology Faculty of Medicine Masaryk University Brno Czech Republic
International Clinical Research Center St Anne's University Hospital Brno Brno Czech Republic
Loschmidt Laboratories RECETOX Faculty of Science Masaryk University Brno Czech Republic