Improving protein-ligand binding site prediction accuracy by classification of inner pocket points using local features
Status: PubMed-not-MEDLINE; Language: English; Country: England, Great Britain; Medium: electronic-ecollection
Document type: journal articles
PubMed
25932051
PubMed Central
PMC4414931
DOI
10.1186/s13321-015-0059-5
PII: 59
- Keywords
- Binding site prediction, Ligand binding site, Machine learning, Molecular recognition, Pocket score, Protein pocket, Random forests
- Publication type
- journal articles MeSH
BACKGROUND: Protein-ligand binding site prediction from a 3D protein structure plays a pivotal role in rational drug design and can help in predicting drug side effects or elucidating protein function. Embedded within the binding site detection problem is the problem of pocket ranking: how to score and sort candidate pockets so that the best-scored predictions correspond to true ligand binding sites. Although multiple pocket detection algorithms exist, most employ a fairly simple ranking function, which leads to sub-optimal prediction results.
RESULTS: We have developed a new pocket scoring approach (named PRANK) that prioritizes putative pockets according to their probability of binding a ligand. The method first carefully selects pocket points and labels them with physico-chemical characteristics of their local neighborhood. A Random Forests classifier is then applied to assign a ligandability score to each selected pocket point. Finally, the ligandability scores are merged into a pocket score that is used to prioritize the putative pockets. Experimental results on multiple datasets demonstrate that applying our method as a post-processing step greatly increases the prediction quality of Fpocket and ConCavity, two state-of-the-art protein-ligand binding site prediction algorithms.
CONCLUSIONS: The positive experimental results show that our method can be used to improve the success rate, validity and applicability of existing protein-ligand binding site prediction tools. The method was implemented as a stand-alone program that currently supports Fpocket and ConCavity out of the box and is easily extensible to other tools. PRANK is freely available at http://siret.ms.mff.cuni.cz/prank.
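The re-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the PRANK implementation: the names `Pocket`, `pocket_score`, `rerank` and `toy_classifier` are hypothetical, the classifier is a stand-in for a trained Random Forests model, and merging point scores by a sum of squares is an assumption, since the abstract does not say how the ligandability scores are combined.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Pocket:
    """A putative pocket reported by a detection tool (hypothetical structure)."""
    name: str
    points: List[Sequence[float]]  # one local feature vector per inner pocket point


def pocket_score(ligandabilities: Sequence[float]) -> float:
    """Merge per-point ligandability scores into one pocket score.

    Assumption: a sum of squares, which favours points with high scores.
    """
    return sum(p * p for p in ligandabilities)


def rerank(pockets: List[Pocket],
           predict_ligandability: Callable[[Sequence[float]], float]) -> List[str]:
    """Return pocket names ordered from the highest pocket score to the lowest."""
    scored = [
        (pocket_score([predict_ligandability(f) for f in pocket.points]), pocket.name)
        for pocket in pockets
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored]


def toy_classifier(features: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained classifier: clipped mean of the features."""
    return max(0.0, min(1.0, sum(features) / len(features)))


# Two candidate pockets in the order a detection tool might have reported them.
pockets = [
    Pocket("pocket1", [[0.2, 0.4], [0.2, 0.2], [0.4, 0.2], [0.2, 0.2]]),
    Pocket("pocket2", [[0.9, 0.7], [0.8, 0.8]]),
]
ranking = rerank(pockets, toy_classifier)
print(ranking)
```

In this toy input the smaller pocket moves to the top because its points score higher individually, which is the kind of reordering a post-processing ranker is meant to produce.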