P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure

2018 Aug 14; 10(1): 39. [epub] 20180814

Status: PubMed-not-MEDLINE; Language: English; Country: Great Britain, England; Medium: electronic

Document type: journal articles

Persistent link: https://www.medvik.cz/link/pmid30109435

Grant support
1556217 Univerzita Karlova v Praze
SVV 260451 Univerzita Karlova v Praze

Links

PubMed 30109435
PubMed Central PMC6091426
DOI 10.1186/s13321-018-0285-8
PII: 10.1186/s13321-018-0285-8
Knihovny.cz e-resources

BACKGROUND: Ligand binding site prediction from protein structure has many applications related to the elucidation of protein function and to structure-based drug discovery. It often represents only one of many steps in complex computational drug design efforts. Although many methods have been published to date, only a few of them are suitable for use in automated pipelines or for processing large datasets. These use cases require stability and speed, which disqualifies many of the recently introduced tools that are either template-based or available only as web servers.

RESULTS: We present P2Rank, a stand-alone, template-free tool for the prediction of ligand binding sites based on machine learning. It predicts the ligandability of local chemical neighbourhoods centered on points placed on the solvent accessible surface of a protein. We show that P2Rank outperforms several existing tools, including two widely used stand-alone tools (Fpocket, SiteHound), a comprehensive consensus-based tool (MetaPocket 2.0), and a recent deep learning based method (DeepSite). P2Rank is among the fastest available tools (it requires under 1 s for prediction on one protein), with the additional advantage of a multi-threaded implementation.

CONCLUSIONS: P2Rank is a new open source software package for ligand binding site prediction from protein structure. It is available as a user-friendly stand-alone command line program and as a Java library. P2Rank has a lightweight installation and does not depend on other bioinformatics tools or on large structural or sequence databases. Thanks to its speed and its ability to make fully automated predictions, it is particularly well suited for processing large datasets or for use as a component of scalable structural bioinformatics pipelines.
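The point-based idea in the abstract (scoring points placed on the protein surface) can be illustrated with a minimal Python sketch. It assumes that surface points and their ligandability scores are already available; the trained model that would produce those scores is not reproduced here. The score threshold, the clustering radius, the sum-of-squares pocket score, and the function names `cluster_points` and `predict_pockets` are illustrative assumptions, not P2Rank's actual API or parameters.

```python
from math import dist


def cluster_points(points, radius):
    """Single-linkage clustering: points within `radius` of each other are linked."""
    clusters = []
    unvisited = set(range(len(points)))
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Collect all still-unassigned points close to the current one.
            near = [j for j in unvisited
                    if dist(points[current], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters


def predict_pockets(points, scores, threshold=0.5, radius=3.0):
    """Group high-scoring surface points into candidate pockets and rank them.

    `threshold` and `radius` are hypothetical defaults chosen for illustration.
    """
    # Keep only points whose (assumed, precomputed) ligandability is high enough.
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    kept_points = [points[i] for i in keep]

    pockets = []
    for cluster in cluster_points(kept_points, radius):
        members = [keep[k] for k in cluster]  # map back to original indices
        # Assumed pocket score: sum of squared point scores.
        pocket_score = sum(scores[i] ** 2 for i in members)
        center = tuple(
            sum(points[i][axis] for i in members) / len(members)
            for axis in range(3)
        )
        pockets.append({"points": members, "score": pocket_score, "center": center})

    # Highest-scoring pocket first.
    pockets.sort(key=lambda p: p["score"], reverse=True)
    return pockets


if __name__ == "__main__":
    # Toy input: two groups of high-scoring points and one low-scoring outlier.
    toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
                  (20.0, 0.0, 0.0), (21.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.1]
    for rank, pocket in enumerate(predict_pockets(toy_points, toy_scores), start=1):
        print(rank, pocket["points"], pocket["center"])
```

Ranking by an aggregate of point scores, rather than by pocket size alone, is one simple way to let a few strongly ligandable points outweigh many weak ones; other aggregation choices would fit the same outline.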


The 20 most recent citations


CryptoBench: cryptic protein-ligand binding sites dataset and benchmark
2024 Dec 26; 41(1).

Analysis of mutations in precision oncology using the automated, accurate, and user-friendly web tool PredictONCO
2024 Dec; 24: 734-738. [epub] 20241114

Large-scale annotation of biochemically relevant pockets and tunnels in cognate enzyme-ligand complexes
2024 Oct 15; 16(1): 114. [epub] 20241015

A computational workflow for analysis of missense mutations in precision oncology
2024 Jul 29; 16(1): 86. [epub] 20240729

Installation of LYRM proteins in early eukaryotes to regulate the metabolic capacity of the emerging mitochondrion
2024 May; 14(5): 240021. [epub] 20240522

PredictONCO: a web tool supporting decision-making in precision oncology by extending the bioinformatics predictions with advanced computing and machine learning
2023 Nov 22; 25(1).

PrankWeb 3: accelerated ligand-binding site predictions for experimental and modelled protein structures
2022 Jul 05; 50(W1): W593-W597.

PrankWeb: a web server for ligand binding site prediction and visualization
2019 Jul 02; 47(W1): W345-W349.
