• This record comes from PubMed

Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

BMC Bioinformatics. 2012 Jun 25;13 Suppl 10:S3. Epub 2012 Jun 25.

Language: English
Country: Great Britain, England
Media: electronic

Document type: Journal Article; Research Support, Non-U.S. Gov't

Links

PubMed 22759427
PubMed Central PMC3382442
DOI 10.1186/1471-2105-13-s10-s3
PII: 1471-2105-13-S10-S3

We contribute a novel ball-histogram approach to predicting the DNA-binding propensity of proteins. Unlike state-of-the-art methods, which construct an ad hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic Monte Carlo exploration of the spatial distribution of amino acids that comply with automatically selected properties. This exploration yields a model for predicting DNA-binding propensity. We validate our method in prediction experiments, where it improves on state-of-the-art accuracies. Moreover, the method provides interpretable features involving spatial distributions of selected amino acids.
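The Monte Carlo exploration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ball_histogram`, the input format (a list of residue types with 3D coordinates), the choice to draw ball centers from residue positions, and the toy data are all assumptions made for illustration. The sketch samples balls of a fixed radius, counts how many residues of each selected type fall inside each ball, and returns the normalized histogram of those count tuples, which could then serve as features for a classifier.

```python
import math
import random
from collections import Counter


def ball_histogram(residues, template, radius, n_samples=1000, seed=0):
    """Estimate a normalized histogram of per-ball residue counts.

    residues: list of (residue_type, (x, y, z)) tuples (hypothetical format).
    template: tuple of residue types to count, e.g. ("ARG", "LYS").
    radius:   ball radius, in the same units as the coordinates.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        # Assumption: ball centers are drawn uniformly from residue positions.
        _, center = rng.choice(residues)
        # One count per template entry: residues of that type inside the ball.
        key = tuple(
            sum(
                1
                for rtype, pos in residues
                if rtype == prop and math.dist(pos, center) <= radius
            )
            for prop in template
        )
        counts[key] += 1
    # Normalize so the histogram values sum to 1.
    return {key: n / n_samples for key, n in counts.items()}


# Toy data (made up): two charged residues close together, one glycine far away.
toy_protein = [
    ("ARG", (0.0, 0.0, 0.0)),
    ("LYS", (1.0, 0.0, 0.0)),
    ("GLY", (50.0, 0.0, 0.0)),
]
hist = ball_histogram(toy_protein, ("ARG", "LYS"), radius=5.0, n_samples=500)
```

In this toy case a ball centered on either charged residue contains one ARG and one LYS, while a ball centered on the distant glycine contains neither, so the histogram has at most two bins: `(1, 1)` and `(0, 0)`.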

