Sequence-Specific Recognition of DNA by Proteins: Binding Motifs Discovered Using a Novel Statistical/Computational Analysis
Language: English. Country: United States. Medium: electronic-eCollection
Document type: journal article
PubMed: 27384774
PubMed Central: PMC4934765
DOI: 10.1371/journal.pone.0158704
PII: PONE-D-16-12829
- MeSH
- Adenine: chemistry, metabolism
- Amino Acid Motifs *
- Amino Acids: chemistry, metabolism
- Cytosine: chemistry, metabolism
- Databases, Protein
- DNA-Binding Proteins: chemistry, genetics, metabolism
- DNA: chemistry, genetics, metabolism
- Guanine: chemistry, metabolism
- Nucleic Acid Conformation
- Crystallography, X-Ray
- Models, Molecular
- Statistics as Topic: methods
- Protein Structure, Tertiary
- Thermodynamics
- Thymine: chemistry, metabolism
- Protein Binding
- Binding Sites: genetics
- Computational Biology: methods
- Publication type
- Journal Article
- Substance names
- Adenine
- Amino Acids
- Cytosine
- DNA-Binding Proteins
- DNA
- Guanine
- Thymine
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue-amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein-DNA complexes by means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties.
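The abstract refers to empirical potential-based interaction energies and to environments with different dielectric properties. As a rough illustration only, and not the authors' actual protocol, the sketch below evaluates a generic pairwise Coulomb plus Lennard-Jones energy between two small sets of point atoms and shows how a uniform relative permittivity rescales the electrostatic term while leaving the dispersion-repulsion term unchanged. The function name `interaction_energy`, the Lorentz-Berthelot combination rule, and every coordinate, charge, and Lennard-Jones parameter here are placeholder assumptions, not values from the paper.

```python
import math

# Approximate conversion factor so that q1*q2/r (charges in e, r in Å)
# yields kcal/mol; treated here as an assumed constant.
COULOMB_CONST = 332.06


def interaction_energy(atoms_a, atoms_b, epsilon_r=1.0):
    """Return (coulomb, lennard_jones) in kcal/mol between two atom sets.

    Each atom is a tuple (x, y, z, charge, sigma, eps).
    epsilon_r is a uniform relative permittivity (hypothetical simplification).
    """
    coulomb = 0.0
    lennard_jones = 0.0
    for xa, ya, za, qa, sigma_a, eps_a in atoms_a:
        for xb, yb, zb, qb, sigma_b, eps_b in atoms_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            # Electrostatics screened by the dielectric constant.
            coulomb += COULOMB_CONST * qa * qb / (epsilon_r * r)
            # Lorentz-Berthelot combination of per-atom parameters (assumption).
            sigma = 0.5 * (sigma_a + sigma_b)
            eps = math.sqrt(eps_a * eps_b)
            sr6 = (sigma / r) ** 6
            lennard_jones += 4.0 * eps * (sr6 * sr6 - sr6)
    return coulomb, lennard_jones


# Made-up single-atom "side chain" and "base" fragments, 3 Å apart.
side_chain = [(0.0, 0.0, 0.0, -0.5, 3.0, 0.10)]
base = [(3.0, 0.0, 0.0, 0.4, 3.2, 0.15)]

for eps_r in (1.0, 4.0, 80.0):
    coulomb, lj = interaction_energy(side_chain, base, eps_r)
    print(f"epsilon_r={eps_r:5.1f}  Coulomb={coulomb:8.3f}  LJ={lj:7.3f}")
```

In this simplified picture only the magnitude of the electrostatic contribution changes with the dielectric constant, which is one way to probe whether a ranking of interaction partners is sensitive to the assumed environment.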
EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Institute of Organic Chemistry and Biochemistry, Prague 6, Czech Republic
Amino Acid Interaction (INTAA) web server