AggreProt: a web server for predicting and engineering aggregation prone regions in proteins
Language: English. Country: England, United Kingdom. Medium: print
Document type: journal article
Grant support
857560
Horizon 2020 Framework Programme
FW03010208
Technology Agency of the Czech Republic
CETOCOEN EXCELLENCE CZ.02.1.01/0.0/0.0/17_043/0009
Ministry of Education
LX22NPO5
European Union - Next Generation EU
PubMed
38801076
PubMed Central
PMC11223854
DOI
10.1093/nar/gkae420
PII: 7683054
- MeSH terms
- Algorithms
- Internet *
- Protein Conformation
- Neural Networks, Computer
- Protein Aggregates *
- Protein Engineering / methods
- Proteins / chemistry / genetics
- Solubility
- Protein Folding
- Software *
- Publication type
- Journal Article
- Substance names
- Protein Aggregates *
- Proteins
Recombinant proteins play pivotal roles in numerous applications, including industrial biocatalysts and therapeutics. Despite recent progress in computational protein structure prediction, protein solubility and reduced aggregation propensity remain challenging attributes to design. Identification of aggregation-prone regions is essential for understanding misfolding diseases and for designing efficient protein-based technologies, and as such has a great socio-economic impact. Here, we introduce AggreProt, a user-friendly webserver that automatically exploits an ensemble of deep neural networks to predict aggregation-prone regions (APRs) in protein sequences. Trained on experimentally evaluated hexapeptides, AggreProt matches or outperforms state-of-the-art algorithms on two independent benchmark datasets. The server provides per-residue aggregation profiles along with information on solvent accessibility and transmembrane propensity, within an intuitive interface with interactive sequence and structure viewers for comprehensive analysis. We demonstrate AggreProt's efficacy in predicting differential aggregation behaviours in proteins in several use cases, which highlight its potential for guiding protein engineering strategies towards decreased aggregation propensity and improved solubility. The webserver is freely available at https://loschmidt.chemi.muni.cz/aggreprot/.
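The abstract describes a predictor trained on hexapeptides that reports a per-residue aggregation profile. One common way to turn window-level scores into such a profile is to slide a six-residue window along the sequence and average, for each residue, the scores of all windows that cover it. The Python sketch below illustrates only that general idea and is not AggreProt's implementation: the function names (`toy_hexapeptide_score`, `per_residue_profile`, `regions_above`) are hypothetical, and the scorer is a toy stand-in based on the Kyte-Doolittle hydropathy scale rather than a neural network ensemble.

```python
from typing import Callable, List, Tuple

# Kyte-Doolittle hydropathy scale. ASSUMPTION: used purely as a toy stand-in
# for a trained hexapeptide scorer; it is not what AggreProt uses.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def toy_hexapeptide_score(peptide: str) -> float:
    """Hypothetical scorer: mean hydropathy of the window.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return sum(KD[aa] for aa in peptide) / len(peptide)


def per_residue_profile(
    sequence: str,
    score_window: Callable[[str], float] = toy_hexapeptide_score,
    window: int = 6,
) -> List[float]:
    """Average, for each residue, the scores of all windows covering it."""
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the scoring window")
    totals = [0.0] * len(seq)
    counts = [0] * len(seq)
    for start in range(len(seq) - window + 1):
        score = score_window(seq[start:start + window])
        for i in range(start, start + window):
            totals[i] += score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def regions_above(
    profile: List[float], threshold: float, min_length: int = 6
) -> List[Tuple[int, int]]:
    """Return (start, end) runs, 0-based and end-exclusive, where the
    profile stays at or above the threshold for at least min_length residues."""
    regions = []
    start = None
    for i, value in enumerate(profile):
        if value >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_length:
        regions.append((start, len(profile)))
    return regions
```

Any callable that maps a hexapeptide string to a number can be passed as `score_window`, so a trained model could replace the toy scorer without changing the windowing logic. The threshold and minimum region length are illustrative parameters, not values taken from the paper.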