FireProtDB: database of manually curated protein stability data
Language: English. Country: England, United Kingdom. Medium: print
Document type: journal article, grant-supported work
PubMed: 33166383
PubMed Central: PMC7778887
DOI: 10.1093/nar/gkaa981
PII: 5964070
- MeSH
- Molecular Sequence Annotation MeSH
- Point Mutation * MeSH
- Databases, Protein * MeSH
- Datasets as Topic MeSH
- Internet MeSH
- Models, Molecular MeSH
- Proteins: chemistry, genetics MeSH
- Software MeSH
- Protein Stability MeSH
- Machine Learning: statistics and numerical data MeSH
- Computational Biology: methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- Proteins MeSH
The majority of naturally occurring proteins have evolved to function under mild conditions inside living organisms. One of the critical obstacles to the use of proteins in biotechnological applications is their insufficient stability at elevated temperatures or in the presence of salts. Since experimental screening for stabilizing mutations is typically laborious and expensive, in silico predictors are often used to narrow down the mutational landscape. Recent advances in machine learning and artificial intelligence further facilitate the development of such computational tools. However, the accuracy of these predictors strongly depends on the quality and amount of data used for training and testing, which have often been reported as the current bottleneck of the approach. To address this problem, we present FireProtDB, a novel database of experimental thermostability data for single-point mutants. The database combines published datasets, data extracted manually from the recent literature, and data collected in our laboratory. Its user interface is designed to facilitate both expected types of use: (i) interactive exploration of individual entries at the level of a protein or mutation and (ii) the construction of highly customized, machine learning-friendly datasets using advanced searching and filtering. The database is freely available at https://loschmidt.chemi.muni.cz/fireprotdb.
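The second use case, building a filtered dataset for machine learning, might look roughly like the sketch below. It is only an illustration: the record fields (`protein`, `mutation`, `ddg`, `ph`), the sample values, and the `build_dataset` helper are all hypothetical and do not reflect the actual FireProtDB export format or interface.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are illustrative only,
# not the actual FireProtDB export schema.
RECORDS = [
    {"protein": "P1", "mutation": "A10G", "ddg": 1.2, "ph": 7.0},
    {"protein": "P1", "mutation": "A10G", "ddg": 0.8, "ph": 7.0},
    {"protein": "P1", "mutation": "L25F", "ddg": None, "ph": 7.0},
    {"protein": "P2", "mutation": "K3E", "ddg": -0.5, "ph": 4.0},
    {"protein": "P2", "mutation": "D40N", "ddg": -2.0, "ph": 7.5},
]


def build_dataset(records, ph_range=(6.0, 8.0)):
    """Keep measurements that have a ddG value and fall inside the pH window,
    then average repeated measurements of the same mutation."""
    lo, hi = ph_range
    grouped = defaultdict(list)
    for rec in records:
        if rec["ddg"] is None:
            continue  # skip entries without a stability measurement
        if not (lo <= rec["ph"] <= hi):
            continue  # skip entries measured outside the chosen pH window
        grouped[(rec["protein"], rec["mutation"])].append(rec["ddg"])
    # One row per (protein, mutation) pair, in sorted order
    return {key: mean(values) for key, values in sorted(grouped.items())}


dataset = build_dataset(RECORDS)
for (protein, mutation), ddg in dataset.items():
    print(f"{protein}\t{mutation}\t{ddg:.2f}")
```

Averaging repeated measurements is one simple way to avoid the same mutation appearing several times in a training set; other aggregation or exclusion rules may suit a given study better.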