FireProtDB: database of manually curated protein stability data

2021 Jan 08; 49(D1): D319-D324.

Language: English. Country: England, Great Britain. Medium: print

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid33166383

The majority of naturally occurring proteins have evolved to function under mild conditions inside living organisms. One of the critical obstacles to the use of proteins in biotechnological applications is their insufficient stability at elevated temperatures or in the presence of salts. Since experimental screening for stabilizing mutations is typically laborious and expensive, in silico predictors are often used to narrow down the mutational landscape. Recent advances in machine learning and artificial intelligence further facilitate the development of such computational tools. However, the accuracy of these predictors strongly depends on the quality and amount of data used for training and testing, which has often been reported as the current bottleneck of the approach. To address this problem, we present FireProtDB, a novel database of experimental thermostability data for single-point mutants. The database combines published datasets, data extracted manually from the recent literature, and data collected in our laboratory. Its user interface is designed to facilitate both expected types of use: (i) interactive exploration of individual entries at the level of a protein or mutation and (ii) construction of highly customized, machine learning-friendly datasets using advanced searching and filtering. The database is freely available at https://loschmidt.chemi.muni.cz/fireprotdb.
