AlphaFind: discover structure similarity across the proteome in AlphaFold DB
Language: English; Country: England, United Kingdom; Medium: print
Document type: journal article
Grant support
GF23-07040K Czech Science Foundation
LM2023055 Ministry of Education, Youth and Sports of the Czech Republic
90254 Ministry of Education, Youth and Sports of the Czech Republic
Masaryk University
Oxford University Press
PubMed: 38747341
PubMed Central: PMC11223785
DOI: 10.1093/nar/gkae397
PII: 7673488
- MeSH
- Databases, Protein * MeSH
- Internet MeSH
- Protein Conformation MeSH
- Models, Molecular MeSH
- Proteins: chemistry, genetics, metabolism MeSH
- Proteome *: chemistry, genetics MeSH
- Protein Folding MeSH
- Software * MeSH
- Machine Learning MeSH
- Structural Homology, Protein MeSH
- Search Engine MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Proteins MeSH
- Proteome * MeSH
AlphaFind is a web-based search engine that provides fast structure-based retrieval across the entire set of AlphaFold DB structures. Unlike other protein processing tools, AlphaFind focuses entirely on tertiary structure: it automatically extracts the main 3D features of each protein chain and uses a machine learning model to find the most similar structures. Both the indexing approach and the 3D feature extraction method used by AlphaFind have demonstrated scalability to large datasets as well as to large protein structures. The web application itself is designed for clarity and ease of use. The search accepts any valid UniProt ID, Protein Data Bank ID or gene symbol as input and returns a set of similar protein chains from AlphaFold DB, together with several similarity metrics between the query and each retrieved result. In addition to the main search functionality, the application provides 3D visualizations of protein structure superpositions, so that researchers can immediately analyze the structural similarity of the retrieved results. The AlphaFind web application is freely available online, without registration, at https://alphafind.fi.muni.cz.
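The abstract states that a query may be a UniProt ID, a Protein Data Bank ID or a gene symbol. As an illustration only, the following minimal sketch shows one way such input could be told apart before a search is dispatched. It is not AlphaFind's actual implementation: the function name `classify_query` and the returned labels are hypothetical, the UniProt pattern is the accession format published in the UniProt documentation, and the PDB pattern assumes the classic four-character identifier.

```python
import re

# UniProt accession format as given in the UniProt documentation.
UNIPROT_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Assumption: classic 4-character PDB ID (one digit, then three alphanumerics).
PDB_RE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def classify_query(query: str) -> str:
    """Guess the identifier type of a search query (hypothetical helper)."""
    text = query.strip()
    if not text:
        raise ValueError("empty query")
    if UNIPROT_RE.fullmatch(text.upper()):
        return "uniprot"
    if PDB_RE.fullmatch(text):
        return "pdb"
    # Assumption: anything else is treated as a gene symbol.
    return "gene_symbol"


if __name__ == "__main__":
    for example in ("P69905", "1CRN", "TP53"):
        print(example, "->", classify_query(example))
```

A real service would still have to resolve PDB IDs and gene symbols to AlphaFold DB entries; that mapping step is outside the scope of this sketch.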
Faculty of Informatics, Masaryk University, Botanická 68A, Brno 60200, Czech Republic
Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 60200, Czech Republic