• This record comes from PubMed

Understanding bacterial pathogen diversity: A proteogenomic analysis and use of an array of genome assemblies to identify novel virulence factors of the honey bee bacterial pathogen Paenibacillus larvae

2024 Jul; 24(14): e2300280. [epub] 20240514

Language: English; Country: Germany; Media: print-electronic

Document type: Journal Article

Grant support
QK1710228 Národní Agentura pro Zemědělský Výzkum
QK1910018 Národní Agentura pro Zemědělský Výzkum
RO0423 Ministerstvo Zemědělství

Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I-IV genotypes of the honey bee bacterial pathogen Paenibacillus larvae and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of P. larvae can exhibit specific traits that set them apart from the established genotypes ERIC I-V.
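The grouping step described in the abstract, where protein hits are categorized by which database components identified them, can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the UpSet tooling itself; it only reproduces the underlying idea of exclusive membership patterns in plain Python. The component names, hit identifiers, and both function names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def group_by_membership(hits):
    """Group protein hits by the exact set of database components that identified them.

    `hits` maps a hit identifier to the set of component names that found it.
    Each distinct set corresponds to one exclusive intersection in an UpSet-style plot.
    """
    groups = defaultdict(list)
    for hit_id, components in hits.items():
        groups[frozenset(components)].append(hit_id)
    return dict(groups)


def variable_hits(groups, all_components):
    """Return hits identified by some, but not all, components.

    Such hits vary between search databases and are the kind of
    candidates the abstract proposes for marker mining.
    """
    full = frozenset(all_components)
    return sorted(
        hit_id
        for members, ids in groups.items()
        if members and members != full
        for hit_id in ids
    )


# Hypothetical toy input: four components instead of the 18 used in the study.
components = ["RefSeq", "UniProt", "assembly_A", "assembly_B"]
hits = {
    "hit_1": {"RefSeq", "UniProt", "assembly_A", "assembly_B"},
    "hit_2": {"assembly_A"},
    "hit_3": {"RefSeq", "UniProt", "assembly_A", "assembly_B"},
    "hit_4": {"UniProt", "assembly_B"},
}

groups = group_by_membership(hits)

# Print the largest groups first, as an UpSet plot would order its bars.
for members, ids in sorted(groups.items(), key=lambda kv: (-len(kv[1]), sorted(kv[0]))):
    print(sorted(members), ids)

print("Variable between components:", variable_hits(groups, components))
```

With real data, the input mapping would be derived from the search engine's protein-group output, one entry per hit, after applying whatever filtering the analysis requires (for example, the unique-peptide criterion mentioned above).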

