Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I-IV genotypes of the honey bee bacterial pathogen Paenibacillus larvae and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of P. larvae can exhibit specific traits that set them apart from the established genotypes ERIC I-V.
- MeSH
- Bacterial Proteins * genetics metabolism MeSH
- Databases, Protein MeSH
- Virulence Factors * genetics metabolism MeSH
- Genome, Bacterial * genetics MeSH
- Paenibacillus larvae * genetics pathogenicity metabolism MeSH
- Proteogenomics * methods MeSH
- Proteomics methods MeSH
- Bees microbiology MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication Type
- Journal Article MeSH
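The Paenibacillus larvae abstract above builds part of its search database by six-frame direct translation of genome assemblies. The following is a minimal sketch of that idea, not the authors' pipeline: the function names and the toy sequence are illustrative assumptions, and the codon assignments are those of the standard genetic code (the bacterial table 11 uses the same codon-to-amino-acid mapping and differs only in permitted start codons).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames.

    Keys are hypothetical frame labels: +1, +2, +3 for the forward strand
    and -1, -2, -3 for the reverse complement.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def stop_free_segments(protein, min_len=2):
    """Split a frame translation at stop codons and keep the longer pieces."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein, stop_free_segments(protein))
```

In a real search database each stop-free segment would typically be written out as a separate FASTA entry with a header recording the assembly, contig, frame, and coordinates; the minimum segment length used here is an arbitrary placeholder.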
The allergen repertoire of the house dust mite, Dermatophagoides farinae, is incomplete despite most mite allergens having been described in this species. Using proteogenomics, we aimed to compare proteins and allergens between sexes and provide a foundation for the identification of novel allergens. Overall, 6297 protein hits were identified, and 2899 and 886 were male- and female-specific, respectively. Removal of trace results narrowed the dataset to 3478 hits, including 275 and 157 male- and female-specific hits, respectively. All 34 WHO/IUIS-approved D. farinae allergens (omitting Der f 17) were identified, and we also identified homologs of the as-yet-undescribed Der f 9 and 38. Der f 27/serpin exhibited the largest sex-dependent difference and was dominant in females. Using official protein sequences, Der f 11, 14, 23, 28 and 30 were identified with low success. However, identification success of Der f 11 and 14 was greatly increased by using longer/complete sequences. Der f 30 is characterized by the same tryptic digests as the more abundant Der f 30 (isoform) identified here. Der f 23 appears to be of low abundance in mite bodies. Der f 28.0101 and Der f 28.0201 were detected at low abundance and in trace amounts, respectively. SIGNIFICANCE: In this work, we performed a proteogenomic annotation of the house dust mite, Dermatophagoides farinae, which is the most important source of house dust allergens. The proteogenomic analysis performed here provides a foundation not only for understanding the biology of the mite but also for identifying novel allergens. This study generated a robust proteomic dataset for D. farinae and reviewed existing and candidate allergens in this species. We stress some pitfalls of high-throughput analyses, especially that improper headers of allergen protein records provided in databases can lead to confusion.
Using partial sequences in proteomic identification and quantification can lead to low identification success (low signal intensity or MS/MS counts). Thus, we individually curated the protein sequences for proper identification and quantification. The discovered sex differences can be one factor affecting allergen/immunogen variations in mite extracts. Overall, this work provides a benchmark for accurate identification of mite immunogenic proteins using proteomics.
- MeSH
- Allergens genetics immunology metabolism MeSH
- Dermatophagoides farinae genetics immunology metabolism MeSH
- Arthropod Proteins genetics immunology metabolism MeSH
- Proteogenomics methods MeSH
- Proteome metabolism MeSH
- Pyroglyphidae genetics immunology metabolism MeSH
- Amino Acid Sequence MeSH
- Sequence Homology MeSH
- Sex Factors MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
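The Dermatophagoides farinae abstract above notes that partial database sequences reduce identification success and that two Der f 30 records share the same tryptic digests. A small in-silico digestion can illustrate both points. This is a sketch under stated assumptions: the sequences are invented toy strings (not real allergen sequences), the cleavage rule is the common simplification "cut after K or R unless followed by P", missed cleavages are ignored, and the minimum peptide length is an arbitrary placeholder.

```python
import re

# Zero-width split point: after K or R, unless the next residue is P.
# Splitting on an empty match requires Python 3.7 or newer.
TRYPSIN_SITE = re.compile(r"(?<=[KR])(?!P)")


def tryptic_peptides(protein, min_len=6):
    """Return the set of fully cleaved tryptic peptides of at least min_len residues."""
    return {pep for pep in TRYPSIN_SITE.split(protein) if len(pep) >= min_len}


def shared_fraction(full, partial, min_len=6):
    """Fraction of the full sequence's peptides that the partial record also yields."""
    full_peps = tryptic_peptides(full, min_len)
    if not full_peps:
        return 0.0
    return len(full_peps & tryptic_peptides(partial, min_len)) / len(full_peps)


# Hypothetical toy sequences, for illustration only.
FULL = "MSTNPEAKLVDEGHIRWYFQTLKPAGSVCDR"
PARTIAL = FULL[10:]  # a truncated database record starting mid-peptide

if __name__ == "__main__":
    print(sorted(tryptic_peptides(FULL)))
    print(sorted(tryptic_peptides(PARTIAL)))
    print(shared_fraction(FULL, PARTIAL))
```

With these toy inputs the truncated record can account for only one of the three peptides produced by the full sequence, and its first fragment is not a true tryptic peptide of the protein at all. Two records whose peptide sets are identical cannot be told apart by such peptides alone.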
Circulating tumor cells (CTCs) are rare cells that can be found in the peripheral blood of cancer patients. They have been demonstrated to be useful prognostic markers in many cancer types. Within the last decade various methods have been developed to detect rare cells within a liquid biopsy from a cancer patient. These methods have revealed the phenotypic diversity of CTCs and how they can represent the complement of cells that are found in a tumor. Single-cell proteogenomics has emerged as an all-encompassing next-generation technological approach for CTC research. This allows for the deconstruction of cellular heterogeneity, dynamics of metastatic initiation and progression, and response or resistance to therapeutics in clinical settings. We take advantage of this opportunity to investigate CTC heterogeneity and understand their full potential in precision medicine. The high-definition single-cell analysis (HD-SCA) workflow combines detection of the entire population of CTCs and rare cancer-related cells with single-cell genomic analysis and may therefore provide insight into their subpopulations based on molecular as well as morphological data. In this chapter we describe in detail the protocols from isolation of a candidate cell from a microscopy slide, through whole-genome amplification and library preparation, to CNV analysis of identified cells from the HD-SCA workflow. This process may also be applicable to any platform starting with a standard microscopy slide or isolated cell of interest.
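The final step named in the abstract above, CNV analysis of a single cell, is commonly approached by comparing binned read counts against a reference profile. The sketch below shows only that general idea and is not the HD-SCA protocol: the bin counts, pseudocount, and gain/loss thresholds are illustrative assumptions, and real workflows add steps such as GC correction and segmentation.

```python
import math
from statistics import median


def log2_ratios(cell_counts, ref_counts, pseudocount=0.5):
    """Median-normalised log2 ratio of cell versus reference read counts per bin.

    Assumes both profiles have the same bins and a non-zero median.
    """
    cell_median = median(cell_counts)
    ref_median = median(ref_counts)
    return [
        math.log2(((c + pseudocount) / cell_median) / ((r + pseudocount) / ref_median))
        for c, r in zip(cell_counts, ref_counts)
    ]


def call_bins(ratios, gain=0.3, loss=-0.3):
    """Label each bin as gain, loss, or neutral using assumed thresholds."""
    calls = []
    for ratio in ratios:
        if ratio >= gain:
            calls.append("gain")
        elif ratio <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


if __name__ == "__main__":
    # Hypothetical read counts for six genomic bins.
    cell = [100, 100, 200, 200, 50, 100]
    reference = [100, 100, 100, 100, 100, 100]
    print(call_bins(log2_ratios(cell, reference)))
```

Single-cell data after whole-genome amplification are noisy, so per-bin calls like these would normally be smoothed or segmented across neighbouring bins before interpretation.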