Automated shape-based clustering of 3D immunoglobulin protein structures in chronic lymphocytic leukemia

2018 Nov 20; 19(Suppl 14): 414. [Epub] 20181120

Language: English. Country: England, Great Britain. Medium: electronic

Document type: journal articles

Persistent link   https://www.medvik.cz/link/pmid30453883
Links

PubMed 30453883
PubMed Central PMC6245605
DOI 10.1186/s12859-018-2381-1
PII: 10.1186/s12859-018-2381-1

BACKGROUND: Although the etiology of chronic lymphocytic leukemia (CLL), the most common type of adult leukemia, is still unclear, strong evidence implicates antigen involvement in disease ontogeny and evolution. Both primary sequence analysis and 3D structure analysis have been used to look for indications of antigenic pressure. The latter has mostly relied on 3D models built from the amino acid sequences of the clonotypic B cell receptor immunoglobulin (BcR IG). Its accuracy therefore depends directly on the quality of the model construction algorithms and on the specific methods used to compare the resulting models. Thus far, reliable and robust methods for grouping IG 3D models by their structural characteristics have been missing.

RESULTS: Here we propose a novel method for clustering a set of proteins based on their 3D structure, focusing on 3D structures of BcR IG from a large series of patients with CLL. The method combines techniques from bioinformatics, 3D object recognition and machine learning. The clustering procedure is based on the extraction of 3D descriptors that encode various properties of the local and global geometrical structure of the proteins. The descriptors are extracted from aligned pairs of proteins. A combination of individual 3D descriptors is also used as an additional method. Comparison of the automatically generated clusters with manual annotation by experts shows higher accuracy with the 3D descriptors than with plain bioinformatics-based comparison. Accuracy increases further when the 3D descriptors are combined.

CONCLUSIONS: The experimental results confirm that 3D descriptors commonly used for 3D object recognition can be applied effectively to distinguish structural differences between proteins. The proposed approach can provide hints of structural groups in a large set of unannotated BcR IG protein files, both in CLL and, by logical extension, in other contexts where characterizing BcR IG structural similarity is relevant. The method presents no limitations in application and can be extended to other types of proteins.
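
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: compute pairwise distances per descriptor, combine them, cluster, and score the clusters against expert labels. Everything in it is an assumption made for illustration: the toy histograms, the L1 distance, the averaging of distance matrices, average-linkage clustering and the Rand index are stand-ins, not the descriptors, alignment step or evaluation measure used by the authors.

```python
from itertools import combinations

def l1_distance(a, b):
    """L1 distance between two equal-length descriptor histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(descriptors):
    """Pairwise L1 distances, scaled so the largest entry is 1."""
    n = len(descriptors)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = l1_distance(descriptors[i], descriptors[j])
    peak = max(max(row) for row in d) or 1.0
    return [[v / peak for v in row] for row in d]

def combine(matrices):
    """Average several normalised distance matrices entry by entry."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices)
             for j in range(n)] for i in range(n)]

def average_linkage(d, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(d))]

    def gap(a, b):
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]   # a < b, so deleting b keeps a valid
        del clusters[b]
    labels = [0] * len(d)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def rand_index(predicted, expert):
    """Fraction of item pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(expert)), 2))
    agree = sum((predicted[i] == predicted[j]) == (expert[i] == expert[j])
                for i, j in pairs)
    return agree / len(pairs)

# Hypothetical toy data: two descriptor types for four structures.
shape_hist = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
curvature_hist = [[0.5, 0.5], [0.6, 0.4], [0.1, 0.9], [0.0, 1.0]]
expert_labels = ["A", "A", "B", "B"]

combined = combine([distance_matrix(shape_hist), distance_matrix(curvature_hist)])
labels = average_linkage(combined, k=2)
score = rand_index(labels, expert_labels)
```

Scaling each matrix before averaging keeps one descriptor from dominating the combination just because its raw distances are larger. Other weightings would be equally defensible.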


