Searching protein 3-D structures for optimal structure alignment using intelligent algorithms and data structures
T. Novosád, V. Snášel, A. Abraham, JY. Yang
Language: English; Country: United States of America
Document type: journal articles
- MeSH
- algorithms MeSH
- data mining methods MeSH
- databases, protein MeSH
- humans MeSH
- proteins chemistry MeSH
- reproducibility of results MeSH
- structural homology, protein MeSH
- protein structure, tertiary MeSH
- artificial intelligence MeSH
- computational biology methods MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
In this paper, we present a novel algorithm for measuring protein similarity based on 3-D structure (protein tertiary structure). The algorithm uses a suffix tree to discover common parts of the main chains of all proteins in the current Research Collaboratory for Structural Bioinformatics Protein Data Bank (PDB). From these common parts, we build a vector model and apply classical information retrieval (IR) algorithms based on the vector model to measure the similarity between proteins (all-to-all protein similarity). To calculate protein similarity, we use the term frequency × inverse document frequency (tf × idf) term weighting scheme and the cosine similarity measure. The goal of this paper is to introduce a new protein similarity metric based on suffix trees and IR methods. The entire current PDB database was used to demonstrate the very good time complexity of the algorithm as well as its high precision. We chose the Structural Classification of Proteins (SCOP) database to verify the precision of our algorithm because it is maintained primarily by humans. A further benefit of this work would be the ability to determine, with nearly 100% precision, the SCOP categories of proteins not included in the latest version of the SCOP database (v. 1.75).
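The pipeline outlined in the abstract (shared chain fragments as terms, tf × idf weighting, cosine similarity) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each main chain has already been encoded as a string of symbols (the encoding is not given in this record), and it swaps the suffix tree for naive substring enumeration, which is only practical for toy inputs. All function names, the `min_len` parameter, and the sample strings are hypothetical.

```python
import math


def count_occurrences(seq, term):
    """Count occurrences of term in seq, allowing overlaps."""
    return sum(1 for i in range(len(seq) - len(term) + 1) if seq.startswith(term, i))


def common_terms(chains, min_len=3):
    """Return substrings of length >= min_len shared by at least two chains.

    Naive enumeration stands in for the suffix tree named in the abstract.
    """
    owners = {}
    for idx, seq in enumerate(chains):
        for i in range(len(seq)):
            for j in range(i + min_len, len(seq) + 1):
                owners.setdefault(seq[i:j], set()).add(idx)
    return sorted(t for t, docs in owners.items() if len(docs) >= 2)


def tfidf_vectors(chains, terms):
    """Build one tf x idf vector per chain over the shared terms."""
    n = len(chains)
    tf = [[count_occurrences(seq, t) for t in terms] for seq in chains]
    df = [sum(1 for row in tf if row[k] > 0) for k in range(len(terms))]
    # Assumed idf variant: log(N / df); a term present in every chain gets weight 0.
    idf = [math.log(n / d) if d else 0.0 for d in df]
    return [[row[k] * idf[k] for k in range(len(terms))] for row in tf]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(chains, min_len=3):
    """All-to-all similarity between the encoded chains."""
    vectors = tfidf_vectors(chains, common_terms(chains, min_len))
    return [[cosine(u, v) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Hypothetical encoded chains, not real PDB data.
    toy_chains = ["ABCDABCD", "ABCDXYZ", "QRSXYZQ"]
    for row in similarity_matrix(toy_chains):
        print(["%.3f" % value for value in row])
```

In this toy input the first and third chains share no fragment of length three or more, so their similarity is zero, while the second chain overlaps with both. A suffix tree matters at PDB scale because it finds shared fragments without enumerating every substring of every chain.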
- 000
- 00000naa a2200000 a 4500
- 001
- bmc12026773
- 003
- CZ-PrNML
- 005
- 20160307160204.0
- 007
- ta
- 008
- 120816s2010 xxu f 000 0#eng||
- 009
- AR
- 024 7_
- $a 10.1109/titb.2010.2079939 $2 doi
- 035 __
- $a (PubMed)20876026
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a xxu
- 100 1_
- $a Novosád, Tomáš $u Department of Computer Science, Vysoká Škola Báňská - Technical University of Ostrava, Ostrava 70833, Czech Republic. tomas.novosad@vsb.cz $7 _AN086321
- 245 10
- $a Searching protein 3-D structures for optimal structure alignment using intelligent algorithms and data structures / $c T. Novosád, V. Snášel, A. Abraham, JY. Yang
- 520 9_
- $a In this paper, we present a novel algorithm for measuring protein similarity based on 3-D structure (protein tertiary structure). The algorithm uses a suffix tree to discover common parts of the main chains of all proteins in the current Research Collaboratory for Structural Bioinformatics Protein Data Bank (PDB). From these common parts, we build a vector model and apply classical information retrieval (IR) algorithms based on the vector model to measure the similarity between proteins (all-to-all protein similarity). To calculate protein similarity, we use the term frequency × inverse document frequency (tf × idf) term weighting scheme and the cosine similarity measure. The goal of this paper is to introduce a new protein similarity metric based on suffix trees and IR methods. The entire current PDB database was used to demonstrate the very good time complexity of the algorithm as well as its high precision. We chose the Structural Classification of Proteins (SCOP) database to verify the precision of our algorithm because it is maintained primarily by humans. A further benefit of this work would be the ability to determine, with nearly 100% precision, the SCOP categories of proteins not included in the latest version of the SCOP database (v. 1.75).
- 650 _2
- $a algorithms $7 D000465
- 650 _2
- $a artificial intelligence $7 D001185
- 650 _2
- $a computational biology $x methods $7 D019295
- 650 _2
- $a data mining $x methods $7 D057225
- 650 _2
- $a databases, protein $7 D030562
- 650 _2
- $a humans $7 D006801
- 650 _2
- $a protein structure, tertiary $7 D017434
- 650 _2
- $a proteins $x chemistry $7 D011506
- 650 _2
- $a reproducibility of results $7 D015203
- 650 _2
- $a structural homology, protein $7 D040681
- 655 _2
- $a journal article $7 D016428
- 700 1_
- $a Snášel, Václav $7 _AN086319
- 700 1_
- $a Abraham, Ajith $7 gn_A_00000694
- 700 1_
- $a Yang, Jack Y
- 773 0_
- $w MED00005198 $t IEEE transactions on information technology in biomedicine a publication of the IEEE Engineering in Medicine and Biology Society $x 1558-0032 $g Vol. 14, No. 6 (2010), pp. 1378-1386
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/20876026 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y m $z 0
- 990 __
- $a 20120816 $b ABA008
- 991 __
- $a 20160307160218 $b ABA008
- 999 __
- $a ok $b bmc $g 948815 $s 784119
- BAS __
- $a 3
- BAS __
- $a PreBMC
- BMC __
- $a 2010 $b 14 $c 6 $d 1378-1386 $e 20100927 $i 1558-0032 $m IEEE transactions on information technology in biomedicine $n IEEE Trans Inf Technol Biomed $x MED00005198
- LZP __
- $b NLK113 $a Pubmed-20120816/11/01