A parallel genetic algorithm for single class pattern classification and its application for gene expression profiling in Streptomyces coelicolor

2007 Feb 13; 8: 49. [epub] 2007 Feb 13

Language: English. Country: England, United Kingdom. Medium: electronic

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid17298664

BACKGROUND: Identifying genes that are coordinately regulated, according to their expression levels over the time course of a process, makes it possible to discover functional relationships among the genes involved in that process.

RESULTS: We present a single class classification method for identifying genes of similar function from a gene expression time series. It is based on a parallel genetic algorithm, a supervised learning method that exploits prior knowledge of gene function to identify unknown genes of similar function from expression data. The algorithm was tested on a set of randomly generated patterns, and the results were compared with those of seven other classification algorithms, including support vector machines. The algorithm avoids several problems associated with unsupervised clustering methods, and it shows better performance than the other algorithms. The algorithm was applied to the identification of secondary metabolite gene clusters of the antibiotic-producing eubacterium Streptomyces coelicolor. It also identified pathways associated with transport of the secondary metabolites out of the cell. We used the method to predict the functional role of particular ORFs from the expression data.

CONCLUSION: Through analysis of a time series of gene expression, the algorithm identifies pathways that are directly or indirectly associated with genes of interest and that are active during the time course of the experiment.
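The abstract does not describe the algorithm's internals, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes an island-model genetic algorithm (several subpopulations with ring migration) that evolves a prototype expression profile from known class members only, and a hypothetical acceptance rule based on Euclidean distance to that prototype. All function names, parameters, and the toy profiles are assumptions made for illustration.

```python
import random

def distance(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fitness(candidate, positives):
    """Higher is better: negative mean distance to the known class members."""
    return -sum(distance(candidate, p) for p in positives) / len(positives)

def evolve(population, positives, rng, generations=30, sigma=0.1):
    """Evolve one island in place: keep the fitter half, refill with offspring."""
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, positives), reverse=True)
        survivors = population[: len(population) // 2]
        children = []
        while len(survivors) + len(children) < len(population):
            a, b = rng.sample(survivors, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            children.append([rng.choice(pair) + rng.gauss(0, sigma)
                             for pair in zip(a, b)])
        population[:] = survivors + children

def island_ga(positives, n_islands=4, pop_size=20, epochs=5, seed=0):
    """Island-model GA; returns the best prototype found across all islands."""
    rng = random.Random(seed)
    n_points = len(positives[0])
    islands = [[[rng.uniform(0, 1) for _ in range(n_points)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    key = lambda c: fitness(c, positives)
    for _ in range(epochs):
        # Islands run one after another here; in a parallel setting each
        # island could be evolved in its own process between migrations.
        for island in islands:
            evolve(island, positives, rng)
        bests = [max(island, key=key) for island in islands]
        for i, island in enumerate(islands):
            island.sort(key=key, reverse=True)
            # Ring migration: the neighbour's best replaces this island's worst.
            island[-1] = list(bests[i - 1])
    return max((c for island in islands for c in island), key=key)

def classify(profile, prototype, threshold):
    """Single class decision: accept profiles close enough to the prototype."""
    return distance(profile, prototype) <= threshold

# Toy training set (made up): three rising time-series profiles of one class.
positives = [
    [0.1, 0.3, 0.5, 0.7, 0.9],
    [0.0, 0.2, 0.5, 0.8, 1.0],
    [0.2, 0.3, 0.5, 0.6, 0.8],
]
prototype = island_ga(positives)
# Assumed rule: accept anything no farther than the farthest training profile.
threshold = max(distance(p, prototype) for p in positives)
```

With this toy data, `classify([0.9, 0.7, 0.5, 0.3, 0.1], prototype, threshold)` rejects a falling profile while every training profile is accepted. The threshold rule is the simplest choice that needs no negative examples; a real application would likely tune it on held-out data.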


