Gaining Insight into Large Gene Families with the Aid of Bioinformatic Tools
Language: English; Country: United States; Medium: print
Document type: review, journal article, grant-supported work
- Keywords
- Formins, Molecular phylogenetics, Multidomain proteins, Multiple sequence alignment, Phylogenetic tree construction, Splicing prediction, Transcriptome analysis
- MeSH
- Actins * metabolism MeSH
- Formins metabolism MeSH
- Plant Proteins * metabolism MeSH
- Protein Structure, Tertiary MeSH
- Computational Biology MeSH
- Publication type
- Journal Article MeSH
- Grant-supported work MeSH
- Review MeSH
- Substance names
- Actins * MeSH
- Formins MeSH
- Plant Proteins * MeSH
Proteins participating in plant cell morphogenesis are often encoded by large gene families, some of which comprise paralogs with variable (modular) domain organization. One example is the formin (FH2 protein) family of actin nucleators, whose members can also have additional functions. Unravelling the phylogeny of such a complex gene family poses a number of specific challenges but may be crucial for predicting protein function and for experimental design. Here we present an overview of our "cottage industry" semi-manual bioinformatic approach, based mostly, though not exclusively, on freely available software tools, which we used to gain insight into the evolutionary history of plant FH2 proteins and some other components of the plant cell morphogenesis apparatus.