PhyloFisher: A phylogenomic package for resolving eukaryotic relationships

2021 Aug; 19 (8): e3001365. [epub] 20210806

Language: English. Country: United States. Medium: electronic-ecollection

Document type: evaluation study, journal article, grant-supported research, Research Support, U.S. Gov't, Non-P.H.S.

Persistent link   https://www.medvik.cz/link/pmid34358228

Phylogenomic analyses of hundreds of protein-coding genes aimed at resolving phylogenetic relationships are now common practice. However, no software currently exists that includes tools for dataset construction and subsequent analysis with diverse validation strategies to assess robustness. Furthermore, there are no publicly available high-quality curated databases designed to assess deep (>100 million years) relationships in the tree of eukaryotes. To address these issues, we developed an easy-to-use software package, PhyloFisher (https://github.com/TheBrownLab/PhyloFisher), written in Python 3. PhyloFisher includes a manually curated database of 240 protein-coding genes from 304 eukaryotic taxa covering known eukaryotic diversity, a novel tool for ortholog selection, and utilities that perform the diverse analyses required by state-of-the-art phylogenomic investigations. Through phylogenetic reconstructions of the tree of eukaryotes and of the Saccharomycetaceae clade of budding yeasts, we demonstrate the utility of the PhyloFisher workflow and the provided starting database for addressing phylogenetic questions across a large range of evolutionary time points for diverse groups of organisms. We also demonstrate that undetected paralogy can remain in phylogenomic "single-copy orthogroup" datasets constructed using widely accepted methods such as all-vs-all BLAST searches followed by Markov Cluster Algorithm (MCL) clustering and application of automated tree pruning algorithms. Finally, we show how the PhyloFisher workflow helps detect inadvertent paralog inclusions, allowing the user to make more informed decisions regarding orthology assignments and leading to a more accurate final dataset.
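The point about undetected paralogy in "single-copy orthogroup" datasets can be illustrated with a toy sketch. This is not PhyloFisher code and does not use its API; the `Taxon|gene` identifier convention, the function names, and the example cluster are all hypothetical. It shows one plausible way paralogs can slip through: after an ancient duplication followed by differential loss, every taxon keeps exactly one copy, so a one-sequence-per-taxon filter passes even though the cluster mixes two paralogous copies.

```python
from collections import Counter


def taxon_of(seq_id):
    """Return the taxon part of an ID, assuming a hypothetical 'Taxon|gene' format."""
    return seq_id.split("|", 1)[0]


def is_single_copy(orthogroup):
    """True if every taxon contributes exactly one sequence to the cluster."""
    counts = Counter(taxon_of(seq_id) for seq_id in orthogroup)
    return all(n == 1 for n in counts.values())


# Toy cluster: an ancient duplication produced copies A and B,
# and each taxon later lost one of the two (differential loss).
cluster = ["Yeast1|geneA", "Yeast2|geneB", "Yeast3|geneA", "Yeast4|geneB"]

# The true copy of each sequence is known only because this is a toy example.
true_copy = {seq_id: seq_id[-1] for seq_id in cluster}

passes_filter = is_single_copy(cluster)
hidden_paralogs = len(set(true_copy.values())) > 1

print(passes_filter, hidden_paralogs)
```

In this sketch the copy-number filter alone cannot distinguish the two copies; separating them would require inspecting the gene tree, which is the kind of check the abstract attributes to the PhyloFisher workflow.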



Latest 20 citations...

Mapping the metagenomic diversity of the multi-kingdom glacier-fed stream microbiome

2025 Jan; 10 (1): 217-230. [epub] 20250102

Reconstructing the last common ancestor of all eukaryotes

2024 Nov; 22 (11): e3002917. [epub] 20241125

Expanded gene and taxon sampling of diplomonads shows multiple switches to parasitic and free-living lifestyle

2024 Sep 27; 22 (1): 217. [epub] 20240927

Comparative genomics of Ascetosporea gives new insight into the evolutionary basis for animal parasitism in Rhizaria

2024 May 03; 22 (1): 103. [epub] 20240503

New plastids, old proteins: repeated endosymbiotic acquisitions in kareniacean dinoflagellates

2024 Apr; 25 (4): 1859-1885. [epub] 20240318

Encyclopedia of Family A DNA Polymerases Localized in Organelles: Evolutionary Contribution of Bacteria Including the Proto-Mitochondrion

2024 Feb 01; 41 (2).

Mitochondrial genomes revisited: why do different lineages retain different genes?

2024 Jan 25; 22 (1): 15. [epub] 20240125

Create, Analyze, and Visualize Phylogenomic Datasets Using PhyloFisher

2024 Jan; 4 (1): e969.

Genomics of Preaxostyla Flagellates Illuminates the Path Towards the Loss of Mitochondria

2023 Dec; 19 (12): e1011050. [epub] 20231207

Lessons from the deep: mechanisms behind diversification of eukaryotic protein complexes

2023 Dec; 98 (6): 1910-1927. [epub] 20230619

A Novel Group of Dynamin-Related Proteins Shared by Eukaryotes and Giant Viruses Is Able to Remodel Mitochondria From Within the Matrix

2023 Jun 01; 40 (6).

Reconstruction of Plastid Proteomes of Apicomplexans and Close Relatives Reveals the Major Evolutionary Outcomes of Cryptic Plastids

2023 Jan 04; 40 (1).

Evidence for an Independent Hydrogenosome-to-Mitosome Transition in the CL3 Lineage of Fornicates

2022; 13: 866459. [epub] 20220519

An Enigmatic Stramenopile Sheds Light on Early Evolution in Ochrophyta Plastid Organellogenesis

2022 Apr 11; 39 (4).

Phylogenetic profiling and cellular analyses of ARL16 reveal roles in traffic of IFT140 and INPP5E

2022 Apr 01; 33 (4): ar33. [epub] 20220223

Gregarine single-cell transcriptomics reveals differential mitochondrial remodeling and adaptation in apicomplexans

2021 Apr 16; 19 (1): 77. [epub] 20210416
