Create, Analyze, and Visualize Phylogenomic Datasets Using PhyloFisher
Language: English. Country: United States. Medium: print
Document type: journal article
Grant support
R01 AI153356
NIAID NIH HHS - United States
PubMed
38265166
PubMed Central
PMC11491051
DOI
10.1002/cpz1.969
- Keywords
- evolution, genomics, systematics, transcriptomics
- MeSH
- amino acids * MeSH
- biological evolution * MeSH
- phylogeny MeSH
- culture MeSH
- amino acid sequence MeSH
- Publication type
- journal article MeSH
- Substance names
- amino acids * MeSH
PhyloFisher is a software package written primarily in Python 3 that can be used for the creation, analysis, and visualization of phylogenomic datasets that consist of protein sequences from eukaryotic organisms. Unlike many existing phylogenomic pipelines, PhyloFisher comes with a manually curated database of 240 protein-coding genes, a subset of a previous phylogenetic dataset sampled from 304 eukaryotic taxa. The software package can also use a user-created database of eukaryotic proteins, which may be more appropriate for shallow evolutionary questions. PhyloFisher is also equipped with a set of utilities to aid in running routine analyses, such as the prediction of alternative genetic codes, removal of genes and/or taxa based on occupancy/completeness of the dataset, testing for amino acid compositional heterogeneity among sequences, removal of heterotachious and/or fast-evolving sites, removal of fast-evolving taxa, supermatrix creation from randomly resampled genes, and supermatrix creation from nucleotide sequences. © 2024 Wiley Periodicals LLC.
Basic Protocol 1: Constructing a phylogenomic dataset
Basic Protocol 2: Performing phylogenomic analyses
Support Protocol 1: Installing PhyloFisher
Support Protocol 2: Creating a custom phylogenomic database
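The abstract mentions removing genes and/or taxa based on occupancy (completeness) of the dataset. The following is a minimal conceptual sketch of that idea in plain Python, not PhyloFisher's own implementation or command-line interface. The function names, the threshold parameters, the order of filtering (taxa first, then genes), and the example taxon and gene names are all assumptions made for illustration.

```python
from typing import Dict, Set


def taxon_occupancy(gene_to_taxa: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, for each taxon, the fraction of genes in which it is present."""
    n_genes = len(gene_to_taxa)
    if n_genes == 0:
        return {}
    all_taxa = set().union(*gene_to_taxa.values())
    return {
        taxon: sum(taxon in taxa for taxa in gene_to_taxa.values()) / n_genes
        for taxon in all_taxa
    }


def filter_by_occupancy(
    gene_to_taxa: Dict[str, Set[str]],
    min_taxon_occupancy: float,
    min_gene_occupancy: float,
) -> Dict[str, Set[str]]:
    """Drop sparse taxa first, then drop genes that are sparse among kept taxa.

    Hypothetical helper: the two-step order is an assumption of this sketch.
    """
    occupancy = taxon_occupancy(gene_to_taxa)
    kept_taxa = {t for t, occ in occupancy.items() if occ >= min_taxon_occupancy}
    if not kept_taxa:
        return {}

    filtered: Dict[str, Set[str]] = {}
    for gene, taxa in gene_to_taxa.items():
        present = taxa & kept_taxa
        # Gene occupancy is measured against the taxa that survived step one.
        if len(present) / len(kept_taxa) >= min_gene_occupancy:
            filtered[gene] = present
    return filtered


# Hypothetical toy dataset: gene name -> set of taxa with a sequence for it.
dataset = {
    "geneA": {"Homo", "Mus", "Danio"},
    "geneB": {"Homo", "Mus"},
    "geneC": {"Homo"},
    "geneD": {"Homo", "Mus", "Danio", "Rare"},
}

result = filter_by_occupancy(dataset, min_taxon_occupancy=0.5, min_gene_occupancy=0.6)
for gene in sorted(result):
    print(gene, sorted(result[gene]))
```

In this toy dataset the taxon "Rare" appears in only one of four genes and is dropped, after which "geneC" (present in one of the three remaining taxa) falls below the gene threshold. Real datasets would derive the gene-to-taxa mapping from alignment files rather than a hand-written dictionary.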
Department of Biological Sciences Mississippi State University Starkville Mississippi USA
Department of Biological Sciences Texas Tech University Lubbock Texas USA
Department of Biology and Ecology Faculty of Science University of Ostrava Ostrava Czech Republic
Department of Zoology Faculty of Science Charles University Prague Czech Republic
Faculty of Science University of South Bohemia České Budějovice Czech Republic
Institute of Insect Sciences Zhejiang University Hangzhou China
Unité d'Ecologie Systématique et Evolution CNRS Université Paris Saclay France Orsay France
https://thebrownlab.github.io/phylofisher-pages/ - The PhyloFisher GitHub Pages website, which hosts the most up-to-date documentation.