Create, Analyze, and Visualize Phylogenomic Datasets Using PhyloFisher

2024 Jan; 4(1): e969.

Language: English. Country: United States. Medium: print

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid38265166

Grant support
R01 AI153356 NIAID NIH HHS - United States

PhyloFisher is a software package written primarily in Python3 that can be used for the creation, analysis, and visualization of phylogenomic datasets that consist of protein sequences from eukaryotic organisms. Unlike many existing phylogenomic pipelines, PhyloFisher comes with a manually curated database of 240 protein-coding genes, a subset of a previous phylogenetic dataset sampled from 304 eukaryotic taxa. The software package can also utilize a user-created database of eukaryotic proteins, which may be more appropriate for shallow evolutionary questions. PhyloFisher is also equipped with a set of utilities to aid in running routine analyses, such as the prediction of alternative genetic codes, removal of genes and/or taxa based on occupancy/completeness of the dataset, testing for amino acid compositional heterogeneity among sequences, removal of heterotachious and/or fast-evolving sites, removal of fast-evolving taxa, supermatrix creation from randomly resampled genes, and supermatrix creation from nucleotide sequences. © 2024 Wiley Periodicals LLC.

Basic Protocol 1: Constructing a phylogenomic dataset
Basic Protocol 2: Performing phylogenomic analyses
Support Protocol 1: Installing PhyloFisher
Support Protocol 2: Creating a custom phylogenomic database
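To make the idea of removing genes and/or taxa based on occupancy/completeness more concrete, the following is a minimal Python sketch of one way such a filter could work. It is not PhyloFisher's own code or command-line interface: the function name `occupancy_filter`, its parameters, the thresholds, and the toy data are all hypothetical and chosen only for illustration.

```python
def occupancy_filter(presence, min_gene_occupancy=0.5, min_taxon_completeness=0.5):
    """Filter a dataset by gene occupancy and taxon completeness.

    `presence` maps a gene name to the set of taxa that have a sequence
    for that gene. This is a hypothetical illustration, not PhyloFisher's API.
    """
    # All taxa that appear in at least one gene.
    taxa = set().union(*presence.values()) if presence else set()
    n_taxa = len(taxa)

    # Gene occupancy: fraction of all taxa that have a sequence for the gene.
    kept_genes = {
        gene: members
        for gene, members in presence.items()
        if n_taxa and len(members) / n_taxa >= min_gene_occupancy
    }
    n_genes = len(kept_genes)

    # Taxon completeness: fraction of the retained genes present in the taxon.
    kept_taxa = {
        taxon
        for taxon in taxa
        if n_genes
        and sum(taxon in members for members in kept_genes.values()) / n_genes
        >= min_taxon_completeness
    }

    # Restrict each retained gene to the retained taxa.
    filtered = {gene: members & kept_taxa for gene, members in kept_genes.items()}
    return filtered, kept_taxa


# Toy data (invented): three genes sampled across four taxa.
toy = {
    "geneA": {"t1", "t2", "t3", "t4"},
    "geneB": {"t1", "t2", "t3"},
    "geneC": {"t1"},
}
filtered, kept = occupancy_filter(toy, min_gene_occupancy=0.5,
                                  min_taxon_completeness=0.6)
```

In this sketch genes are filtered first and taxon completeness is then computed over the retained genes only; the opposite order is equally plausible and can give different results, so the order is a design choice rather than a fixed rule.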

The most up-to-date documentation is available on the PhyloFisher GitHub Pages website: https://thebrownlab.github.io/phylofisher-pages/
