A chromosome-scale high-contiguity genome assembly of the cheetah (Acinonyx jubatus)
Language English
Country United States
Media print
Document type Journal Article, Research Support, Non-U.S. Gov't
Grant support
I 5081
Austrian Science Fund FWF - Austria
I5081-B/ GACR
Austrian Science Fund FWF - Austria
PubMed
36869783
PubMed Central
PMC10212127
DOI
10.1093/jhered/esad015
PII: 7069302
- Keywords
- Felidae, Hi-C, PacBio, conservation genomics, proximity-ligation
- MeSH
- Acinonyx * genetics
- Molecular Sequence Annotation
- Chromosomes genetics
- Phylogeny
- Genome
- Genomics
- Animals
- Check Tag
- Animals
- Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
The cheetah (Acinonyx jubatus, Schreber 1775) is a large felid and is considered the fastest land animal. Historically, it inhabited open grassland across Africa, the Arabian Peninsula, and southwestern Asia; however, only small and fragmented populations remain today. Here, we present a de novo genome assembly of the cheetah based on PacBio continuous long reads and Hi-C proximity ligation data. The final assembly (VMU_Ajub_asm_v1.0) has a total length of 2.38 Gb, of which 99.7% is anchored into the expected 19 chromosome-scale scaffolds. Contig and scaffold N50 values of 96.8 Mb and 144.4 Mb, respectively, a BUSCO completeness of 95.4%, and a k-mer completeness of 98.4% emphasize the high quality of the assembly. Furthermore, annotation of the assembly identified 23,622 genes and a repeat content of 40.4%. This new highly contiguous, chromosome-scale assembly will greatly benefit conservation and evolutionary genomic analyses and will be a valuable resource, for example for gaining a detailed understanding of the function and diversity of immune response genes in felids.
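The contiguity figures in the abstract (contig and scaffold N50, and the share of the assembly anchored into the 19 largest scaffolds) follow standard definitions. The sketch below shows one common way to compute such statistics from a list of sequence lengths. It is an illustration only and not the authors' pipeline; the function names `n50` and `anchored_fraction` and the toy lengths are assumptions made for this example.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid floating-point division.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def anchored_fraction(lengths: Iterable[int], n_chromosomes: int) -> float:
    """Fraction of the total length held by the n_chromosomes longest
    sequences (hypothetical helper; assumes the longest scaffolds are
    the chromosome-scale ones)."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:n_chromosomes]) / sum(ordered)


if __name__ == "__main__":
    # Toy scaffold lengths in arbitrary units, not data from the assembly.
    toy = [50, 40, 30, 20, 10]
    print("N50:", n50(toy))
    print("Fraction in 2 longest:", anchored_fraction(toy, 2))
```

For a real assembly, the lengths would typically be read from a FASTA index or from the report of an assembly QC tool, with `n_chromosomes` set to the expected chromosome count.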
Central European Institute of Technology, University of Veterinary Sciences Brno, Brno, Czech Republic
Department of Animal Genetics, University of Veterinary Sciences Brno, Czech Republic
Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
Konrad Lorenz Institute of Ethology, University of Veterinary Medicine Vienna, Vienna, Austria
LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany
Natural History Museum Vienna, Central Research Laboratories, Vienna, Austria
Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna, Vienna, Austria
South African National Biodiversity Institute, National Zoological Garden, Pretoria, South Africa
Dryad
10.5061/dryad.xksn02vkr