Draft Genome Assembly of the Freshwater Apex Predator Wels Catfish (Silurus glanis) Using Linked-Read Sequencing
Language: English; Country: England, United Kingdom; Medium: electronic
Document type: journal article, grant-supported work
PubMed: 32917720
PubMed Central: PMC7642921
DOI: 10.1534/g3.120.401711
PII: g3.120.401711
- Keywords
- 10X Genomics Chromium linked-reads, Silurus glanis, de novo assembly, teleost, wels catfish, whole genome sequencing
- MeSH
- ecosystem * MeSH
- genome MeSH
- fresh water MeSH
- catfishes * genetics MeSH
- animals MeSH
- Check Tag
- female MeSH
- animals MeSH
- Publication type
- journal article MeSH
- grant-supported work MeSH
- Geographic names
- Europe MeSH
The wels catfish (Silurus glanis) is one of the largest freshwater fish species in the world. This top predator plays a key role in ecosystem stability and is an iconic trophy fish for recreational fishermen. S. glanis is also highly valued for its high-quality boneless flesh and has been cultivated for over 100 years in Eastern and Central Europe. Interest in rearing S. glanis continues to grow; aquaculture production of this species has almost doubled during the last decade. However, despite its high ecological, cultural and economic importance, the available genomic resources for S. glanis are very limited. To fill this gap, we report a de novo assembly and annotation of the whole genome sequence of a female S. glanis. Linked-read technology with 10X Genomics Chromium chemistry and the Supernova assembler produced a highly continuous draft genome of S. glanis: a ~0.8 Gb assembly (scaffold N50 = 3.2 Mb; longest individual scaffold = 13.9 Mb; BUSCO completeness = 84.2%), which included 313.3 Mb of putative repeated sequences. In total, 21,316 protein-coding genes were predicted, of which 96% were annotated functionally from either sequence homology or protein signature searches. The highly continuous genome assembly will be an invaluable resource for aquaculture genomics, genetics, conservation, and breeding research of S. glanis.
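For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally defined: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not data or code from the paper. The repeat fraction is likewise only a rough figure derived from the rounded values in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not values from the paper.
toy_scaffolds = [8, 5, 4, 2, 1]
print(n50(toy_scaffolds))

# Rough repeat fraction from the rounded abstract figures:
# 313.3 Mb of putative repeats in an assembly of about 0.8 Gb (800 Mb).
repeat_fraction = 313.3 / 800
print(f"{repeat_fraction:.1%}")
```

Because the assembly size is reported only approximately, the resulting repeat fraction of roughly 39% should be read as an estimate.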
Biodiversity Unit, University of Turku, 20014, Finland