Prospects of telomere-to-telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome
Language: English; Country: England, United Kingdom; Medium: print-electronic
Document type: journal article, grant-supported research
PubMed: 35338551
PubMed Central: PMC9241371
DOI: 10.1111/pbi.13816
- Keywords
- CenH3, Cereba, ChIP-seq, PacBio HiFi reads, flow cytometry, nanopore, ribosomal DNA, satellite, telomeric repeats
- MeSH
- Chromosomes, Plant / genetics MeSH
- Genome, Plant / genetics MeSH
- Hordeum * / genetics MeSH
- DNA, Ribosomal / genetics MeSH
- Sequence Analysis, DNA MeSH
- Telomere / genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support (grant-funded work) MeSH
- Substance names
- DNA, Ribosomal MeSH
The first gapless, telomere-to-telomere (T2T) sequence assemblies of plant chromosomes were reported recently. However, sequence assemblies of most plant genomes remain fragmented. Only recent breakthroughs in accurate long-read sequencing have made it possible to achieve highly contiguous sequence assemblies with a few tens of contigs per chromosome, that is, a number small enough to allow a systematic inquiry into the causes of the remaining sequence gaps and into the approaches and resources needed to close them. Here, we analyse sequence gaps in the current reference genome sequence of barley cv. Morex (MorexV3). Optical map and raw sequence data, complemented by ChIP-seq data for the centromeric histone variant CENH3, were used to estimate the abundance of centromeric, ribosomal DNA, and subtelomeric repeats in the barley genome. These estimates were compared with copy numbers in the MorexV3 pseudomolecule sequence. We found that almost all centromeric sequences and 45S ribosomal DNA repeat arrays were absent from the MorexV3 pseudomolecules and that the majority of sequence gaps can be attributed to assembly breakdown in long stretches of satellite repeats. However, missing sequences cannot fully account for the difference between assembly size and flow cytometric genome size estimates. We discuss the prospects of gap closure with ultra-long sequence reads.
Center for Integrated Breeding Research, Georg August University Göttingen, Göttingen, Germany
German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig, Leipzig, Germany
Institute of Experimental Botany of the Czech Academy of Sciences, Olomouc, Czech Republic
Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Seeland, Germany