Prospects of telomere-to-telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome

2022 Jul; 20(7): 1373-1386. [epub] 20220407

Language: English. Country: England, United Kingdom. Medium: print-electronic

Document type: journal article, grant-supported research

Persistent link   https://www.medvik.cz/link/pmid35338551

The first gapless, telomere-to-telomere (T2T) sequence assemblies of plant chromosomes were reported recently. However, sequence assemblies of most plant genomes remain fragmented. Only recent breakthroughs in accurate long-read sequencing have made it possible to achieve highly contiguous assemblies with a few tens of contigs per chromosome, a number small enough to allow a systematic inquiry into the causes of the remaining sequence gaps and into the approaches and resources needed to close them. Here, we analyse sequence gaps in the current reference genome sequence of barley cv. Morex (MorexV3). Optical map and raw sequence data, complemented by ChIP-seq data for the centromeric histone variant CENH3, were used to estimate the abundance of centromeric, ribosomal DNA, and subtelomeric repeats in the barley genome. These estimates were compared with copy numbers in the MorexV3 pseudomolecule sequence. We found that almost all centromeric sequences and 45S ribosomal DNA repeat arrays were absent from the MorexV3 pseudomolecules and that the majority of sequence gaps can be attributed to assembly breakdown in long stretches of satellite repeats. However, missing sequences cannot fully account for the difference between assembly size and flow cytometric genome size estimates. We discuss the prospects of gap closure with ultra-long sequence reads.
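The comparison described in the abstract, between repeat abundance estimated from raw data and copy numbers found in the assembly, can be illustrated with a small depth-normalisation calculation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the repeat-derived read bases are assumed to have been counted beforehand (for example by aligning reads to a repeat unit), and all numbers in the usage example are invented for illustration.

```python
def estimate_copy_number(repeat_bases_in_reads, total_read_bases,
                         genome_size_bp, unit_length_bp):
    """Estimate the genomic copy number of a tandem repeat from raw reads.

    Assumes reads sample the genome uniformly, so that the number of
    repeat-derived read bases divided by the mean sequencing depth
    approximates the number of repeat bases in one genome copy.
    """
    if min(total_read_bases, genome_size_bp, unit_length_bp) <= 0:
        raise ValueError("sizes and lengths must be positive")
    depth = total_read_bases / genome_size_bp        # mean coverage
    repeat_bp_in_genome = repeat_bases_in_reads / depth
    return repeat_bp_in_genome / unit_length_bp


def missing_fraction(estimated_copies, assembled_copies):
    """Fraction of the estimated copies that is absent from the assembly."""
    if estimated_copies <= 0:
        raise ValueError("estimated_copies must be positive")
    # Clamp at zero: an assembly may contain more copies than estimated.
    return max(0.0, 1.0 - assembled_copies / estimated_copies)


if __name__ == "__main__":
    # Invented illustrative values, not taken from the paper.
    copies = estimate_copy_number(
        repeat_bases_in_reads=20e6,   # read bases assigned to the repeat
        total_read_bases=100e9,       # all sequenced bases
        genome_size_bp=5e9,           # assumed genome size
        unit_length_bp=100,           # assumed repeat unit length
    )
    print(f"estimated copies: {copies:.0f}")
    print(f"missing from assembly: {missing_fraction(copies, 500):.1%}")
```

The estimate depends directly on the genome size used to compute the depth, so an error in that value carries over proportionally to the copy number.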



