High-Resolution Assembly of the Human Y Chromosome Identifies a Vast Landscape of Inverted Repeats Associated with Structural and Functional Genomic Features
Language English Country Switzerland Media electronic
Document type journal article
Grant support
CZ.10.03.01/00/22_003/0000003
European Union, LERCO project via Operational Programme Just Transition
22-21903S
Czech Science Foundation
PubMed
41155472
PubMed Central
PMC12563786
DOI
10.3390/ijms262010180
PII: ijms262010180
- Keywords
- T2T, bioinformatics, chromosome Y, human genome, inverted repeats, non-B DNA structures
- MeSH
- Molecular Sequence Annotation MeSH
- Exons MeSH
- Genome, Human * MeSH
- Genomics * methods MeSH
- Humans MeSH
- Chromosomes, Human, Y * genetics MeSH
- Inverted Repeat Sequences * genetics MeSH
- Telomere genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Recent advances in sequencing methods have led to major progress in gapless assemblies of the human genome. However, until mid-2023, the complete sequence of the Y chromosome remained elusive. While only a small percentage of the autosomal sequence was missing from the widely used reference assembly of the human genome (GRCh38), around 50% of the chromosome Y DNA sequence was unknown. Using a computational approach, we analyzed the presence of short inverted repeats in the current human reference genome (GRCh38) and in the Telomere-to-Telomere (T2T) assembly of chromosome Y. This analysis identified the location of the repeats in chromosome Y and highlighted their association with functionally annotated sequences. The comparison revealed notably more inverted repeats in the T2T assembly than in GRCh38. These are located abundantly around exons and mobile elements and, unexpectedly, also within gene annotations. The remarkable abundance of short inverted repeats around exons points to their importance in gene regulation, and their presence in regions associated with recombination suggests crucial roles in recombination processes. Interestingly, the most underestimated class is inverted repeats with a repeat length of 12-14, which are more than 20 times as frequent in the T2T assembly as in the human reference genome GRCh38. These findings indicate that the number of short inverted repeats was significantly underestimated in the current human reference genome (GRCh38). These previously unidentified sites are of great biomedical potential, as inverted repeats are precursors for the formation of cruciform DNA functional epitopes.
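To make the notion of a short inverted repeat concrete, the following is a minimal sketch of a naive scanner for perfect inverted repeats (a left arm followed, after an optional spacer, by its reverse complement). It is not the method used in the study; the function names, the parameters `min_len`, `max_len`, and `max_spacer`, and their default values are assumptions chosen for illustration only, and mismatches are not allowed.

```python
# Hypothetical helper names and default thresholds; illustration only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_len=6, max_len=30, max_spacer=10):
    """Return (start, arm_length, spacer_length) for perfect inverted repeats.

    A hit means seq[start:start + arm_length] is followed, after
    spacer_length bases, by its exact reverse complement.
    Nested (non-maximal) repeats are reported as separate hits.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n):
        for arm in range(min_len, max_len + 1):
            left = seq[start:start + arm]
            # Stop growing the arm at the sequence end or at a non-ACGT base.
            if len(left) < arm or set(left) - set("ACGT"):
                break
            target = reverse_complement(left)
            for spacer in range(max_spacer + 1):
                right_start = start + arm + spacer
                if right_start + arm > n:
                    break
                if seq[right_start:right_start + arm] == target:
                    hits.append((start, arm, spacer))
    return hits


if __name__ == "__main__":
    # Left arm AAGCTC, spacer TTT, right arm GAGCTT (reverse complement).
    example = "CCAAGCTCTTTGAGCTTCC"
    print(find_inverted_repeats(example, min_len=6, max_len=6, max_spacer=5))
```

This brute-force scan is only practical for short sequences; a chromosome-scale analysis would need an indexed or otherwise optimized search, and typically a tolerance for mismatches in the arms.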
Faculty of Chemistry Brno University of Technology Purkyňova 118 612 00 Brno Czech Republic
Institute of Biophysics Czech Academy of Sciences Královopolská 135 612 00 Brno Czech Republic
School of Biological Sciences University of East Anglia Norwich Research Park Norwich NR4 7TJ UK