Unraveling the palindromic and nonpalindromic motifs of retroviral integration site sequences by statistical mixture models
Language: English. Country: United States. Medium: print-electronic
Document type: journal article, grant-supported work
PubMed: 37463751
PubMed Central: PMC10547254
DOI: 10.1101/gr.277694.123
PII: gr.277694.123
- MeSH
- virus integration * genetics MeSH
- nucleotide motifs MeSH
- nucleotides * MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
- Substance names
- nucleotides * MeSH
A weak palindromic nucleotide motif is the hallmark of retroviral integration site alignments. Given that the majority of target sequences are not palindromic, the current model explains the symmetry by an overlap of the nonpalindromic motif present on one of the half-sites of the sequences. Here, we show that the implementation of multicomponent mixture models allows for different interpretations consistent with the existence of both palindromic and nonpalindromic submotifs in the sets of integration site sequences. We further show that the weak palindromic motifs result from freely combined site-specific submotifs restricted to only a few positions proximal to the site of integration. The submotifs are formed by either palindrome-forming nucleotide preference or nucleotide exclusion. Using the mixture models, we also identify HIV-1-favored palindromic sequences in Alu repeats serving as local hotspots for integration. The application of the novel statistical approach provides deeper insight into the selection of retroviral integration sites and may prove to be a valuable tool in the analysis of any type of DNA motifs.
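The "current model" mentioned in the abstract, in which a nonpalindromic motif on one half-site produces a palindromic alignment, can be illustrated with a minimal sketch. It is not the authors' mixture model: the toy sequences and all function names below are hypothetical and chosen only for illustration. It shows that pooling nonpalindromic sites with their reverse complements (that is, ignoring integration orientation) gives position-wise nucleotide counts that are reverse-complement symmetric, even though no single sequence is a palindrome.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """A DNA palindrome equals its own reverse complement."""
    return seq == reverse_complement(seq)


def position_counts(seqs):
    """Count nucleotides at each alignment position (equal-length input)."""
    length = len(seqs[0])
    return [Counter(s[i] for s in seqs) for i in range(length)]


def counts_are_palindromic(counts):
    """Check that position i mirrors position n-1-i under complementation."""
    n = len(counts)
    return all(
        counts[i][base] == counts[n - 1 - i][COMPLEMENT[base]]
        for i in range(n)
        for base in "ACGT"
    )


# Hypothetical toy target sites carrying a motif on one half-site only.
one_strand = ["GTAACC", "GTTACA", "GTAAGC", "GAAACT"]

# Pool each site with its reverse complement, mimicking an alignment in
# which the orientation of the integration event is not distinguished.
pooled = one_strand + [reverse_complement(s) for s in one_strand]

print(any(is_palindromic(s) for s in pooled))
print(counts_are_palindromic(position_counts(one_strand)))
print(counts_are_palindromic(position_counts(pooled)))
```

An aggregate alignment of this kind cannot by itself distinguish a truly palindromic submotif from an overlap of nonpalindromic ones, which is the ambiguity the abstract's multicomponent mixture models are meant to resolve.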