Identification of methylation-sensitive human transcription factors using meSMiLE-seq
Status PubMed-not-MEDLINE Language English Country United States Medium electronic
Document type journal articles, preprints
Grant support
R01 HG013328
NHGRI NIH HHS - United States
P30 CA008748
NCI NIH HHS - United States
P30 AR070549
NIAMS NIH HHS - United States
U24 HG013078
NHGRI NIH HHS - United States
R01 AI173314
NIAID NIH HHS - United States
R01 AR073228
NIAMS NIH HHS - United States
R21 HG012258
NHGRI NIH HHS - United States
PubMed
39605503
PubMed Central
PMC11601298
DOI
10.1101/2024.11.11.619598
PII: 2024.11.11.619598
- Publication type
- journal articles MeSH
- preprints MeSH
Transcription factors (TFs) are key players in eukaryotic gene regulation, but the DNA binding specificity of many TFs remains unknown. Here, we assayed 284 mostly poorly characterized, putative human TFs using selective microfluidics-based ligand enrichment followed by sequencing (SMiLE-seq), revealing 72 new DNA binding motifs. To investigate whether some of the 158 TFs for which we did not find motifs preferably bind epigenetically modified DNA (i.e. methylated CG dinucleotides), we developed methylation-sensitive SMiLE-seq (meSMiLE-seq). This microfluidic assay simultaneously probes the affinity of a protein to methylated and unmethylated DNA, augmenting the capabilities of the original method to infer methylation-aware binding sites. We assayed 114 TFs with meSMiLE-seq and identified DNA-binding models for 48 proteins, including the known methylation-sensitive binding modes for POU5F1 and RFX5. For 11 TFs, binding to methylated DNA was preferred or resulted in the discovery of alternative, methylation-dependent motifs (e.g. PRDM13), while aversion towards methylated sequences was found for 13 TFs (e.g. USF3). Finally, we uncovered a potential role for ZHX2 as a putative binder of Z-DNA, a left-handed helical DNA structure which is adopted more frequently upon CpG methylation. Altogether, our study significantly expands the human TF codebook by identifying DNA binding motifs for 98 TFs, while providing a versatile platform to quantitatively assay the impact of DNA modifications on TF binding.
Department of Medical BioSciences Radboud University Medical Center 6500 HB Nijmegen The Netherlands
Institute of Organic Chemistry and Biochemistry Czech Academy of Sciences Prague Czech Republic
Institute of Protein Research Russian Academy of Sciences Pushchino Russia
Swiss Institute of Bioinformatics Lausanne Switzerland
University of Toronto Toronto Ontario Canada
Vavilov Institute of General Genetics Russian Academy of Sciences Moscow Russia