This record comes from PubMed

De novo emergence, existence, and demise of a protein-coding gene in murids

2022 Dec 08; 20(1): 272. [epub] 20221208

Language English
Country England, Great Britain
Media electronic

Document type Journal Article

Grant support
647403 H2020 European Research Council
RVO 68378050-KAV-NPUI Akademie Věd České Republiky
RVO 68378050 Akademie Věd České Republiky
LM2018129 Ministerstvo Školství, Mládeže a Tělovýchovy
CZ.02.1.01/0.0/0.0/18_046/0016045 Ministerstvo Školství, Mládeže a Tělovýchovy
LM2018126 Ministerstvo Školství, Mládeže a Tělovýchovy
e-INFRA CZ LM2018140 Ministerstvo Školství, Mládeže a Tělovýchovy
#KK.01.1.1.01.0010 European Structural and Investment Funds
#KK.01.1.1.01.0009 European Structural and Investment Funds

Links

PubMed 36482406
PubMed Central PMC9733328
DOI 10.1186/s12915-022-01470-5
PII 10.1186/s12915-022-01470-5
Knihovny.cz E-resources

BACKGROUND: Genes, the principal units of genetic information, vary in complexity and evolutionary history. Less complex genes (e.g., genes expressing long non-coding RNA (lncRNA)) readily emerge de novo from non-genic sequences and have high evolutionary turnover. The genesis of a gene may be facilitated by the adoption of functional genic sequences from retrotransposon insertions. However, protein-coding sequences in extant genomes rarely lack any connection to an ancestral protein-coding sequence.

RESULTS: We describe the remarkable evolution of the murine gene D6Ertd527e and its orthologs in the rodent superfamily Muroidea. D6Ertd527e emerged in a common ancestor of mice and hamsters, most likely as a lncRNA-expressing gene. A major contributing factor was a long terminal repeat (LTR) retrotransposon insertion carrying an oocyte-specific promoter and a 5' terminal exon of the gene. The gene survived as an oocyte-specific lncRNA in several extant rodents, while in others the gene or its expression was lost. In the ancestral lineage of Mus musculus, the gene acquired protein-coding capacity, with the bulk of the coding sequence formed through CAG (AGC) trinucleotide repeat expansion and duplications. These events generated a cytoplasmic serine-rich maternal protein. Knock-out of D6Ertd527e in mice has a small but detectable effect on fertility and the maternal transcriptome.

CONCLUSIONS: While this evolving gene does not show a clear function in laboratory mice, its documented evolutionary history in Muroidea over the last ~40 million years provides a textbook example of how several common mutation events can support de novo gene formation, the evolution of protein-coding capacity, and a gene's demise.

