BACKGROUND: In trypanosomatids, a group of unicellular eukaryotes that includes numerous important human parasites, cis-splicing has previously been reported for only two genes: a poly(A) polymerase and an RNA helicase. Conversely, trans-splicing, which involves the attachment of a spliced leader sequence, is observed for nearly every protein-coding transcript. So far, our understanding of splicing in this protistan group has stemmed from the analysis of only a few medically relevant species. In this study, we used an extensive dataset encompassing all described trypanosomatid genera to investigate the distribution of intron-containing genes and the evolution of splice sites. RESULTS: We identified a new conserved intron-containing gene encoding an RNA-binding protein that is universally present in Kinetoplastea. We show that Perkinsela sp., a kinetoplastid endosymbiont of Amoebozoa, represents the first eukaryote completely devoid of cis-splicing, yet still preserving trans-splicing. We also provide evidence for reverse transcriptase-mediated intron loss in Kinetoplastea, extensive conservation of 5' splice sites, and the presence of non-coding RNAs within a subset of retained trypanosomatid introns. CONCLUSIONS: All three intron-containing genes identified in Kinetoplastea encode RNA-interacting proteins with the potential to fine-tune the expression of multiple genes, thus challenging the perception of cis-splicing in these protists as a mere evolutionary relic. We suggest that there is selective pressure to retain cis-splicing in trypanosomatids and that this is likely associated with overall control of mRNA processing. Our study provides new insights into the evolution of introns and, consequently, the regulation of gene expression in eukaryotes.
- Keywords
- Introns, Kinetoplastea, Poly(A) polymerase, RNA helicase, RNA-binding protein, Splicing, Trypanosomatidae
- MeSH
- Phylogeny MeSH
- Introns * genetics MeSH
- Kinetoplastida genetics MeSH
- Evolution, Molecular MeSH
- Genes, Protozoan genetics MeSH
- Protozoan Proteins genetics MeSH
- Trans-Splicing * genetics MeSH
- Trypanosomatina genetics MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Protozoan Proteins MeSH
Although introns impose an energy and time burden on eukaryotic cells, they play an irreplaceable role in the diversification and regulation of protein production. A commonly reported feature of eukaryotic genomes is that, in protein-coding genes, the longest intron is usually one of the first introns. The goal of our work was to determine whether genes that share this feature differ in biological function from genes that do not. Data on the lengths of all introns in genes were extracted from the genomes of six vertebrates (human, mouse, koala, chicken, zebrafish and fugu) and two other model organisms (nematode worm and arabidopsis). We showed that in more than 40% of protein-coding genes the longest intron lies in the second or third tertile of all introns. Genes grouped by the relative position of the longest intron were significantly enriched in different KEGG pathways. Genes with the longest intron in the first tertile predominate in a range of pathways for amino acid and lipid metabolism, various signaling pathways, cell junctions and ABC transporters. Genes with the longest intron in the second or third tertile show increased representation in pathways associated with the formation and function of the spliceosome and ribosomes. Between the two groups of genes defined in this way, we further demonstrated differences in the length of the longest introns and in the distribution of their absolute positions. We also pointed out other characteristics, namely the positive correlation between the length of the longest intron and the sum of the lengths of all other introns in the gene, and the preservation of the exact same absolute and relative position of the longest intron between orthologous genes.
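The grouping described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the relative position is computed, how ties between equally long introns are resolved, or where the tertile boundaries fall, so the function name `longest_intron_tertile`, the tie rule, and the boundary convention below are all assumptions made for illustration only.

```python
def longest_intron_tertile(intron_lengths):
    """Return (absolute_position, tertile) of the longest intron in a gene.

    intron_lengths: intron lengths listed in transcript order (5' to 3').
    absolute_position is 1-based; tertile is 1, 2 or 3.
    """
    if not intron_lengths:
        raise ValueError("gene has no introns")
    n = len(intron_lengths)
    # Index of the first maximum: ties go to the most 5' intron (assumption).
    idx = max(range(n), key=lambda i: intron_lengths[i])
    position = idx + 1
    # Map the 0-based index onto three equal bins of the intron list
    # (assumed convention; the abstract does not define the boundaries).
    tertile = (3 * idx) // n + 1
    return position, tertile


# Hypothetical genes with six introns each.
print(longest_intron_tertile([5000, 300, 200, 100, 90, 80]))
print(longest_intron_tertile([100, 200, 9000, 150, 120, 80]))
print(longest_intron_tertile([100, 200, 300, 150, 120, 8000]))
```

Under this convention a single-intron gene always falls in the first tertile, which is one reason a real analysis might set a minimum intron count before classifying genes.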
- Keywords
- Eukaryotes, Gene function, Gene structure, Genome, Longest intron, Ribosome biogenesis, Spliceosome
- MeSH
- Arabidopsis genetics MeSH
- Introns * genetics MeSH
- Humans MeSH
- Spliceosomes genetics metabolism MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
Germline CHEK2 pathogenic variants confer an increased risk of female breast cancer (FBC). Here we describe a recurrent germline intronic variant c.1009-118_1009-87delinsC, which showed a splice acceptor shift in RNA analysis, introducing a premature stop codon (p.Tyr337PhefsTer37). The variant was found in 21/10,204 (0.21%) Czech FBC patients compared to 1/3250 (0.03%) controls (p = 0.04) and in 4/3639 (0.11%) FBC patients from an independent German dataset. In addition, we found this variant in 5/2966 (0.17%) Czech (but none of the 443 German) ovarian cancer patients, three of whom developed early-onset tumors. Based on these observations, we classified this variant as likely pathogenic.
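The carrier frequencies quoted in this abstract follow directly from the reported counts, and the case-control comparison can be re-examined with a 2x2 test. The abstract does not name the statistical test behind p = 0.04, so the two-sided Fisher's exact test below is an assumption, and the helper names `carrier_percent` and `fisher_exact_two_sided` are hypothetical. This is a sketch for checking the arithmetic, not a reproduction of the authors' analysis.

```python
from math import comb


def carrier_percent(carriers, total):
    """Carrier frequency as a percentage."""
    return 100.0 * carriers / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    """
    cases = a + b
    carriers = a + c
    n = a + b + c + d
    denom = comb(n, cases)

    def prob(k):
        # Hypergeometric probability of k carriers among the cases.
        return comb(carriers, k) * comb(n - carriers, cases - k) / denom

    lo = max(0, cases - (n - carriers))
    hi = min(cases, carriers)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts taken from the abstract: 21/10,204 Czech cases vs 1/3250 controls.
print(round(carrier_percent(21, 10204), 2), round(carrier_percent(1, 3250), 2))
print(fisher_exact_two_sided(21, 10204 - 21, 1, 3250 - 1))
```

Exact integer binomials are used so that the very large population sizes do not cause floating-point overflow; only the final ratio is converted to a float.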
- Keywords
- Breast cancer, Deep intronic CHEK2 variant, Genetic testing, NGS, RNA analysis
- MeSH
- Checkpoint Kinase 2 * genetics MeSH
- Adult MeSH
- Genetic Predisposition to Disease * genetics MeSH
- Introns * genetics MeSH
- Middle Aged MeSH
- Humans MeSH
- Breast Neoplasms * genetics MeSH
- Ovarian Neoplasms genetics MeSH
- RNA Precursors genetics MeSH
- RNA Splicing * genetics MeSH
- Germ-Line Mutation * MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Czech Republic MeSH
- Germany MeSH
- Substance names
- CHEK2 protein, human MeSH Browser
Giardia lamblia causes giardiasis, one of the most common human infectious diseases globally. Previous studies from our lab have shown that the hsp90 gene of Giardia is split into two halves, namely hspN and hspC. The independent pre-mRNAs of these split genes join by trans-splicing, producing a full-length Hsp90 (FlHsp90) mRNA. Genetic manipulation of the participating genes is necessary to understand the mechanism and significance of such trans-splicing-based expression of Hsp90. In this study, we performed transfection-based exogenous expression of hspN and/or hspC in G. lamblia. We electroporated a plasmid containing the Avi-tagged hspN component of Hsp90 and examined its fate in G. lamblia. We show that the exogenously expressed hspN RNA is trans-spliced to endogenously expressed hspC RNA, giving rise to a hybrid FlHsp90. We highlight the importance of cis-elements in this trans-splicing reaction through mutational analysis. An episomal plasmid carrying deletions in the intronic region of hspN showed inhibition of the trans-splicing reaction. Additionally, exogenous hspC RNA followed the same fate as exogenous hspN, and upon co-transfection with episomal hspN, the two underwent trans-splicing with each other. Using eGFP as a test protein, we have shown that intronic sequences of the hsp90 gene can guide trans-splicing-mediated repair of any associated exonic sequences. Our study provides in vivo validation of Hsp90 trans-splicing, shows the crucial role of cis-elements and, importantly, highlights the potential of hsp90 intronic sequences to function as a minimal splicing tool.
- Keywords
- Gene expression, Giardia lamblia, Hsp90, RNA splicing, Transfection
- MeSH
- Giardia lamblia * genetics MeSH
- Introns genetics MeSH
- RNA Precursors genetics MeSH
- HSP90 Heat-Shock Proteins * genetics MeSH
- Protozoan Proteins * genetics MeSH
- Trans-Splicing * genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- RNA Precursors MeSH
- HSP90 Heat-Shock Proteins * MeSH
- Protozoan Proteins * MeSH
Spliceosome assembly contributes an important but incompletely understood aspect of splicing regulation. Prp45 is a yeast splicing factor which runs as an extended fold through the spliceosome, and which may be important for bringing its components together. We performed a whole genome analysis of the genetic interaction network of the truncated allele of PRP45 (prp45(1-169)) using synthetic genetic array technology and found chromatin remodellers and modifiers as an enriched category. In agreement with related studies, H2A.Z-encoding HTZ1, and the components of SWR1, INO80, and SAGA complexes represented prominent interactors, with htz1 conferring the strongest growth defect. Because the truncation of Prp45 disproportionately affected low copy number transcripts of intron-containing genes, we prepared strains carrying intronless versions of SRB2, VPS75, or HRB1, the most affected cases with transcription-related function. Intron removal from SRB2, but not from the other genes, partly repaired some but not all the growth phenotypes identified in the genetic screen. The interaction of prp45(1-169) and htz1Δ was detectable even in cells with SRB2 intron deleted (srb2Δi). The less truncated variant, prp45(1-330), had a synthetic growth defect with htz1Δ at 16°C, which also persisted in the srb2Δi background. Moreover, htz1Δ enhanced prp45(1-330) dependent pre-mRNA hyper-accumulation of both high and low efficiency splicers, genes ECM33 and COF1, respectively. We conclude that while the expression defects of low expression intron-containing genes contribute to the genetic interactome of prp45(1-169), the genetic interactions between prp45 and htz1 alleles demonstrate the sensitivity of spliceosome assembly, delayed in prp45(1-169), to the chromatin environment.
- Keywords
- H2A.Z, Synthetic genetic array analysis, chromatin modifiers, co-transcriptional splicing, spliceosome assembly
- MeSH
- Phenotype * MeSH
- Histones metabolism genetics MeSH
- Introns * MeSH
- Gene Expression Regulation, Fungal MeSH
- Saccharomyces cerevisiae Proteins * genetics metabolism MeSH
- Saccharomyces cerevisiae * genetics metabolism MeSH
- RNA Splicing * MeSH
- RNA Splicing Factors genetics metabolism MeSH
- Spliceosomes * metabolism genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Substance names
- Histones MeSH
- Saccharomyces cerevisiae Proteins * MeSH
- RNA Splicing Factors MeSH
Protein tyrosine phosphatase, nonreceptor type 22 (PTPN22), is an archetypal non-HLA autoimmunity gene. It is one of the most prominent genetic contributors to type 1 diabetes mellitus outside the HLA region, and prevalence of its risk variants is subject to enormous geographic variability. Here, we address the genetic background of patients with type 1 diabetes mellitus of Armenian descent. Armenia has a population that has been genetically isolated for 3000 years. We hypothesized that two PTPN22 polymorphisms, rs2476601 and rs1310182, are associated with type 1 diabetes mellitus in persons of Armenian descent. In this association study, we genotyped the allelic frequencies of two risk-associated PTPN22 variants in 96 patients with type 1 diabetes mellitus and 100 controls of Armenian descent. We subsequently examined the associations of PTPN22 variants with the manifestation of type 1 diabetes mellitus and its clinical characteristics. We found that the rs2476601 minor allele (c.1858T) frequency in the control population was very low (q = 0.015), and the trend toward increased frequency of c.1858CT heterozygotes among patients with type 1 diabetes mellitus was not significant (OR 3.34, 95% CI 0.88-12.75; χ2 test p > 0.05). The control population had a high frequency of the minor allele of rs1310182 (q = 0.375). The frequency of c.2054-852TC heterozygotes was significantly higher among the patients with type 1 diabetes mellitus (OR 2.39, 95% CI 1.35-4.24; χ2 test p < 0.001), as was the frequency of the T allele (OR 4.82, 95% CI 2.38-9.76; χ2 test p < 0.001). The rs2476601 c.1858CT genotype and the T allele correlated negatively with the insulin dose needed three to six months after diagnosis. The rs1310182 c.2054-852CC genotype was positively associated with higher HbA1c at diagnosis and 12 months after diagnosis. We have provided the first information on diabetes-associated polymorphisms in PTPN22 in a genetically isolated Armenian population. 
We found only a limited contribution of the prototypic gain-of-function PTPN22 polymorphism rs2476601. In contrast, we found an unexpectedly close association of type 1 diabetes mellitus with rs1310182.
- MeSH
- Diabetes Mellitus, Type 1 * genetics MeSH
- Phosphoric Monoester Hydrolases MeSH
- Introns MeSH
- Humans MeSH
- Polymorphism, Genetic MeSH
- Protein Tyrosine Phosphatase, Non-Receptor Type 22 genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Armenia epidemiology MeSH
- Substance names
- Phosphoric Monoester Hydrolases MeSH
- PTPN22 protein, human MeSH Browser
- Protein Tyrosine Phosphatase, Non-Receptor Type 22 MeSH
Among alternative splicing events in the human transcriptome, tandem NAGNAG acceptor splice sites represent an appreciable proportion. Both the proximal and the distal NAG can be used, producing two splicing isoforms that differ by three nucleotides. In some cases, the upstream exon can be alternatively spliced as well, which further increases the number of possible transcripts. In this study, we showed that NAG choice in a tandem splice site depends considerably not only on the acceptor concerned, but also on the sequence of the upstream donor splice site. Using an extensive set of experiments with systematically modified two-exon minigene systems of the AFAP1L2 or CSTD gene, we identified the third and fifth intronic positions of the upstream donor splice site and the tandem acceptor splice site region spanning from -10 to +2, including the NAGNAG itself, as the main drivers. In addition, competition between different branch points and their composition were also shown to play a significant role in NAG choice. All these nucleotide effects appeared almost additive, which explained the high variability in proximal versus distal NAG usage.
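As an illustration of the motif discussed in this abstract, the sketch below scans a sequence for tandem NAGNAG sites and reports where the downstream exon would begin for each NAG choice. The function names `find_nagnag` and `acceptor_exon_starts` are hypothetical, and the scan is purely sequence-based: it does not model the donor-site, branch-point, or flanking-position effects that the study identifies as drivers of NAG choice.

```python
import re

# A lookahead is used so that overlapping motifs (e.g. CAGCAGCAG) are all reported.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")


def find_nagnag(seq):
    """Return (0-based start, motif) for every NAGNAG occurrence in seq."""
    seq = seq.upper().replace("U", "T")  # accept RNA or lowercase input
    return [(m.start(), m.group(1)) for m in NAGNAG.finditer(seq)]


def acceptor_exon_starts(motif_start):
    """0-based exon start when the proximal or the distal NAG is used.

    Using the proximal (upstream) NAG keeps the second NAG in the exon,
    so the two isoforms differ by exactly three nucleotides.
    """
    proximal = motif_start + 3
    distal = motif_start + 6
    return proximal, distal


# Hypothetical intron-exon junction fragment.
for start, motif in find_nagnag("ttttcagcagGTACC"):
    print(start, motif, acceptor_exon_starts(start))
```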
- Keywords
- AFAP1L2, Alternative splicing, NAG choice, RNA splicing, Splicing isoform
- MeSH
- Alternative Splicing genetics MeSH
- Exons genetics MeSH
- HeLa Cells MeSH
- Introns genetics MeSH
- Humans MeSH
- RNA Splice Sites genetics MeSH
- Cell Line, Tumor MeSH
- Nucleotides genetics MeSH
- Tandem Repeat Sequences genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Substance names
- RNA Splice Sites MeSH
- Nucleotides MeSH
Several bioinformatic tools have been developed for genome-wide identification of orthologous and paralogous genes. However, no corresponding tool allows the detection of exon homology relationships. Here, we present ExOrthist, a fully reproducible Nextflow-based software enabling inference of exon homologs and orthogroups, visualization of evolution of exon-intron structures, and assessment of conservation of alternative splicing patterns. ExOrthist evaluates exon sequence conservation and considers the surrounding exon-intron context to derive genome-wide multi-species exon homologies at any evolutionary distance. We demonstrate its use in different evolutionary scenarios: whole genome duplication in frogs and convergence of Nova-regulated splicing networks (https://github.com/biocorecrg/ExOrthist).
- Keywords
- Alternative splicing, Intron-exon structures, Orthology, Paralogy
- MeSH
- Alternative Splicing MeSH
- Exons * MeSH
- Genome MeSH
- Introns MeSH
- Conserved Sequence MeSH
- Humans MeSH
- Evolution, Molecular * MeSH
- Mice MeSH
- Software * MeSH
- Computational Biology * MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Spliceosomal introns interrupt nuclear genes and are removed from RNA transcripts ("spliced") by machinery called spliceosomes. Although the vast majority of spliceosomal introns are removed by the so-called major (or "U2") spliceosome, diverse eukaryotes also contain a rare second form, the minor ("U12") spliceosome, and associated ("U12-type") introns [1-3]. In all characterized species, U12-type introns are distinguished by several features, including being rare in the genome (∼0.5% of all introns) [4-6], containing extended evolutionarily conserved splicing motifs [4,5,7,8], being generally ancient [9,10], and being inefficiently spliced [11-13]. Here, we report a remarkable exception in the slime mold Physarum polycephalum. The P. polycephalum genome contains >20,000 U12-type introns (25 times more than any other species), enriched in a diversity of non-canonical splice boundaries as well as transformed splicing signals that appear to have co-evolved with the spliceosome due to massive gain of efficiently spliced U12-type introns. These results reveal an unappreciated dynamism of minor spliceosomal introns and spliceosomal introns in general.
- Keywords
- U12, U12-type introns, bioinformatics, comparative genomics, evolution, genomics, intron evolution, intron gain, minor introns, minor spliceosome
- MeSH
- Introns * MeSH
- Physarum polycephalum * genetics MeSH
- RNA, Small Nuclear genetics metabolism MeSH
- RNA Splicing MeSH
- Spliceosomes * genetics metabolism MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Substance names
- RNA, Small Nuclear MeSH
Euglenids represent a group of protists with diverse modes of feeding. To date, only a partial genomic sequence of Euglena gracilis and transcriptomes of several phototrophic and secondarily osmotrophic species are available, while primarily heterotrophic euglenids are seriously undersampled. In this work, we begin to fill this gap by presenting genomic and transcriptomic drafts of a primary osmotroph, Rhabdomonas costata. The current genomic assembly length of 100 Mbp is 14× smaller than that of E. gracilis. Although the assembly is too fragmented for comprehensive gene prediction, it provided fragments of the mitochondrial genome, and comparison of the transcriptomic and genomic data revealed features of its introns, including several candidates for nonconventional types. A set of 39,456 putative R. costata proteins was predicted from the transcriptome. Annotation of the mitochondrial core metabolism provides the first data on the facultatively anaerobic mitochondrion of R. costata, which in most respects resembles the mitochondrion of E. gracilis with a certain level of streamlining. R. costata can synthesise thiamine using enzymes of heterogeneous provenance and haem by a mitochondrial-cytoplasmic C4 pathway with enzymes orthologous to those found in E. gracilis. The low percentage of green algae-affiliated genes supports the ancestrally osmotrophic status of this species.
- MeSH
- Biological Evolution MeSH
- Chromatium genetics metabolism MeSH
- Euglenida genetics metabolism MeSH
- Exons genetics MeSH
- Phylogeny MeSH
- Genome MeSH
- Heterotrophic Processes MeSH
- Introns genetics MeSH
- Mitochondria genetics MeSH
- Sequence Analysis, DNA methods MeSH
- Transcriptome genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH