Protein-coding sequences
Many nascent long non-coding RNAs (lncRNAs) undergo the same maturation steps as pre-mRNAs of protein-coding genes (PCGs), but they are often poorly spliced. To identify the underlying mechanisms for this phenomenon, we searched for putative splicing inhibitory sequences using ncRNA-a2 as a model. Genome-wide analyses of intergenic lncRNAs (lincRNAs) revealed that lincRNA splicing efficiency positively correlates with 5'ss strength, while no such correlation was identified for PCGs. In addition, efficiently spliced lincRNAs have higher thymidine content in the polypyrimidine tract (PPT) compared to efficiently spliced PCGs. Using model lincRNAs, we provide experimental evidence that strengthening the 5'ss and increasing the T content in the PPT significantly enhance lincRNA splicing. We further showed that lincRNA exons contain fewer putative binding sites for SR proteins. To map binding of SR proteins to lincRNAs, we performed iCLIP with SRSF2, SRSF5 and SRSF6 and analyzed eCLIP data for SRSF1, SRSF7 and SRSF9. All examined SR proteins bind lincRNA exons to a much lower extent than exons of expression-matched PCGs. We propose that lincRNAs lack the cooperative interaction network that enhances splicing, which renders their splicing outcome more dependent on the optimality of splice sites.
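The thymidine content of the polypyrimidine tract mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the PPT is approximated by a fixed window of nucleotides immediately upstream of the terminal AG of the intron; the function name `ppt_composition` and the default 20-nt window are hypothetical choices, not the definition used by the authors.

```python
def ppt_composition(intron: str, window: int = 20) -> dict:
    """Return T and pyrimidine fractions in a window upstream of the 3' splice site.

    Assumption (illustrative only): the intron sequence ends with the 3'ss
    dinucleotide (normally AG), and the PPT is approximated by the `window`
    nucleotides directly preceding it.
    """
    seq = intron.upper().replace("U", "T")  # accept RNA or DNA notation
    if len(seq) < window + 2:
        raise ValueError("intron is shorter than the requested window plus the 3'ss")
    tract = seq[-(window + 2):-2]  # drop the final two nucleotides (the 3'ss)
    t_count = tract.count("T")
    c_count = tract.count("C")
    return {
        "tract": tract,
        "t_fraction": t_count / window,
        "pyrimidine_fraction": (t_count + c_count) / window,
    }


if __name__ == "__main__":
    # Toy intron: 5'ss-like start, filler, a T-rich tract, then the terminal AG.
    toy_intron = "GTAAGT" + "A" * 10 + "TTTTCTTTTCTTTTCTTTTC" + "AG"
    print(ppt_composition(toy_intron))
```

Comparing `t_fraction` between groups of introns (for example, efficiently versus poorly spliced ones) would be one simple way to look at the kind of difference the abstract describes, although the window definition strongly affects the numbers.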
Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of these genes are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1, a signature of purifying selection in protein-coding regions. We also found that double substitutions in codons, where an intermediate step is a nonsense substitution, show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: such codons are likely to cause a substantial decrease of protein expression levels.
- MeSH
- Bacteria classification genetics MeSH
- Bacterial Proteins classification genetics MeSH
- Point Mutation MeSH
- Phylogeny MeSH
- Models, Genetic MeSH
- Evolution, Molecular MeSH
- Codon, Nonsense * MeSH
- Open Reading Frames genetics MeSH
- Prokaryotic Cells metabolism MeSH
- Pseudogenes genetics MeSH
- Base Sequence MeSH
- Sequence Homology, Nucleic Acid MeSH
- Selection, Genetic MeSH
- Codon, Terminator genetics MeSH
- Publication type
- Journal Article MeSH
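As an illustration of what an in-frame stop codon is in the record above, the following minimal sketch scans a coding sequence codon by codon in frame 0 and reports stop codons that occur before the terminal codon. The function name `in_frame_stops` is hypothetical, and the sketch assumes the three standard stop codons (TAA, TAG, TGA) and a sequence that starts at the first position of the reading frame; it is not the pipeline used in the study.

```python
from typing import List, Tuple

# Assumption: the standard stop codons, which also apply to most bacteria.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(cds: str) -> List[Tuple[int, str]]:
    """Return (codon_index, codon) for each stop codon before the last codon.

    The sequence is read in frame 0; trailing nucleotides that do not form
    a complete codon are ignored.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The terminal codon is the expected stop, so it is excluded from the scan.
    return [(i, codon) for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


if __name__ == "__main__":
    # ATG AAA TGA CCC TAA: the third codon (index 2) is an internal stop.
    print(in_frame_stops("ATGAAATGACCCTAA"))
```

Note that the scan is frame-aware: a TGA that spans two codons (as in ATG ATG ACC TAA) is not reported, because it would not be read as a stop during translation of that frame.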
Two variants of an mRNA sequence are identified that are expressed at high levels in rat ameloblasts during the formation of the enamel matrix. The sequences contain open reading frames for 407 and 324 amino acid residues, respectively. The encoded proteins, which we call amelins, are rich in proline, glycine, leucine, and alanine residues and contain the peptide domain DGEA, an integrin recognition sequence. The sequences coding for the C-terminal 305 amino acid residues, the 3' nontranslated part, and a microsatellite repeat at the nontranslated 5' region are identical in both mRNA variants. The remaining 5' regions contain 338 nucleotides unique to the long variant, 54 common nucleotides, and 46 nucleotides present only in the short variant. Eleven nucleotides have the potential to code for 5 amino acids of both proteins in different reading frames. The reading frame of the longer variant includes codons for a typical N-terminal signal peptide. The amelins are likely to be constituents of the enamel matrix and the only proteins that have so far been implicated in binding interactions between the ameloblast surface and its extracellular matrix.
- MeSH
- Ameloblasts * physiology MeSH
- Cytoskeletal Proteins * genetics metabolism MeSH
- Genetic Code MeSH
- Genomic Library * MeSH
- In Situ Hybridization MeSH
- Rats MeSH
- Molecular Sequence Data MeSH
- Oligonucleotide Probes MeSH
- Rats, Sprague-Dawley MeSH
- Nerve Tissue Proteins * genetics metabolism MeSH
- Amino Acid Sequence MeSH
- Base Sequence MeSH
- Protein Structure, Tertiary * MeSH
- Protein Binding MeSH
- Gene Expression Regulation, Developmental * physiology MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Animals MeSH
- Publication type
- Research Support, Non-U.S. Gov't MeSH
The complete mitochondrial genome of the recently discovered beetle family Iberobaeniidae is described and compared with known coleopteran mitogenomes. The mitochondrial sequence was obtained by shotgun metagenomic sequencing using Illumina MiSeq technology, resulting in an average coverage of 130× and a minimum coverage of 35×. The mitochondrial genome of Iberobaeniidae includes 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, and 1 putative control region, and shows a unique rearrangement of protein-coding genes. This is the first rearrangement affecting the relative position of protein-coding and ribosomal genes reported for the order Coleoptera.
- MeSH
- Coleoptera genetics MeSH
- Phylogeny * MeSH
- Genome, Insect MeSH
- Genome, Mitochondrial * MeSH
- Genomics MeSH
- DNA, Mitochondrial MeSH
- Genes, Mitochondrial * MeSH
- Gene Order MeSH
- Sequence Analysis, DNA * MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Wild emmer wheat (Triticum turgidum ssp. dicoccoides) is the progenitor of wheat. We performed chromosome-based survey sequencing of the 14 chromosomes, examining repetitive sequences, protein-coding genes, miRNA/target pairs and tRNA genes, as well as syntenic relationships with related grasses. We found considerable differences in the content and distribution of repetitive sequences between the A and B subgenomes. The gene contents of individual chromosomes varied widely, not necessarily correlating with chromosome size. We catalogued candidate agronomically important loci, along with new alleles and flanking sequences that can be used to design exome sequencing. Syntenic relationships and virtual gene orders revealed several small-scale evolutionary rearrangements, in addition to providing evidence for the 4AL-5AL-7BS translocation in wild emmer wheat. Chromosome-based sequence assemblies contained five novel miRNA families among the 59 families putatively encoded in the entire genome. These results provide insight into the domestication of wheat and an overview of the genome content and organization.
- MeSH
- Chromosomes, Plant genetics MeSH
- Genetic Loci genetics MeSH
- Genome, Plant genetics MeSH
- Conserved Sequence genetics MeSH
- Poaceae genetics MeSH
- MicroRNAs genetics MeSH
- RNA, Untranslated genetics MeSH
- Polyploidy MeSH
- Flow Cytometry MeSH
- Triticum genetics MeSH
- Repetitive Sequences, Nucleic Acid genetics MeSH
- Genes, Plant genetics MeSH
- Tetraploidy MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The Lx mutation in the SHR.Lx rat manifests in homozygotes as hindlimb preaxial polydactyly. It was previously mapped to a chromosome 8 segment containing the Plzf gene. Plzf (promyelocytic leukemia zinc finger protein) influences limb development as a direct repressor of posterior HoxD genes. However, the Plzf coding sequence is intact in the Lx mutants. Using linkage mapping in F2 hybrids, we narrowed the segment containing Lx to 155 kb and sequenced the conserved noncoding elements (CNEs) within it. A 2,964-bp deletion in Plzf intron 2, never detected in control animals, is the only candidate for Lx. The deletion removes the most deeply conserved CNE in the 155-kb segment, suggesting a regulatory influence on Plzf expression. Correspondingly, using in situ hybridization and quantitative real-time polymerase chain reaction, we found a decrease of Plzf expression in Lx/Lx limb buds with concomitant anterior expansion of the expression domains of its targets, the Hoxd10-13 genes, in the absence of ectopic Sonic hedgehog expression. Upstream regulation of Plzf in limb buds is currently unknown. We present here the first candidate Plzf cis-regulatory sequence. (c) 2009 Wiley-Liss, Inc.
- MeSH
- Gene Deletion MeSH
- DNA-Binding Proteins genetics metabolism MeSH
- Down-Regulation genetics MeSH
- Embryo, Mammalian embryology metabolism MeSH
- Financing, Organized MeSH
- Introns genetics MeSH
- Limb Buds abnormalities metabolism MeSH
- Conserved Sequence MeSH
- Rats MeSH
- RNA, Messenger genetics MeSH
- RNA, Untranslated genetics MeSH
- Polydactyly genetics metabolism MeSH
- Body Patterning MeSH
- Base Sequence MeSH
- Gene Expression Regulation, Developmental MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Animals MeSH
The complete nucleotide sequence (1448 nucleotides) of RNA 2 of a Czechoslovakian isolate TpM-34 of red clover necrotic mosaic virus (RCNMV-TpM-34) has been determined. The sequence contained one major open reading frame (ORF) with the potential to encode a protein of 326 amino acids (Mr 35755), designated P2. The nucleotide sequence of RNA 2 of RCNMV-TpM-34 and the previously published sequence of RNA 2 of an Australian isolate of the virus (RCNMV-Aus) were 83% identical, and there was 80% amino acid sequence identity between the P2 proteins of these isolates. However, the N-terminal two-thirds of the P2 proteins shared a higher degree of similarity than the C-terminal regions, which were predicted to have a more flexible structure. An ORF in the 3' portion of RNA 2 of RCNMV-Aus, which could encode a protein of Mr 5000, was not present in RNA 2 of RCNMV-TpM-34. RNAs 1 and 2 of RCNMV-TpM-34 and RCNMV-Aus are bilaterally compatible.
- MeSH
- Protein Conformation MeSH
- Molecular Sequence Data MeSH
- Open Reading Frames MeSH
- RNA, Viral genetics MeSH
- Amino Acid Sequence MeSH
- Base Sequence MeSH
- Sequence Homology, Nucleic Acid MeSH
- Viral Proteins genetics MeSH
- Mosaic Viruses genetics isolation & purification MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Comparative Study MeSH
- Geographic names
- Czechoslovakia MeSH
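Percent identity figures such as the 83% (nucleotide) and 80% (amino acid) values in the record above depend on the alignment and on how gaps are counted. The sketch below shows one common convention under stated assumptions: the two sequences are already aligned to equal length with `-` as the gap character, columns where both sequences have a gap are skipped, and a gap opposite a residue counts as a mismatch. The function name `percent_identity` is hypothetical, and this is not necessarily the calculation used by the authors.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions (illustrative): '-' marks a gap, double-gap columns are
    ignored, and a gap aligned to a residue is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Two mismatches in ten aligned columns.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAA"))
```

The same function works for amino acid alignments, since it only compares characters column by column; other conventions (for example, dividing by the length of the shorter sequence) would give different percentages.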
Mastomys natalensis is a rodent of African origin afflicted with a very high incidence of skin tumors (keratoacanthomas and squamous carcinomas), which are associated with a papillomavirus, M. natalensis papillomavirus (MnPV). We have determined the genomic sequence of MnPV, which has a size of 7687 bp. The genomic organization is similar to that of other papillomaviruses, with open reading frames E6, E7, E1, E2, and E4 in the early and L2 and L1 in the late region. Due to an unusually large hinge region, the transcriptional activator E2 has a size of 542 amino acids rather than 400 to 460 amino acids, as in other papillomaviruses. An open reading frame E5 coding for a small hydrophobic membrane protein is missing, as is the case for some cutaneous human papillomaviruses (HPV). This fact, together with the composition of cis-responsive elements in its long control region and phylogenetic evaluation of segments of its E6, E1, and L1 genes, indicates a relationship of MnPV to the cottontail rabbit papillomavirus and several HPV types found in lesions of cutaneous epithelia, in particular to those that are associated with epidermodysplasia verruciformis. MnPV may be a useful model system for tumorigenesis of cutaneous epithelia in humans.
- MeSH
- Genome, Viral * MeSH
- Keratoacanthoma virology MeSH
- Cloning, Molecular MeSH
- Molecular Sequence Data MeSH
- Muridae * virology MeSH
- Skin Neoplasms virology MeSH
- Open Reading Frames MeSH
- Papillomaviridae * genetics classification MeSH
- Regulatory Sequences, Nucleic Acid genetics MeSH
- Amino Acid Sequence MeSH
- Base Sequence MeSH
- Sequence Analysis, DNA MeSH
- Sequence Homology, Nucleic Acid MeSH
- Sequence Alignment MeSH
- Carcinoma, Squamous Cell virology MeSH
- Genes, Viral * MeSH
- Viral Proteins genetics MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Comparative Study MeSH