Length distribution of long interspersed nucleotide elements (LINEs) and processed pseudogenes of human endogenous retroviruses: implications for retrotransposition and pseudogene detection
Language: English
Country: Netherlands
Media: print
Document type: Journal Article; Research Support, Non-U.S. Gov't
PubMed: 12468100
DOI: 10.1016/s0378-1119(02)01047-8
PII: S0378111902010478
MeSH
- Long Interspersed Nucleotide Elements / genetics
- Endogenous Retroviruses / genetics
- Genome, Human
- Mutagenesis, Insertional
- Humans
- Mutation
- Pseudogenes / genetics
- Retroelements / genetics
Check Tag
- Humans
Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
Names of Substances
- Retroelements
Deciphering the human genome includes the reliable identification and structural characterization of individual retrotransposon elements. The most active group of autonomous transposable elements, the long interspersed nuclear elements (LINEs), transpose themselves as well as other RNAs, including those of human endogenous retroviruses (HERVs). During this transposition, however, the LINE-encoded reverse transcriptase (RT) often dissociates abortively from the RNA template, leaving a prematurely terminated, 5' truncated copy. We have analyzed the length distributions of LINEs and of processed pseudogenes derived from HERV-W. As expected, very short elements terminated close to the 3' end predominate among both 5' truncated LINEs and HERV-W processed pseudogenes. The number of complete elements, on the other hand, is far above expectation. The characteristic distribution in both cases supports two conclusions: (i) dissociation of LINE RT from the template cannot be fully explained by low processivity of RT modelled as a stochastic, Poisson-type process; (ii) currently cited numbers of pseudogenes within the human genome are underestimates, since a large percentage of pseudogenes are terminated in the 3' untranslated region and remain undetectable in translated homology searches of protein databases against the human genome.
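
The expectation against which the excess of complete elements is judged can be illustrated with a toy calculation. The sketch below is not the authors' analysis. It takes one simple reading of a "stochastic, Poisson-type process", namely that RT dissociates with the same small probability before each nucleotide it copies, starting from the 3' end. The function name `truncation_model` and the parameter values (`FULL_LENGTH`, `P_STOP`) are hypothetical and chosen only for illustration.

```python
def truncation_model(full_length, p_stop):
    """Length distribution of copies under a constant-rate dissociation model.

    Assumption (not taken from the abstract): before copying each nucleotide,
    RT dissociates with the same probability p_stop, independent of position.
    Returns (truncated, complete), where truncated[k] is the probability of a
    copy of exactly k nucleotides (0 <= k < full_length) and complete is the
    probability of a full-length copy.
    """
    if not 0.0 < p_stop < 1.0:
        raise ValueError("p_stop must lie strictly between 0 and 1")
    q = 1.0 - p_stop  # probability of copying one more nucleotide
    truncated = [q ** k * p_stop for k in range(full_length)]
    complete = q ** full_length
    return truncated, complete


# Hypothetical parameters, for illustration only.
FULL_LENGTH = 6000   # roughly the size of a full-length human L1, in nt
P_STOP = 1.0 / 1000  # assumed per-nucleotide dissociation probability

truncated, complete = truncation_model(FULL_LENGTH, P_STOP)
short_fraction = sum(truncated[:1000])  # copies shorter than 1000 nt

print(f"expected fraction shorter than 1000 nt: {short_fraction:.3f}")
print(f"expected fraction of complete copies:   {complete:.4f}")
```

Under this assumed model the probability of a full-length copy decays as `(1 - p_stop) ** full_length`, and the truncated lengths fall off geometrically from the 3' end. An observed number of complete elements far above that value is the kind of departure from a constant-rate process that conclusion (i) describes.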