Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus

N. Vukašinović, F. Cvrčková, M. Eliáš, R. Cole, JE. Fowler, V. Žárský, L. Synek

PLoS One. 2014; 9(4): e94077.

Language English Country United States

Document type Journal Article, Research Support, Non-U.S. Gov't

Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.

000      
00000naa a2200000 a 4500
001      
bmc15014399
003      
CZ-PrNML
005      
20150421092152.0
007      
ta
008      
150420s2014 xxu f 000 0|eng||
009      
AR
024    7_
$a 10.1371/journal.pone.0094077 $2 doi
035    __
$a (PubMed)24728280
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxu
100    1_
$a Vukašinović, Nemanja $u Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic; Department of Experimental Plant Biology, Faculty of Science, Charles University in Prague, Prague, Czech Republic.
245    10
$a Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus / $c N. Vukašinović, F. Cvrčková, M. Eliáš, R. Cole, JE. Fowler, V. Žárský, L. Synek
520    9_
$a Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
650    _2
$a Arabidopsis $x genetics $7 D017360
650    _2
$a Arabidopsis Proteins $x genetics $7 D029681
650    _2
$a DNA, Bacterial $x genetics $7 D004269
650    _2
$a Gene Duplication $x genetics $7 D020440
650    _2
$a Mutagenesis, Insertional $x genetics $7 D016254
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Cvrčková, Fatima $u Department of Experimental Plant Biology, Faculty of Science, Charles University in Prague, Prague, Czech Republic.
700    1_
$a Eliáš, Marek $u Department of Botany, Faculty of Science, Charles University in Prague, Prague, Czech Republic; Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic.
700    1_
$a Cole, Rex $u Department of Botany and Plant Pathology and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon, United States of America.
700    1_
$a Fowler, John E $u Department of Botany and Plant Pathology and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon, United States of America.
700    1_
$a Žárský, Viktor $u Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic; Department of Experimental Plant Biology, Faculty of Science, Charles University in Prague, Prague, Czech Republic.
700    1_
$a Synek, Lukáš $u Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic.
773    0_
$w MED00180950 $t PloS one $x 1932-6203 $g Vol. 9, No. 4 (2014), p. e94077
856    41
$u https://pubmed.ncbi.nlm.nih.gov/24728280 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20150420 $b ABA008
991    __
$a 20150421092450 $b ABA008
999    __
$a ok $b bmc $g 1071980 $s 897277
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2014 $b 9 $c 4 $d e94077 $i 1932-6203 $m PLoS One $n PLoS One $x MED00180950
LZP    __
$a Pubmed-20150420
