Prot2HG: a database of protein domains mapped to the human genome

D. Stanek, DM. Bis-Brewer, C. Saghira, MC. Danzi, P. Seeman, P. Lassuthova, S. Zuchner

Database : the journal of biological databases and curation. 2020 ; 2020 (-). [pub] 20200101

Language English Country Great Britain

Document type Journal Article, Research Support, Non-U.S. Gov't

Grant support
R01 NS105755 NINDS NIH HHS - United States

Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated and mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain in comparison to common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
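
The mapping step described in the abstract (protein residue positions translated back onto genomic coordinates, then used to ask whether a variant falls inside a domain) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `residues_to_genomic` and `variant_in_domain`, the exon coordinates, and the in-memory data layout are all hypothetical, and the sketch assumes 1-based inclusive coordinates and a coding sequence that starts at residue 1 with no frameshifts.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive genomic coordinates


def residues_to_genomic(cds_exons: List[Interval], strand: str,
                        aa_start: int, aa_end: int) -> List[Interval]:
    """Map an inclusive residue range onto genomic intervals (hypothetical helper).

    cds_exons holds the coding part of each exon in genomic coordinates.
    """
    # Residue n occupies coding bases 3n-2 .. 3n (1-based).
    cds_from, cds_to = 3 * aa_start - 2, 3 * aa_end
    # Visit exons in transcript order: ascending on "+", descending on "-".
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    pieces = []
    offset = 0  # coding bases consumed by the exons visited so far
    for g_start, g_end in ordered:
        length = g_end - g_start + 1
        lo = max(cds_from, offset + 1)
        hi = min(cds_to, offset + length)
        if lo <= hi:  # the residue range overlaps this exon
            if strand == "+":
                pieces.append((g_start + lo - offset - 1,
                               g_start + hi - offset - 1))
            else:
                pieces.append((g_end - (hi - offset - 1),
                               g_end - (lo - offset - 1)))
        offset += length
    return sorted(pieces)


def variant_in_domain(position: int, domain: List[Interval]) -> bool:
    """Return True if a genomic position lies inside any domain interval."""
    return any(start <= position <= end for start, end in domain)


# Made-up transcript: two coding exons of 9 bases each (6 residues in total).
exons = [(101, 109), (201, 209)]
domain = residues_to_genomic(exons, "+", 3, 4)  # residues 3-4 span the intron
print(domain)
print(variant_in_domain(202, domain), variant_in_domain(150, domain))
```

A domain that spans an exon boundary maps to more than one genomic interval, which is why the sketch returns a list. An intronic position such as 150 in the toy example is not reported as falling within the domain.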

000      
00000naa a2200000 a 4500
001      
bmc21012951
003      
CZ-PrNML
005      
20210507101958.0
007      
ta
008      
210420s2020 xxk f 000 0|eng||
009      
AR
024    7_
$a 10.1093/database/baz161 $2 doi
035    __
$a (PubMed)32293014
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxk
100    1_
$a Stanek, David $u DNA Laboratory, Department of Paediatric Neurology, 2nd Faculty of Medicine, Charles University in Prague and University Hospital Motol, Prague, V Úvalu 84, 150 06 Czech Republic
245    10
$a Prot2HG: a database of protein domains mapped to the human genome / $c D. Stanek, DM. Bis-Brewer, C. Saghira, MC. Danzi, P. Seeman, P. Lassuthova, S. Zuchner
520    9_
$a Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated and mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain in comparison to common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
650    _2
$a Computational Biology $x methods $7 D019295
650    _2
$a Data Curation $x methods $7 D066289
650    _2
$a Data Mining $x methods $7 D057225
650    12
$a Databases, Genetic $7 D030541
650    12
$a Genetic Variation $7 D014644
650    _2
$a Genome, Human $x genetics $7 D015894
650    _2
$a Genomics $x methods $7 D023281
650    _2
$a Humans $7 D006801
650    _2
$a Internet $7 D020407
650    _2
$a Molecular Sequence Annotation $x methods $7 D058977
650    _2
$a Protein Domains $x genetics $7 D000072417
650    _2
$a Proteins $x chemistry $x genetics $x metabolism $7 D011506
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Bis-Brewer, Dana M $u Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
700    1_
$a Saghira, Cima $u Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
700    1_
$a Danzi, Matt C $u Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
700    1_
$a Seeman, Pavel $u DNA Laboratory, Department of Paediatric Neurology, 2nd Faculty of Medicine, Charles University in Prague and University Hospital Motol, Prague, V Úvalu 84, 150 06 Czech Republic
700    1_
$a Lassuthova, Petra $u DNA Laboratory, Department of Paediatric Neurology, 2nd Faculty of Medicine, Charles University in Prague and University Hospital Motol, Prague, V Úvalu 84, 150 06 Czech Republic
700    1_
$a Zuchner, Stephan $u Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
773    0_
$w MED00170507 $t Database : the journal of biological databases and curation $x 1758-0463 $g Vol. 2020, No. - (2020)
856    41
$u https://pubmed.ncbi.nlm.nih.gov/32293014 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20210420 $b ABA008
991    __
$a 20210507101958 $b ABA008
999    __
$a ok $b bmc $g 1651183 $s 1133330
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2020 $b 2020 $c - $e 20200101 $i 1758-0463 $m Database $n Database $x MED00170507
GRA    __
$a R01 NS105755 $p NINDS NIH HHS $2 United States
LZP    __
$a Pubmed-20210420