
PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions

J. Bendl, M. Musil, J. Štourač, J. Zendulka, J. Damborský, J. Brezovský

PLoS Computational Biology. 2016; 12(5): e1004962. Published 2016-05-25.

Language: English. Country: United States

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/bmc17013826

An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
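
The abstract's two central ideas, per-category decision thresholds and a consensus over several tools, can be illustrated with a minimal Python sketch. It is not the PredictSNP2 implementation. The threshold criterion (balanced accuracy), the consensus rule (an unweighted majority vote), the tool names `toolA` and `toolB`, and all scores and labels are assumptions made up for illustration, since the abstract does not specify them.

```python
def best_threshold(scores, labels):
    """Pick the cut-off that maximizes balanced accuracy.

    Assumed criterion: the abstract says only "category-optimal".
    labels use 1 = deleterious, 0 = neutral; both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        acc = 0.5 * (tp / pos + tn / neg)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def fit_thresholds(training):
    """Map each (tool, category) pair to its own threshold."""
    return {key: best_threshold(s, y) for key, (s, y) in training.items()}


def consensus(tool_scores, category, thresholds):
    """Binarize each tool's score with its category-specific threshold,
    then take an unweighted majority vote (assumed rule).
    Returns (call, fraction of tools voting deleterious)."""
    votes = [int(score >= thresholds[(tool, category)])
             for tool, score in tool_scores.items()]
    frac = sum(votes) / len(votes)
    return ("deleterious" if frac > 0.5 else "neutral", frac)


# Toy training data (hypothetical): (scores, labels) per tool and category.
training = {
    ("toolA", "missense"): ([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1]),
    ("toolB", "missense"): ([1, 2, 3, 4], [0, 1, 1, 1]),
    ("toolA", "regulatory"): ([0.1, 0.2, 0.3, 0.9], [0, 1, 1, 1]),
    ("toolB", "regulatory"): ([1, 2, 3, 4], [0, 0, 0, 1]),
}
thresholds = fit_thresholds(training)

# The same raw scores can yield different calls in different categories.
scores = {"toolA": 0.7, "toolB": 3}
missense_call = consensus(scores, "missense", thresholds)
regulatory_call = consensus(scores, "regulatory", thresholds)
print(missense_call, regulatory_call)
```

In this toy setup the identical pair of raw scores is called deleterious as a missense variant but neutral as a regulatory variant, which is the kind of category dependence a single global threshold cannot capture.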

000      
00000naa a2200000 a 4500
001      
bmc17013826
003      
CZ-PrNML
005      
20170426093702.0
007      
ta
008      
170413s2016 xxu f 000 0|eng||
009      
AR
024    7_
$a 10.1371/journal.pcbi.1004962 $2 doi
035    __
$a (PubMed)27224906
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxu
100    1_
$a Bendl, Jaroslav $u Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic. Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic. International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic.
245    10
$a PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions / $c J. Bendl, M. Musil, J. Štourač, J. Zendulka, J. Damborský, J. Brezovský,
520    9_
$a An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
650    _2
$a computational biology $7 D019295
650    _2
$a databases, nucleic acid $7 D030561
650    _2
$a databases, protein $7 D030562
650    _2
$a genetic variation $7 D014644
650    _2
$a genome, human $7 D015894
650    _2
$a genomics $x statistics and numerical data $7 D023281
650    _2
$a humans $7 D006801
650    12
$a polymorphism, single nucleotide $7 D020641
650    12
$a software $7 D012984
655    _2
$a journal article $7 D016428
655    _2
$a research support, non-U.S. gov't $7 D013485
700    1_
$a Musil, Miloš $u Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic. Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic.
700    1_
$a Štourač, Jan $u Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic. International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic.
700    1_
$a Zendulka, Jaroslav $u Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic.
700    1_
$a Damborský, Jiří $u Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic. International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic.
700    1_
$a Brezovský, Jan $u Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic. International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic.
773    0_
$w MED00008919 $t PLoS computational biology $x 1553-7358 $g Vol. 12, No. 5 (2016), p. e1004962
856    41
$u https://pubmed.ncbi.nlm.nih.gov/27224906 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20170413 $b ABA008
991    __
$a 20170426094021 $b ABA008
999    __
$a ok $b bmc $g 1200291 $s 974604
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2016 $b 12 $c 5 $d e1004962 $e 20160525 $i 1553-7358 $m PLoS computational biology $n PLoS Comput Biol $x MED00008919
LZP    __
$a Pubmed-20170413
