
CATH: increased structural coverage of functional space

I. Sillitoe, N. Bordin, N. Dawson, VP. Waman, P. Ashford, HM. Scholes, CSM. Pang, L. Woodridge, C. Rauer, N. Sen, M. Abbasian, S. Le Cornu, SD. Lam, K. Berka, IH. Varekova, R. Svobodova, J. Lees, CA. Orengo

Nucleic Acids Research. 2021; 49 (D1): D266-D273. [pub] 20210108

Language: English. Country: United Kingdom

Document type: Journal Article; Research Support, Non-U.S. Gov't

Persistent link   https://www.medvik.cz/link/bmc21011618

Grant support
104960/Z/14/Z Wellcome Trust - United Kingdom
203780/Z/16/A Wellcome Trust - United Kingdom

CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5,481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.


000      
00000naa a2200000 a 4500
001      
bmc21011618
003      
CZ-PrNML
005      
20210507104008.0
007      
ta
008      
210420s2021 xxk f 000 0|eng||
009      
AR
024    7_
$a 10.1093/nar/gkaa1079 $2 doi
035    __
$a (PubMed)33237325
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxk
100    1_
$a Sillitoe, Ian $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
245    10
$a CATH: increased structural coverage of functional space / $c I. Sillitoe, N. Bordin, N. Dawson, VP. Waman, P. Ashford, HM. Scholes, CSM. Pang, L. Woodridge, C. Rauer, N. Sen, M. Abbasian, S. Le Cornu, SD. Lam, K. Berka, IH. Varekova, R. Svobodova, J. Lees, CA. Orengo
520    9_
$a CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5,481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.
650    _2
$a Amino Acid Sequence $7 D000595
650    _2
$a COVID-19 $x epidemiology $x prevention & control $x virology $7 D000086382
650    _2
$a Computational Biology $x methods $x statistics & numerical data $7 D019295
650    _2
$a Databases, Protein $x statistics & numerical data $7 D030562
650    _2
$a Epidemics $7 D058872
650    _2
$a Humans $7 D006801
650    _2
$a Internet $7 D020407
650    _2
$a Molecular Sequence Annotation $7 D058977
650    12
$a Protein Domains $7 D000072417
650    _2
$a Proteins $x chemistry $x genetics $x metabolism $7 D011506
650    _2
$a SARS-CoV-2 $x genetics $x metabolism $x physiology $7 D000086402
650    _2
$a Sequence Analysis, Protein $x methods $7 D020539
650    _2
$a Sequence Homology, Amino Acid $7 D017386
650    _2
$a Viral Proteins $x chemistry $x genetics $x metabolism $7 D014764
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Bordin, Nicola $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Dawson, Natalie $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Waman, Vaishali P $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Ashford, Paul $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Scholes, Harry M $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Pang, Camilla S M $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Woodridge, Laurel $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Rauer, Clemens $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Sen, Neeladri $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Abbasian, Mahnaz $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Le Cornu, Sean $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
700    1_
$a Lam, Su Datt $u Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor 43600, Malaysia
700    1_
$a Berka, Karel $u Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, Olomouc 771 46, Czech Republic
700    1_
$a Varekova, Ivana Hutařová $u National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno 602 00, Czech Republic
700    1_
$a Svobodova, Radka $u Central European Institute of Technology, Masaryk University, Brno 625 00, Czech Republic| National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno 602 00, Czech Republic
700    1_
$a Lees, Jon $u Department of Biological and Medical Sciences, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford OX3 0BP, UK
700    1_
$a Orengo, Christine A $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
773    0_
$w MED00003554 $t Nucleic acids research $x 1362-4962 $g Vol. 49, No. D1 (2021), pp. D266-D273
856    41
$u https://pubmed.ncbi.nlm.nih.gov/33237325 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20210420 $b ABA008
991    __
$a 20210507104007 $b ABA008
999    __
$a ok $b bmc $g 1650092 $s 1131997
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2021 $b 49 $c D1 $d D266-D273 $e 20210108 $i 1362-4962 $m Nucleic acids research $n Nucleic Acids Res $x MED00003554
GRA    __
$a 104960/Z/14/Z $p Wellcome Trust $2 United Kingdom
GRA    __
$a 203780/Z/16/A $p Wellcome Trust $2 United Kingdom
LZP    __
$a Pubmed-20210420
