CATH: increased structural coverage of functional space
I. Sillitoe, N. Bordin, N. Dawson, VP. Waman, P. Ashford, HM. Scholes, CSM. Pang, L. Woodridge, C. Rauer, N. Sen, M. Abbasian, S. Le Cornu, SD. Lam, K. Berka, IH. Varekova, R. Svobodova, J. Lees, CA. Orengo
Language: English. Country: Great Britain.
Document type: Journal Article; Research Support, Non-U.S. Gov't
Grant support
104960/Z/14/Z (Wellcome Trust, United Kingdom)
203780/Z/16/A (Wellcome Trust, United Kingdom)
PubMed: 33237325
DOI: 10.1093/nar/gkaa1079
- MeSH
- Molecular Sequence Annotation
- COVID-19: epidemiology, prevention & control, virology
- Databases, Protein: statistics & numerical data
- Epidemics
- Internet
- Humans
- Protein Domains *
- Proteins: chemistry, genetics, metabolism
- SARS-CoV-2: genetics, metabolism, physiology
- Amino Acid Sequence
- Sequence Analysis, Protein: methods
- Sequence Homology, Amino Acid
- Viral Proteins: chemistry, genetics, metabolism
- Computational Biology: methods, statistics & numerical data
- Check Tag
- Humans
- Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, adding 65,351 fully classified domain structures (+15%) to give 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.
- 000
- 00000naa a2200000 a 4500
- 001
- bmc21011618
- 003
- CZ-PrNML
- 005
- 20210507104008.0
- 007
- ta
- 008
- 210420s2021 xxk f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1093/nar/gkaa1079 $2 doi
- 035 __
- $a (PubMed)33237325
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a xxk
- 100 1_
- $a Sillitoe, Ian $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 245 10
- $a CATH: increased structural coverage of functional space / $c I. Sillitoe, N. Bordin, N. Dawson, VP. Waman, P. Ashford, HM. Scholes, CSM. Pang, L. Woodridge, C. Rauer, N. Sen, M. Abbasian, S. Le Cornu, SD. Lam, K. Berka, IH. Varekova, R. Svobodova, J. Lees, CA. Orengo
- 650 _2
- $a Amino Acid Sequence $7 D000595
- 650 _2
- $a COVID-19 $x epidemiology $x prevention & control $x virology $7 D000086382
- 650 _2
- $a Computational Biology $x methods $x statistics & numerical data $7 D019295
- 650 _2
- $a Databases, Protein $x statistics & numerical data $7 D030562
- 650 _2
- $a Epidemics $7 D058872
- 650 _2
- $a Humans $7 D006801
- 650 _2
- $a Internet $7 D020407
- 650 _2
- $a Molecular Sequence Annotation $7 D058977
- 650 12
- $a Protein Domains $7 D000072417
- 650 _2
- $a Proteins $x chemistry $x genetics $x metabolism $7 D011506
- 650 _2
- $a SARS-CoV-2 $x genetics $x metabolism $x physiology $7 D000086402
- 650 _2
- $a Sequence Analysis, Protein $x methods $7 D020539
- 650 _2
- $a Sequence Homology, Amino Acid $7 D017386
- 650 _2
- $a Viral Proteins $x chemistry $x genetics $x metabolism $7 D014764
- 655 _2
- $a Journal Article $7 D016428
- 655 _2
- $a Research Support, Non-U.S. Gov't $7 D013485
- 700 1_
- $a Bordin, Nicola $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Dawson, Natalie $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Waman, Vaishali P $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Ashford, Paul $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Scholes, Harry M $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Pang, Camilla S M $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Woodridge, Laurel $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Rauer, Clemens $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Sen, Neeladri $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Abbasian, Mahnaz $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Le Cornu, Sean $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 700 1_
- $a Lam, Su Datt $u Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor 43600, Malaysia
- 700 1_
- $a Berka, Karel $u Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, Olomouc 771 46, Czech Republic
- 700 1_
- $a Varekova, Ivana Hutařová $u National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno 602 00, Czech Republic
- 700 1_
- $a Svobodova, Radka $u Central European Institute of Technology, Masaryk University, Brno 625 00, Czech Republic | National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno 602 00, Czech Republic
- 700 1_
- $a Lees, Jon $u Department of Biological and Medical Sciences, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford OX3 0BP, UK
- 700 1_
- $a Orengo, Christine A $u Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- 773 0_
- $w MED00003554 $t Nucleic acids research $x 1362-4962 $g Vol. 49, No. D1 (2021), pp. D266-D273
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/33237325 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y p $z 0
- 990 __
- $a 20210420 $b ABA008
- 991 __
- $a 20210507104007 $b ABA008
- 999 __
- $a ok $b bmc $g 1650092 $s 1131997
- BAS __
- $a 3
- BAS __
- $a PreBMC
- BMC __
- $a 2021 $b 49 $c D1 $d D266-D273 $e 20210108 $i 1362-4962 $m Nucleic acids research $n Nucleic Acids Res $x MED00003554
- GRA __
- $a 104960/Z/14/Z $p Wellcome Trust $2 United Kingdom
- GRA __
- $a 203780/Z/16/A $p Wellcome Trust $2 United Kingdom
- LZP __
- $a Pubmed-20210420