Reusability
The European Chemical Biology Database (ECBD, https://ecbd.eu) serves as the central repository for data generated by the EU-OPENSCREEN research infrastructure consortium. It is developed according to FAIR principles, which emphasize findability, accessibility, interoperability and reusability of data. These data are made available to the scientific community following open access principles. The ECBD stores both positive and negative results from the entire chemical biology project pipeline, including data from primary or counter-screening assays. The assays utilize a defined and diverse library of over 107 000 compounds, the annotations of which are continuously enriched by external user-supported screening projects and by internal EU-OPENSCREEN bioprofiling efforts. These compounds were screened in 89 currently deposited datasets (assays), of which 48 are already publicly accessible, while the remainder will be published after a publication embargo period of up to 3 years. Together these datasets encompass ∼4.3 million experimental data points. All public data within ECBD can be accessed through its user interface, API or by database dump under the CC-BY 4.0 license.
BACKGROUND: Antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis is a heterogeneous autoimmune disease. While traditionally stratified into two conditions, granulomatosis with polyangiitis (GPA) and microscopic polyangiitis (MPA), the subclassification of ANCA-associated vasculitis is subject to continued debate. Here we aim to identify phenotypically distinct subgroups and develop a data-driven subclassification of ANCA-associated vasculitis, using a large real-world dataset. METHODS: In the collaborative data reuse project FAIRVASC (Findable, Accessible, Interoperable, Reusable, Vasculitis), registry records of patients with ANCA-associated vasculitis were retrieved from six European vasculitis registries: the Czech Registry of ANCA-associated vasculitis (Czech Republic), the French Vasculitis Study Group Registry (FVSG; France), the Joint Vasculitis Registry in German-speaking Countries (GeVas; Germany), the Polish Vasculitis Registry (POLVAS; Poland), the Irish Rare Kidney Disease Registry (RKD; Ireland), and the Skåne Vasculitis Cohort (Sweden). We performed model-based clustering of 17 mixed-type clinical variables using a parsimonious mixture of two latent Gaussian variable models. Clinical validation of the optimal cluster solution was made through summary statistics of the clusters' demography, phenotypic and serological characteristics, and outcome. The predictive value of models featuring the cluster affiliations was compared with classifications based on clinical diagnosis and ANCA specificity. People with lived experience were involved throughout the FAIRVASC project. FINDINGS: A total of 3868 patients diagnosed with ANCA-associated vasculitis between Nov 1, 1966, and March 1, 2023, were included in the study across the six registries (Czech Registry n=371, FVSG n=1780, GeVas n=135, POLVAS n=792, RKD n=439, and Skåne Vasculitis Cohort n=351). There were 2434 (62·9%) patients with GPA and 1434 (37·1%) with MPA. Mean age at diagnosis was 57·2 years (SD 16·4); 2006 (51·9%) of 3867 patients were men and 1861 (48·1%) were women. We identified five clusters, with distinct phenotype, biochemical presentation, and disease outcome. Three clusters were characterised by kidney involvement: one severe kidney cluster (555 [14·3%] of 3868 patients) with high C-reactive protein (CRP) and serum creatinine concentrations, and variable ANCA specificity (SK cluster); one myeloperoxidase (MPO)-ANCA-positive kidney involvement cluster (782 [20·2%]) with limited extrarenal disease (MPO-K cluster); and one proteinase 3 (PR3)-ANCA-positive kidney involvement cluster (683 [17·7%]) with widespread extrarenal disease (PR3-K cluster). Two clusters were characterised by relative absence of kidney involvement: one was a predominantly PR3-ANCA-positive cluster (1202 [31·1%]) with inflammatory multisystem disease (IMS cluster), and one was a cluster (646 [16·7%]) with predominantly ear-nose-throat involvement and low CRP, with mainly younger patients (YR cluster). Compared with models fitted with clinical diagnosis or ANCA status, cluster-assigned models demonstrated improved predictive power with respect to both patient and kidney survival. INTERPRETATION: Our study reinforces the view that ANCA-associated vasculitis is not merely a binary construct. Data-driven subclassification of ANCA-associated vasculitis exhibits higher predictive value than current approaches for key outcomes. FUNDING: European Union's Horizon 2020 research and innovation programme under the European Joint Programme on Rare Diseases.
- MeSH
- Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis * classification diagnosis epidemiology blood immunology MeSH
- Adult MeSH
- Cohort Studies MeSH
- Middle Aged MeSH
- Humans MeSH
- Microscopic Polyangiitis classification epidemiology blood diagnosis immunology MeSH
- Registries * statistics & numerical data MeSH
- Aged MeSH
- Cluster Analysis MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Europe MeSH
The purpose of the current study was to evaluate the functional activity and storage viability (at 4 °C and 35 °C) of an immobilized as well as lyophilized multienzyme (ME), viz., pectinase, cellulase, and amylase (PCA), that was produced by Bacillus subtilis NG105 under solid state fermentation (SSF) at 35 °C for 10 days using mosambi peel as a substrate. After SSF, the culture medium was divided into two aliquots. From the first aliquot, the produced ME was extracted, precipitated, and further immobilized on calcium alginate beads (MEICA). To immobilize the ME on the mosambi peel matrix, the second aliquot was mixed with acetone and subsequently lyophilized (MELMP). The resulting MEICA and MELMP extracted 87.5% and 91.5% juice from mango pulp, respectively. In the reusability study, after 5 cycles, MEICA exhibited 23.8%, 24.4%, and 36.5% PCA activity, respectively. The PCA activity of MEICA and MELMP was examined after 60 days of storage at 4 °C. The results revealed that the PCA activity of MEICA declined from 100% to 66%, 58.2%, and 64.5%, respectively, while that of MELMP dropped from 100% to 84.2%, 82.1%, and 69.7%, respectively. Further, after 60 days of storage, the total protein content (TPC) in free multienzyme (FME), MEICA, and MELMP was reduced by 92.2%, 91.5%, and 36.3%, respectively. In the localization study, the maximum levels of multienzyme activity were found in cell exudates. This study demonstrated that immobilizing the multienzyme by lyophilization on waste substrates such as mosambi peel boosted its stability and shelf life while greatly reducing the cost of products.
BACKGROUND: The advancement of sequencing technologies results in the rapid release of hundreds of new genome assemblies a year providing unprecedented resources for the study of genome evolution. Within this context, the significance of in-depth analyses of repetitive elements, transposable elements (TEs) in particular, is increasingly recognized in understanding genome evolution. Despite the plethora of available bioinformatic tools for identifying and annotating TEs, the phylogenetic distance of the target species from a curated and classified database of repetitive element sequences constrains any automated annotation effort. Moreover, manual curation of raw repeat libraries is deemed essential due to the frequent incompleteness of automatically generated consensus sequences. RESULTS: Here, we present an example of a crowd-sourcing effort aimed at curating and annotating TE libraries of two non-model species built around a collaborative, peer-reviewed teaching process. Manual curation and classification are time-consuming processes that offer limited short-term academic rewards and are typically confined to a few research groups where methods are taught through hands-on experience. Crowd-sourcing efforts could therefore offer a significant opportunity to bridge the gap between learning the methods of curation effectively and empowering the scientific community with high-quality, reusable repeat libraries. CONCLUSIONS: The collaborative manual curation of TEs from two tardigrade species, for which there were no TE libraries available, resulted in the successful characterization of hundreds of new and diverse TEs in a reasonable time frame. Our crowd-sourcing setting can be used as a teaching reference guide for similar projects: A hidden treasure awaits discovery within non-model organisms.
- Publication type
- Journal Article MeSH
OBJECTIVES: This study aims to describe the data structure and harmonisation process, explore data quality and define characteristics, treatment, and outcomes of patients across six federated antineutrophil cytoplasmic antibody-associated vasculitis (AAV) registries. METHODS: Through creation of the vasculitis-specific Findable, Accessible, Interoperable, Reusable, VASCulitis ontology, we harmonised the registries and enabled semantic interoperability. We assessed data quality across the domains of uniqueness, consistency, completeness and correctness. Aggregated data were retrieved using the semantic query language SPARQL Protocol and Resource Description Framework Query Language (SPARQL) and outcome rates were assessed through random effects meta-analysis. RESULTS: A total of 5282 cases of AAV were identified. Uniqueness and data-type consistency were 100% across all assessed variables. Completeness ranged from 49% to 100% and correctness from 60% to 100%. There were 2754 (52.1%) cases classified as granulomatosis with polyangiitis (GPA), 1580 (29.9%) as microscopic polyangiitis and 937 (17.7%) as eosinophilic GPA. The pattern of organ involvement included: lung in 3281 (65.1%), ear-nose-throat in 2860 (56.7%) and kidney in 2534 (50.2%). Intravenous cyclophosphamide was used as remission induction therapy in 982 (50.7%), rituximab in 505 (17.7%) and pulsed intravenous glucocorticoid use was highly variable (11%-91%). Overall mortality and incidence rates of end-stage kidney disease were 28.8 (95% CI 19.7 to 42.2) and 24.8 (95% CI 19.7 to 31.1) per 1000 patient-years, respectively. CONCLUSIONS: In the largest reported AAV cohort study, we federated patient registries using semantic web technologies and highlighted concerns about data quality. The comparison of patient characteristics, treatment and outcomes was hampered by heterogeneous recruitment settings.
- MeSH
- Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis * drug therapy epidemiology complications MeSH
- Granulomatosis with Polyangiitis * drug therapy epidemiology complications MeSH
- Humans MeSH
- Microscopic Polyangiitis * drug therapy epidemiology MeSH
- Antibodies, Antineutrophil Cytoplasmic MeSH
- Registries MeSH
- Data Accuracy MeSH
- Information Storage and Retrieval MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Meta-Analysis MeSH
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. The era of personalized therapies and precision medicine has now arrived, with huge data sets not only used for experimental and research approaches, but also in the "real world". Analysis of real world data requires development of legal, procedural and technical infrastructure. The integration of all clinical data sets for any given patient is important and necessary in order to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines to make data findable, accessible, interoperable and reusable to the research community. Federated learning, standards and ontologies are useful to improve robustness of artificial intelligence algorithms working on big data and to increase trust in these algorithms. When dealing with big data, the univariate statistical approach gives way to multivariate statistical methods, significantly shifting the potential of big data. Combining multiple omics gives previously unsuspected information and provides understanding of scientific questions, an approach which is also called the systems biology approach. Big data and artificial intelligence also offer opportunities for laboratories and the In Vitro Diagnostic industry to optimize the productivity of the laboratory, the quality of laboratory results and ultimately patient outcomes, through tools such as predictive maintenance and "moving average" based on the aggregate of patient results.
- MeSH
- Algorithms MeSH
- Big Data * MeSH
- Precision Medicine methods MeSH
- Humans MeSH
- Delivery of Health Care MeSH
- Artificial Intelligence * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Keywords
- labor law
- MeSH
- Electronic Waste * economics MeSH
- Occupational Health MeSH
- Environmental Pollutants adverse effects MeSH
- Humans MeSH
- Hazardous Waste MeSH
- Equipment Reuse MeSH
- Lead Poisoning prevention & control MeSH
- Occupational Exposure adverse effects MeSH
- Recycling * MeSH
- Risk Management MeSH
- Legislation as Topic MeSH
- Check Tag
- Humans MeSH
- MeSH
- Catheters * economics classification statistics & numerical data MeSH
- Humans MeSH
- Equipment Reuse * economics legislation & jurisprudence MeSH
- Health Expenditures classification legislation & jurisprudence MeSH
- Legislation as Topic MeSH
- Check Tag
- Humans MeSH
- Publication type
- Newspaper Article MeSH
- Geographic names
- Czech Republic MeSH
Current biological and chemical research is increasingly dependent on the reusability of previously acquired data, which typically come from various sources. Consequently, there is a growing need for database systems and databases stored in them to be interoperable with each other. One of the possible solutions to address this issue is to use systems based on Semantic Web technologies, namely on the Resource Description Framework (RDF) to express data and on the SPARQL query language to retrieve the data. Many existing biological and chemical databases are stored in the form of a relational database (RDB). Converting a relational database into the RDF form and storing it in a native RDF database system may not be desirable in many cases. It may be necessary to preserve the original database form, and having two versions of the same data may not be convenient. A solution may be to use a system mapping the relational database to the RDF form. Such a system keeps data in their original relational form and translates incoming SPARQL queries to equivalent SQL queries, which are evaluated by a relational-database system. This review compares different RDB-to-RDF mapping systems with a primary focus on those that can be used free of charge. In addition, it compares different approaches to expressing RDB-to-RDF mappings. The review shows that these systems represent a viable method providing sufficient performance. Their real-life performance is demonstrated on data and queries coming from the neXtProt project.
- Publication type
- Journal Article MeSH
- Review MeSH
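The query translation described in the review abstract above (data stay in relational form, and an incoming SPARQL pattern is rewritten as SQL) can be illustrated with a minimal sketch. It handles only a single triple pattern of the form `?s <predicate> ?o`, optionally with a constant object. The `protein` table, its columns, the `http://example.org/...` predicate IRIs and the `MAPPING` dictionary are all hypothetical and chosen for illustration; they are not taken from neXtProt or from any of the mapping systems the review compares, which use far richer mapping languages and query planners.

```python
import sqlite3

# Hypothetical mapping: predicate IRI -> (table, subject column, object column).
# Real RDB-to-RDF systems express this in a dedicated mapping language.
MAPPING = {
    "http://example.org/name": ("protein", "id", "name"),
    "http://example.org/gene": ("protein", "id", "gene"),
}


def triple_pattern_to_sql(predicate, obj=None):
    """Translate `?s <predicate> ?o` (or `?s <predicate> "obj"`) into SQL.

    Returns a (sql, params) pair. Table and column names come only from the
    trusted MAPPING dictionary; the constant object is passed as a parameter.
    """
    table, s_col, o_col = MAPPING[predicate]
    if obj is None:
        # A NULL cell yields no triple, so it must not appear in the result.
        sql = f"SELECT {s_col}, {o_col} FROM {table} WHERE {o_col} IS NOT NULL"
        return sql, ()
    sql = f"SELECT {s_col}, {o_col} FROM {table} WHERE {o_col} = ?"
    return sql, (obj,)


def run_pattern(conn, predicate, obj=None):
    """Evaluate one triple pattern directly against the relational data."""
    sql, params = triple_pattern_to_sql(predicate, obj)
    return conn.execute(sql, params).fetchall()


def demo():
    """Build a toy in-memory database and evaluate one pattern against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE protein (id TEXT PRIMARY KEY, name TEXT, gene TEXT)")
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [("P1", "Insulin", "INS"), ("P2", "Albumin", "ALB"), ("P3", "Unknown", None)],
    )
    # Corresponds to the pattern: ?s <http://example.org/gene> "INS"
    return run_pattern(conn, "http://example.org/gene", "INS")


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that no RDF copy of the data is materialized: each pattern is answered by the relational engine, which is the property the abstract highlights.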