FAIR principles
The European Chemical Biology Database (ECBD, https://ecbd.eu) serves as the central repository for data generated by the EU-OPENSCREEN research infrastructure consortium. It is developed according to the FAIR principles, which emphasize findability, accessibility, interoperability and reusability of data, and its data are made available to the scientific community following open access principles. The ECBD stores both positive and negative results from the entire chemical biology project pipeline, including data from primary and counter-screening assays. The assays utilize a defined and diverse library of over 107 000 compounds, whose annotations are continuously enriched by external user-supported screening projects and by internal EU-OPENSCREEN bioprofiling efforts. These compounds were screened in 89 currently deposited datasets (assays), 48 of which are already publicly accessible; the remainder will be published after an embargo period of up to 3 years. Together these datasets encompass ∼4.3 million experimental data points. All public data within ECBD can be accessed through its user interface, API or database dump under the CC-BY 4.0 license.
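The abstract notes that public ECBD data are reachable via an API as well as the web interface. As a minimal sketch of what programmatic access might look like, the snippet below queries a hypothetical REST endpoint; the base path, query parameters, and response field names are assumptions, not the documented ECBD API.

```python
# Minimal sketch of programmatic access to a public dataset repository such as
# the ECBD (https://ecbd.eu). The endpoint path, query parameters, and response
# fields below are hypothetical -- consult the ECBD API documentation for the
# real interface.
import requests

BASE_URL = "https://ecbd.eu/api"  # assumed base path, not confirmed

def fetch_public_datasets(page: int = 1, page_size: int = 50) -> list[dict]:
    """Fetch one page of publicly accessible datasets (hypothetical endpoint)."""
    resp = requests.get(
        f"{BASE_URL}/datasets",
        params={"access": "public", "page": page, "size": page_size},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

if __name__ == "__main__":
    for dataset in fetch_public_datasets():
        # Field names are assumptions; real payloads may differ.
        print(dataset.get("id"), dataset.get("title"))
```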
Research data management (RDM) is central to the implementation of the FAIR (Findable, Accessible, Interoperable, Reusable) and Open Science principles. Recognising the importance of RDM, ELIXIR Platforms and Nodes have invested in RDM and launched various projects and initiatives to ensure good data management practices for scientific excellence. These projects have resulted in a rich set of tools and resources highly valuable for FAIR data management. However, these resources remain scattered across projects and ELIXIR structures, making their dissemination and application challenging. It has therefore become imperative to coordinate these efforts towards sustainable and harmonised RDM practices, with dedicated forums for RDM professionals to exchange knowledge and share resources. The proposed ELIXIR RDM Community will bring together RDM experts to develop ELIXIR's vision and coordinate its activities, taking advantage of the available assets. It aims to coordinate RDM best practices and illustrate how to use the existing ELIXIR RDM services. The Community will be built around three integral pillars: a network of RDM professionals, RDM knowledge management, and RDM training expertise and resources. It will also engage with external stakeholders to leverage benefits and provide a forum for RDM professionals for regular knowledge exchange, capacity building and the development of harmonised RDM practices, in keeping with the overall scope of the RDM Community. In the short term, the Community aims to build upon the existing resources and ensure that their content remains up to date and fit for purpose. In the long run, it will aim to strengthen the skills and knowledge of its RDM professionals to support the emerging needs of the scientific community. The Community will also devise an effective strategy for engaging with other ELIXIR structures and international stakeholders to influence and align with developments and solutions in the RDM field.
- Keywords
- Common best practices, Data management, Data management plans, Data management training, Data stewardship, FAIR principles, Research data life cycle, community standards
- MeSH
- data management * methods MeSH
- humans MeSH
- research MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
BACKGROUND: The EURO-NMD Registry collects data from all neuromuscular patients seen at EURO-NMD's expert centres. In-kind contributions from three patient organisations have ensured that the registry is patient-centred, meaningful, and impactful. The consenting process covers other uses, such as research, cohort finding and trial readiness. RESULTS: The registry has a three-layered dataset structure, comprising European Commission-mandated data elements (EU-CDEs), a set of cross-neuromuscular data elements (NMD-CDEs) and disease-specific data elements (DS-DEs) that function modularly. The registry captures clinical, neuromuscular imaging, neuromuscular histopathology, biological and genetic data and patient-reported outcomes in a computer-interpretable format using selected ontologies and classifications. The EURO-NMD registry is connected to the EURO-NMD Registry Hub through an interoperability layer. The Hub provides an entry point to other neuromuscular registries that follow the FAIR data stewardship principles and enable GDPR-compliant information exchange. Four national or disease-specific patient registries are interoperable with the EURO-NMD Registry, allowing for federated analysis across these different resources. CONCLUSIONS: Collectively, the Registry Hub brings together data that are currently siloed and fragmented to improve healthcare and advance research for neuromuscular diseases.
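The abstract states that clinical and genetic data are captured in a computer-interpretable format using selected ontologies. As a rough illustration of what such a record could look like, the sketch below encodes one observation with ontology identifiers (ORDO for the disease, HPO for a phenotype); the class and field names are invented and do not reflect the actual EURO-NMD schema.

```python
# Illustrative sketch of a computer-interpretable registry record annotated
# with ontology codes (HPO for phenotype, ORDO for disease). The structure and
# field names are hypothetical and do not reflect the actual EURO-NMD schema.
from dataclasses import dataclass, field

@dataclass
class CodedConcept:
    system: str   # ontology/terminology URI
    code: str     # identifier within that ontology
    label: str    # human-readable label

@dataclass
class RegistryObservation:
    patient_pseudonym: str            # pseudonymised ID (GDPR-compliant exchange)
    disease: CodedConcept
    phenotypes: list[CodedConcept] = field(default_factory=list)

record = RegistryObservation(
    patient_pseudonym="NMD-000123",   # hypothetical pseudonym
    disease=CodedConcept(
        system="http://www.orpha.net/ORDO/",
        code="Orphanet_98896",        # Duchenne muscular dystrophy (ORDO)
        label="Duchenne muscular dystrophy",
    ),
    phenotypes=[
        CodedConcept(
            system="http://purl.obolibrary.org/obo/",
            code="HP:0003560",        # Muscular dystrophy (HPO)
            label="Muscular dystrophy",
        ),
    ],
)
```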
- Keywords
- FAIR data, Neuromuscular Diseases, Rare Diseases, Registry, Registry Hub
- MeSH
- humans MeSH
- neuromuscular diseases * genetics MeSH
- registries MeSH
- rare diseases MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
- reviews MeSH
AI development in biotechnology relies on high-quality data to train and validate algorithms. The FAIR principles (Findable, Accessible, Interoperable, and Reusable) and regulatory frameworks such as the In Vitro Diagnostic Regulation (IVDR) and the Medical Device Regulation (MDR) specify requirements on specimen and data provenance to ensure the quality and traceability of data used in AI development. In this paper, a framework is presented for recording and publishing provenance information to meet these requirements. The framework is based on standardized models and protocols, such as the W3C PROV model and the ISO 23494 series, to capture and record provenance information at the various stages of the data generation and analysis process. The principles of the framework are illustrated in a simple computational pathology use case, showing how specimen and data provenance can be used in the development and documentation of an AI algorithm. Together, the framework and use case illustrate the role of provenance information in supporting the development of high-quality AI algorithms in biotechnology. The use case also demonstrates the importance of managing and integrating distributed provenance information and highlights the complexity of factors such as semantic interoperability, confidentiality, and the verification of authenticity and integrity.
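To make the W3C PROV model concrete, here is a minimal sketch using the Python `prov` package, recording how a digital data object (a whole-slide image) derives from a physical specimen through a scanning activity. The specimen, activity, and agent names are invented for illustration, and ISO 23494 profiles would layer further domain-specific structure on top of such a document.

```python
# Sketch of recording specimen/data provenance with the W3C PROV model, using
# the Python "prov" package (pip install prov). The specimen, activity, and
# agent names are invented for illustration.
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/pathology/")

# Physical specimen and the digital data derived from it
specimen = doc.entity("ex:tissue-block-42", {"prov:label": "FFPE tissue block"})
wsi = doc.entity("ex:wsi-42", {"prov:label": "whole-slide image"})

# The digitization step linking the two, with the responsible lab as agent
scanning = doc.activity("ex:slide-scanning-2024-01-15")
lab = doc.agent("ex:pathology-lab-A")

doc.used(scanning, specimen)
doc.wasGeneratedBy(wsi, scanning)
doc.wasAssociatedWith(scanning, lab)
doc.wasDerivedFrom(wsi, specimen)

print(doc.get_provn())  # human-readable PROV-N serialization
```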
- Keywords
- Artificial intelligence, Biological material, Provenance, Traceability
- MeSH
- algorithms * MeSH
- biotechnology * MeSH
- artificial intelligence MeSH
- Publication type
- journal articles MeSH
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. The era of personalized therapies and precision medicine has now arrived, with huge data sets used not only for experimental and research approaches but also in the "real world". Analysis of real-world data requires the development of legal, procedural and technical infrastructure. The integration of all clinical data sets for any given patient is important and necessary in order to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines to make data findable, accessible, interoperable and reusable for the research community. Federated learning, standards and ontologies are useful for improving the robustness of artificial intelligence algorithms working on big data and for increasing trust in these algorithms. When dealing with big data, univariate statistical approaches give way to multivariate statistical methods, significantly expanding the analytical potential of big data. Combining multiple omics yields previously unsuspected information and deepens the understanding of scientific questions, an approach also called systems biology. Big data and artificial intelligence also offer opportunities for laboratories and the In Vitro Diagnostic industry to optimize laboratory productivity, the quality of laboratory results and, ultimately, patient outcomes, through tools such as predictive maintenance and "moving averages" based on the aggregate of patient results.
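The "moving average of patient results" mentioned at the end of the abstract is a patient-based quality-control technique: a drifting average of routine results can reveal analyzer problems that individual results hide. Below is a minimal sketch of the idea, assuming a simple fixed-window average with arbitrary control limits; real implementations (e.g., Bull's algorithm or exponentially weighted averages) are more sophisticated.

```python
# Minimal sketch of a "moving average of patient results" quality-control
# check. The window size and control limits are arbitrary assumptions chosen
# for illustration (serum sodium in mmol/L).
from collections import deque

def moving_average_monitor(results, window=20, lower=138.0, upper=142.0):
    """Yield (index, moving_average, in_control) for a stream of results."""
    buf = deque(maxlen=window)
    for i, value in enumerate(results):
        buf.append(value)
        if len(buf) == window:             # only judge once the window is full
            avg = sum(buf) / window
            yield i, avg, lower <= avg <= upper

# Usage: a calibration drift shifts the average even though each individual
# result still looks plausible on its own.
stream = [140.0] * 30 + [143.5] * 30
for i, avg, ok in moving_average_monitor(stream):
    if not ok:
        print(f"result #{i}: moving average {avg:.2f} out of control limits")
        break
```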
- Keywords
- artificial intelligence, big data, data science, patient outcomes, personalized healthcare, precision medicine
- MeSH
- algorithms MeSH
- big data * MeSH
- precision medicine methods MeSH
- humans MeSH
- delivery of health care MeSH
- artificial intelligence * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
In this paper, possibilities for network traffic protection in future hybrid passive optical networks are presented, and the reasons for realizing and utilizing advanced network traffic protection schemes for various network traffic classes in these networks are analyzed. Next, the principles of the Prediction-based Fair Wavelength and Bandwidth Allocation (PFWBA) algorithm are introduced in detail, focusing on the Prediction-based Fair Excessive Bandwidth Reallocation (PFEBR) algorithm with the Early Dynamic Bandwidth Allocation (E-DBA) mechanism and the subsequent Dynamic Wavelength Allocation (DWA) scheme. To analyze various wavelength allocation possibilities in Hybrid Passive Optical Networks (HPON), a simulation program implementing the PFWBA algorithm was developed. Finally, different wavelength allocation methods are compared for specific network traffic classes in future HPON networks with the considered protection schemes. Three methods are presented from the viewpoint of HPON network traffic protection, including a new approach to wavelength allocation based on network traffic protection assumptions.
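The PFEBR algorithm itself is not specified in this abstract. The sketch below shows only the generic idea behind excessive bandwidth reallocation in a PON dynamic bandwidth allocation cycle: underloaded ONUs release the unused part of their guaranteed grant, and the pooled surplus is redistributed to overloaded ONUs in proportion to their weights. The function name, parameters, and weighting rule are assumptions, not the published algorithm.

```python
# Generic sketch of excessive bandwidth reallocation in a PON DBA cycle:
# underloaded ONUs give up the part of their guaranteed grant they do not
# need, and the pooled surplus is shared among overloaded ONUs in proportion
# to their weights. This is the textbook idea behind schemes like PFEBR, not
# the published algorithm itself; all values are illustrative.

def allocate_grants(requests, guaranteed, weights):
    """requests, guaranteed, weights: per-ONU lists (bytes per cycle)."""
    grants = []
    surplus = 0.0
    overloaded = []  # indices of ONUs requesting more than their guarantee

    for i, (req, gtd) in enumerate(zip(requests, guaranteed)):
        if req <= gtd:
            grants.append(req)          # underloaded: grant what was asked
            surplus += gtd - req        # pool the unused remainder
        else:
            grants.append(gtd)          # overloaded: start from the guarantee
            overloaded.append(i)

    total_w = sum(weights[i] for i in overloaded)
    for i in overloaded:                # redistribute surplus by weight,
        extra = surplus * weights[i] / total_w if total_w else 0.0
        grants[i] = min(requests[i], grants[i] + extra)  # never exceed request
    return grants

print(allocate_grants(requests=[500, 1500, 2000],
                      guaranteed=[1000, 1000, 1000],
                      weights=[1, 1, 2]))
```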
A poor lifestyle potentially leads to chronic diseases and low physical and mental fitness. However, we can measure and analyze multiple aspects of physical and mental health ahead of time, such as body parameters, health risk factors, degree of motivation, and overall willingness to change the current lifestyle. In conjunction with data representing human brain activity, human health problems resulting from a long-term lifestyle can be identified more precisely and, where appropriate, the quality and length of human life can be improved. Currently, brain and physical health-related data are not commonly collected and evaluated together. Doing so, however, is a promising and viable concept, especially when followed by a more detailed definition and description of the whole processing lifecycle. Moreover, when best practices are used to store, annotate, analyze, and evaluate such data collections, the necessary infrastructure development and closer cooperation among scientific teams and laboratories are facilitated. This approach also improves the reproducibility of experimental work. As a result, large collections of physical and brain health-related data could provide a robust basis for better interpretation of a person's overall health. This work aims to overview and reflect on the best practices used within global communities to ensure the reproducibility of experiments, collected datasets and related workflows. These best practices concern, e.g., data lifecycle models, FAIR principles, and the definition and implementation of terminologies and ontologies. An example is then shown of how an automated workflow system could be created to support the collection, annotation, storage, analysis, and publication of findings. The Body in Numbers pilot system, also utilizing software engineering best practices, was developed to implement this concept; it is unique in combining the processing and evaluation of physical and brain (electrophysiological) data. Its implementation is explored in greater detail, and opportunities to use the findings and results throughout various application domains are discussed.
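As a toy illustration of the automated workflow idea the abstract describes (collect, annotate, store, analyze, publish as composable stages), consider the sketch below. The stage names and record fields are invented; it does not reproduce the actual design of the Body in Numbers system.

```python
# Toy sketch of an automated experiment-data workflow of the kind the abstract
# describes (collect -> annotate -> store -> analyze). The stage names and
# record fields are invented for illustration only.
from typing import Callable

Record = dict  # a single measurement session with metadata

def annotate(record: Record) -> Record:
    # Attach controlled-vocabulary terms so the dataset stays findable/reusable.
    record["annotations"] = {"modality": ["EEG", "body-parameters"]}
    return record

def store(record: Record) -> Record:
    record["stored"] = True        # stand-in for a database/repository write
    return record

def analyze(record: Record) -> Record:
    hr = record["heart_rate"]
    record["resting_hr_mean"] = sum(hr) / len(hr)
    return record

def run_pipeline(record: Record, stages: list[Callable[[Record], Record]]) -> Record:
    for stage in stages:           # each stage is small, testable, replaceable
        record = stage(record)
    return record

session = {"subject": "S01", "heart_rate": [61, 63, 59, 60]}
print(run_pipeline(session, [annotate, store, analyze]))
```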
- Klíčová slova
- best practices, brain data, data lifecycle, health information system, health-related data, ontology, physical data, workflow,
- Publikační typ
- časopisecké články MeSH
- přehledy MeSH
Two popular thermodynamic models for pharmaceutical applications, namely the perturbed-chain statistical associating fluid theory (PC-SAFT) equation of state and the conductor-like screening model for real solvents (COSMO-RS), are thoroughly benchmarked for their performance in predicting the solubility of active pharmaceutical ingredients (APIs) in pure solvents. The ultimate goal is to illustrate what to expect from these progressive frameworks when applied to the thermodynamic solubility of APIs based on activity coefficients in a purely predictive regime without specific experimental solubility data (the fusion properties of pure APIs were taken from experiments). While this kind of prediction represents the typical modus operandi of the first-principles-aided COSMO-RS, PC-SAFT is a relatively highly parametrized model that relies on experimental data, against which its pure-substance and binary interaction parameters (kij) are fitted. Therefore, to make the benchmark as fair as possible, we omitted any binary parameters of PC-SAFT (i.e., kij = 0 in all cases) and preferred pure-substance parameter sets for APIs not trained to experimental solubility data. This computational approach, together with a detailed assessment of the obtained solubility predictions against a large experimental data set, revealed that COSMO-RS convincingly outperformed PC-SAFT both qualitatively (i.e., COSMO-RS was better at solvent ranking) and quantitatively, even though the former is independent of both substance- and mixture-specific experimental data. In quantitative terms, COSMO-RS outperformed PC-SAFT for 9 of the 10 APIs and for 63% of the API-solvent systems, with root-mean-square deviations of the predictions from the entire experimental data set of 0.82 and 1.44 log units, respectively. The results were further analyzed to expand the picture of the performance of both models with respect to the individual APIs and solvents. Interestingly, in many cases, both models qualitatively incorrectly predicted the direction of deviations from ideality. Furthermore, we examined how sensitive the solubility predictions of both models are to different API parametrizations.
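Both models enter this kind of solubility calculation only through the activity coefficient. Under the common simplification that neglects heat-capacity-of-fusion terms, the solubility relation is ln(x·γ) = −(ΔH_fus/R)(1/T − 1/T_fus), which can be solved by fixed-point iteration once a γ(x, T) model is supplied. The sketch below uses a trivial one-parameter Margules expression as a stand-in for PC-SAFT or COSMO-RS; the API property values are illustrative, not data for any specific API from the study.

```python
# Solving for API mole-fraction solubility x from fusion properties via
# ln(x * gamma) = -(dH_fus / R) * (1/T - 1/T_fus), neglecting heat-capacity
# terms. A one-parameter Margules expression stands in for a real activity-
# coefficient model (PC-SAFT, COSMO-RS); dH_fus, T_fus, and A are illustrative.
import math

R = 8.314  # J/(mol*K)

def ideal_solubility_rhs(T, dH_fus, T_fus):
    return math.exp(-(dH_fus / R) * (1.0 / T - 1.0 / T_fus))

def gamma_margules(x, A=1.2):
    """ln(gamma1) = A * (1 - x)^2 -- a minimal stand-in activity model."""
    return math.exp(A * (1.0 - x) ** 2)

def solve_solubility(T, dH_fus, T_fus, tol=1e-10, max_iter=200):
    rhs = ideal_solubility_rhs(T, dH_fus, T_fus)
    x = min(rhs, 1.0)                 # ideal solubility as starting guess
    for _ in range(max_iter):
        x_new = min(rhs / gamma_margules(x), 1.0)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

# Illustrative API: dH_fus = 25 kJ/mol, T_fus = 420 K, solubility at 298.15 K
print(f"x = {solve_solubility(298.15, 25_000.0, 420.0):.4e}")
```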
- Keywords
- COSMO-RS, PC-SAFT, benchmark, pharmaceuticals, prediction, solubility
- MeSH
- pharmaceutical preparations MeSH
- solvents MeSH
- solubility * MeSH
- thermodynamics MeSH
- Publication type
- journal articles MeSH
- grant-supported research MeSH
- Names of substances
- pharmaceutical preparations MeSH
- solvents MeSH
Electronic Health Record (EHR) systems currently in use are not designed for widely interoperable longitudinal health data. As a result, EHR data cannot be properly shared, managed and analyzed. In this article, we propose two approaches to making EHR data more comprehensive and FAIR (Findable, Accessible, Interoperable, and Reusable) and thus more useful for diagnosis and clinical research. First, data modeling based on the LinkML framework makes data interoperability more realistic in diverse environments with various experts involved. We show first results of how diverse health data can be integrated on the basis of an easy-to-understand data model without loss of available clinical knowledge. Second, decentralizing EHRs contributes to the higher availability of comprehensive and consistent EHR data. We propose a technology stack for decentralized EHRs and explain the reasons behind this proposal. Moreover, the two proposed approaches empower patients, because their EHR data can become more available, understandable, and usable to them, and they can share their data according to their needs and preferences. Finally, we explore how the users of the proposed solution could be involved in the process of its validation and adoption.
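LinkML schemas are authored in YAML and can be compiled into, among other targets, Python dataclasses. The sketch below hand-writes the kind of small, easy-to-understand model such a schema might describe for a single observation; the class and field names are invented for illustration and are not taken from the article, though the terminology codes (LOINC, UCUM) are real standards of the kind HL7 FHIR uses.

```python
# Hand-written Python equivalent of the kind of small, easy-to-understand
# data model a LinkML schema (authored in YAML) might define for EHR data.
# LinkML can generate such dataclasses from a schema; the class and field
# names here are invented for illustration, not taken from the article.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Patient:
    id: str                          # stable, resolvable identifier (Findable)
    birth_date: Optional[date] = None

@dataclass
class Observation:
    id: str
    subject: str                     # reference to Patient.id (Interoperable)
    code: str                        # terminology code, e.g. a LOINC code
    value: float
    unit: str                        # UCUM unit string
    effective_date: Optional[date] = None

obs = Observation(
    id="obs-001",
    subject="patient-123",
    code="8480-6",                   # LOINC: systolic blood pressure
    value=128.0,
    unit="mm[Hg]",                   # UCUM notation for mmHg
    effective_date=date(2024, 5, 1),
)
```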
- Keywords
- Distributed electronic health records, FAIR principles, HL7 FHIR, bio-data management, ontology
- MeSH
- data management MeSH
- electronic health records * MeSH
- humans MeSH
- semantic web * MeSH
- software MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
Improving reproducibility and replicability in preclinical research is a widely discussed and pertinent topic, especially regarding ethical responsibility in animal research. INFRAFRONTIER, the European Research Infrastructure for the generation, phenotyping, archiving, and distribution of model mammalian genomes, is addressing this issue by developing internal quality principles for its different service areas, which provide a quality framework for its operational activities. This article introduces the INFRAFRONTIER Quality Principles in Systemic Phenotyping of genetically altered mouse models. A total of 11 key principles are included, ranging from general requirements for compliance with guidelines on animal testing, through the need for well-trained personnel, to more specific standards such as the exchange of reference lines. Recently established requirements, such as the provision of FAIR (Findable, Accessible, Interoperable, Reusable) data, are also addressed. For each quality principle, we outline the specific context, requirements, further recommendations, and key references.