Toward a common standard for data and specimen provenance in life sciences

2024 Jan; 8(1): e10365. [Epub 2023-04-18]

Status PubMed-not-MEDLINE Language English Country United States Medium electronic-ecollection

Document type journal articles

Persistent link   https://www.medvik.cz/link/pmid38249839

Grant support
R01 GM122424 NIGMS NIH HHS - United States
OT2 OD026671 NIH HHS - United States
U24 EB028887 NIBIB NIH HHS - United States
Wellcome Trust - United Kingdom
RD840027 EPA - United States
U01 CA200059 NCI NIH HHS - United States

Open and practical exchange, dissemination, and reuse of specimens and data have become fundamental requirements for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived from them, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is therefore essential for assessing the validity of research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Because there is no uniform framework, this information is usually provided only within an organization and not in an interoperable form. At the same time, the collection and sharing of biological and environmental specimens increasingly require the definition and documentation of benefit sharing and compliance with regulatory requirements, rather than consideration of purely scientific needs alone. In this publication, we present an ongoing standardization effort to provide trustworthy, machine-actionable documentation of the lineage of data and specimens. We invite experts from the biotechnology and biomedical fields to contribute further to the standard.

BBMRI ERIC Graz Austria

Biocomplexity Institute Indiana University Bloomington Indiana USA

Biological Resource Center of Institut Pasteur Paris France

Biosystems and Biomaterials Division NIST Gaithersburg Maryland USA

Canadian Primary Care Sentinel Surveillance Network Department of Family Medicine Queen's University Kingston Ontario Canada

Centre for Gene Regulation and Expression and Division of Computational Biology School of Life Sciences University of Dundee Dundee UK

CRS4 Center for Advanced Studies Research and Development in Sardinia Pula Italy

Department of Biomedicine University of Basel Basel Switzerland

Department of Computer Science University of Manchester Manchester UK

Department of Social Statistics School of Social Sciences University of Manchester Manchester UK

EMBL's European Bioinformatics Institute Cambridge UK

Flanders Marine Institute EMBRC Belgium Ostend Belgium

German BioImaging Gesellschaft für Mikroskopie und Bildanalyse e.V. Konstanz Germany

Heidelberg Institute for Theoretical Studies Heidelberg Germany

Independent consultant

Informatics Institute University of Amsterdam Amsterdam The Netherlands

INSERM Institut National de la Santé et de la Recherche Médicale Paris France

Institute of Computer Science and Faculty of Informatics Masaryk University Brno Czechia

Interdisciplinary Bank of Biomaterials and Data Würzburg Würzburg Germany

ITTM S.A. Esch-sur-Alzette Luxembourg

Japan bio Measurement and Analysis Consortium Tokyo Japan

King's College London London UK

Medical University Graz Graz Austria

National Institute of Standards and Technology Gaithersburg Maryland USA

Ontario Institute for Cancer Research Toronto Ontario Canada

Plentzia Marine Station University of the Basque Country EMBRC Spain Bilbao Spain

Program in Molecular Medicine University of Massachusetts Chan Medical School Worcester Massachusetts USA

SCI STI MM École Polytechnique Fédérale de Lausanne Lausanne Switzerland

Standards Council of Canada Ottawa Ontario Canada

University of Klagenfurt Klagenfurt Austria

University of Southampton Southampton UK

US Department of Agriculture Washington District of Columbia USA


Latest 20 citations

Recording provenance of workflow runs with RO-Crate

2024; 19(9): e0309210. [Epub 2024-09-10]
