Toward a common standard for data and specimen provenance in life sciences
Status PubMed-not-MEDLINE Language English Country United States Medium electronic-ecollection
Document type journal articles
Grant support
R01 GM122424
NIGMS NIH HHS - United States
OT2 OD026671
NIH HHS - United States
U24 EB028887
NIBIB NIH HHS - United States
Wellcome Trust - United Kingdom
RD840027
EPA - United States
CEP - Central Register of Projects
U01 CA200059
NCI NIH HHS - United States
PubMed
38249839
PubMed Central
PMC10797572
DOI
10.1002/lrh2.10365
PII: LRH210365
- Keywords
- International Organization for Standardization, biotechnology, provenance information, standardization
- Publication type
- journal articles MeSH
Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived from them, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential for assessing the validity of research results. With the increasing importance of exchanging, reusing, and sharing data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually provided only within an organization and not in an interoperable form. At the same time, the collection and sharing of biological and environmental specimens increasingly require the definition and documentation of benefit sharing and compliance with regulatory requirements, rather than consideration of purely scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy, machine-actionable documentation of the lineage of data and specimens. We invite experts from the biotechnology and biomedical fields to contribute further to the standard.
Biocomplexity Institute Indiana University Bloomington Indiana USA
Biological Resource Center of Institut Pasteur Paris France
Biosystems and Biomaterials Division NIST Gaithersburg Maryland USA
CRS4 Center for Advanced Studies Research and Development in Sardinia Pula Italy
Department of Biomedicine University of Basel Basel Switzerland
Department of Computer Science University of Manchester Manchester UK
Department of Social Statistics School of Social Sciences University of Manchester Manchester UK
EMBL's European Bioinformatics Institute Cambridge UK
Flanders Marine Institute EMBRC Belgium Ostend Belgium
German BioImaging Gesellschaft für Mikroskopie und Bildanalyse e.V. Konstanz Germany
Heidelberg Institute for Theoretical Studies Heidelberg Germany
Informatics Institute University of Amsterdam Amsterdam The Netherlands
INSERM Institut National de la Sante et de la Recherche Medicale Paris France
Institute of Computer Science and Faculty of Informatics Masaryk University Brno Czechia
Interdisciplinary Bank of Biomaterials and Data Würzburg Würzburg Germany
ITTM S A Esch sur Alzette Luxembourg
Japan bio Measurement and Analysis Consortium Tokyo Japan
King's College London London UK
Medical University Graz Graz Austria
National Institute of Standards and Technology Gaithersburg Maryland USA
Ontario Institute for Cancer Research Toronto Ontario Canada
Plentzia Marine Station University of the Basque Country EMBRC Spain Bilbao Spain
SCI STI MM École Polytechnique Fédérale de Lausanne Lausanne Switzerland
Standards Council of Canada Ottawa Ontario Canada
University of Klagenfurt Klagenfurt Austria
University of Southampton Southampton UK
US Department of Agriculture Washington District of Columbia USA