The Data Use Ontology to streamline responsible access to human biomedical datasets
Status PubMed-not-MEDLINE
Language English
Country United States
Media electronic
Document type Journal Article
Grant support
U24 HG006941
NHGRI NIH HHS - United States
R24 OD011883
NIH HHS - United States
U24 HG010262
NHGRI NIH HHS - United States
Wellcome Trust - United Kingdom
RM1 HG010860
NHGRI NIH HHS - United States
PubMed
34820659
PubMed Central
PMC8591903
DOI
10.1016/j.xgen.2021.100028
PII: S2666-979X(21)00035-5
- Keywords
- FAIR, GA4GH, automated data access, consent, controlled access, data access, data restrictions, ontology, secondary data use, standard
- Publication type
- Journal Article
Human biomedical datasets that are critical for research and clinical studies to benefit human health often also contain sensitive or potentially identifying information about individual participants. Care must therefore be taken, when they are processed and made available, to comply with ethical and regulatory frameworks and with the data conditions of informed consent. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human- and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used to annotate over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery of, and access to, relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, is hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
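To illustrate how a hierarchical vocabulary of data use terms can support automated dataset discovery, the following is a minimal sketch in Python. It is not the DUO implementation: the four-term hierarchy is hand-written for illustration rather than read from the DUO OWL file, the dataset names are made up, and the matching rule (a research purpose is compatible when it equals the dataset's permitted use or is a narrower term beneath it) is a simplifying assumption that ignores data use modifiers and disease-specific restrictions.

```python
# Toy parent links between data use terms. This table is an illustrative
# assumption written by hand; it is NOT loaded from the DUO OWL file.
PARENT = {
    "GRU": None,   # general research use (top of this toy hierarchy)
    "HMB": "GRU",  # health/medical/biomedical research
    "POA": "GRU",  # population origins or ancestry research
    "DS": "HMB",   # disease-specific research
}


def ancestors_and_self(term):
    """Return the term followed by its ancestors, walking up the toy hierarchy."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def is_compatible(dataset_permission, research_purpose):
    """Hypothetical rule: the purpose must equal the dataset's permitted
    use or be a narrower term beneath it in the hierarchy."""
    return dataset_permission in ancestors_and_self(research_purpose)


def discover(datasets, research_purpose):
    """Return the names of datasets whose annotation admits the purpose."""
    return sorted(
        name
        for name, permission in datasets.items()
        if is_compatible(permission, research_purpose)
    )


if __name__ == "__main__":
    # Hypothetical dataset annotations, for illustration only.
    catalog = {"cohort_a": "GRU", "cohort_b": "DS", "cohort_c": "HMB"}
    print(discover(catalog, "HMB"))  # prints ['cohort_a', 'cohort_c']
```

In this sketch a dataset annotated for general research use matches any purpose, while a disease-specific dataset does not match a broader biomedical request. A real deployment would resolve terms and their parents from the published ontology rather than from a hard-coded table.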
Australian Genomics Murdoch Children's Research Institute Parkville VIC Australia
BBMRI ERIC AT and Masaryk University Brno Czech Republic
Berlin Institute of Health at Charité Universitätsmedizin Berlin Berlin Germany
Bioinformation and DDBJ Center National Institute of Genetics Mishima Japan
Broad Institute of Harvard and the Massachusetts Institute of Technology Cambridge MA USA
Canada's Michael Smith Genome Sciences Centre Vancouver BC Canada
Centre de Regulació Genòmica Barcelona Spain
Centre of Genomics and Policy Department of Human Genetics McGill University Montreal QC Canada
Department of Life Sciences University of Northampton Northampton UK
Division of Human Genetics Faculty of Health Sciences University of Cape Town Cape Town South Africa
ELIXIR Finland CSC IT Center for Science Ltd Espoo Finland
ELIXIR Hub Wellcome Genome Campus Hinxton UK
European Molecular Biology Laboratory European Bioinformatics Institute Hinxton UK
Google Cloud Kitchener ON N2H 5G5 Canada
Health Data Research UK Gibbs Building 215 Euston Road London NW1 2BE UK
Microsoft Research Redmond WA 98052 USA
National Bioscience Database Center Japan Science and Technology Agency Tokyo Japan
National Human Genome Research Institute NIH Bethesda MD USA
Office of Data Sharing National Cancer Institute NIH Rockville MD USA
Okayama University Okayama Japan
Patient Centered Outcomes Research Institute Washington DC USA
RTI International Research Triangle Park NC USA
Spherical Cow Group Rego Park NY 11374 USA
Tohoku Medical Megabank Organization Tohoku University Sendai Japan
University Health Network Toronto ON Canada
University of Colorado Anschutz Medical Campus Aurora CO USA