The relational modeling of hierarchical data in biodiversity databases
Language: English. Country: United Kingdom, England. Medium: print
Document type: journal articles, technical reports
Grant support
RVO 67985939
Institute of Botany of the Czech Academy of Sciences
PubMed
39389568
PubMed Central
PMC11466226
DOI
10.1093/database/baae107
PII: 7817843
- MeSH
- biodiversity * MeSH
- databases, factual * MeSH
- Publication type
- journal articles MeSH
- technical reports MeSH
Taxon hierarchy modeling is the element common to all biodiversity data. We compared how 25 existing databases handle the taxon hierarchy and how they present it. Our first source of information was the documentation or demo installations of the databases; the next was an analysis of the data structures using R packages provided by the inspected platforms. If neither was available, we used the public interface of the individual database. For almost half (12) of the databases analyzed, we did not find any formalized data structure for the taxon hierarchy. These databases provide only biological information about taxon membership in higher ranks, which is not fully formalizable and thus not generally usable. Among the remaining providers, the least effective model, the Adjacency List (storing the parentId of a taxon), dominates. This study demonstrates how little attention current biodiversity databases pay to modeling the taxon hierarchy, and in particular to making it available to researchers as a hierarchical data structure within the data they provide. For relational biodiversity databases, the Closure Table is the most suitable of the known data models, and it also corresponds to the ontology concept. However, it is used only sporadically within the ecosystem of biodiversity databases.