NWB Query Engines: Tools to Search Data Stored in Neurodata Without Borders Format
Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic-ecollection
Document type: journal article
PubMed: 33041776
PubMed Central: PMC7526650
DOI: 10.3389/fninf.2020.00027
- Keywords
- HDF5, Java, NWB format, Python, SQLite, metadata, neurophysiology, search
- Publication type
- Journal article
The Neurodata Without Borders (NWB) format is a current technology for storing neurophysiology data along with the associated metadata. Data stored in the format is organized into separate HDF5 files, each file usually storing the data associated with a single recording session. While the NWB format provides a structured method for storing data, so far there have been no tools for searching a collection of NWB files to find data of interest for a particular purpose. We describe here three tools that enable searching NWB files. The tools have different features, making each of them most useful for a particular task. The first tool, called the NWB Query Engine, is written in Java. It allows searching the complete content of NWB files. It was designed for the first version of NWB (NWB 1) and supports most (but not all) features of the most recent version (NWB 2). For some searches, it is the fastest tool. The second tool, called "search_nwb", is written in Python and also allows searching the complete contents of NWB files. It works with both NWB 1 and NWB 2, as does the third tool. The third tool, called "nwbindexer", enables searching a collection of NWB files using a two-step process. In the first step, a utility is run that creates an SQLite database containing the metadata in a collection of NWB files. In the second step, this database is searched using another utility. Once the index is built, this two-step process allows faster searches than the other tools do, but the searches are not as complete. All three tools use a simple query language that was developed for this project. Software integrating the three tools into a web interface is also provided, which enables searching NWB files by submitting a web form.
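The two-step index-then-search idea described for the third tool can be illustrated with a minimal sketch. This is not the nwbindexer implementation or its database schema: the table layout, the function names `build_index` and `search_index`, and the sample metadata are all assumptions made for illustration, and the metadata is hard-coded here instead of being read from real NWB (HDF5) files.

```python
import sqlite3

# Hypothetical metadata, standing in for values that would be read from NWB files.
# Keys are (file name, path inside the file); values are the stored metadata.
SAMPLE_METADATA = {
    ("session_a.nwb", "/general/subject/species"): "Mus musculus",
    ("session_a.nwb", "/general/subject/age"): "P90D",
    ("session_b.nwb", "/general/subject/species"): "Rattus norvegicus",
    ("session_b.nwb", "/general/subject/age"): "P60D",
}


def build_index(conn, metadata):
    """Step 1: store every (file, path, value) triple in an SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata (file TEXT, path TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata (file, path, value) VALUES (?, ?, ?)",
        [(f, p, v) for (f, p), v in metadata.items()],
    )
    # An index on (path, value) lets later searches avoid scanning every row.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_path_value ON metadata (path, value)"
    )
    conn.commit()


def search_index(conn, path, value_pattern):
    """Step 2: return files whose value at `path` matches a SQL LIKE pattern."""
    rows = conn.execute(
        "SELECT DISTINCT file FROM metadata "
        "WHERE path = ? AND value LIKE ? ORDER BY file",
        (path, value_pattern),
    )
    return [row[0] for row in rows]


# Build the index once (in memory here), then query it as often as needed.
conn = sqlite3.connect(":memory:")
build_index(conn, SAMPLE_METADATA)
mouse_files = search_index(conn, "/general/subject/species", "Mus%")
print(mouse_files)
```

The sketch also shows the trade-off the abstract mentions: queries touch only the small database rather than the HDF5 files themselves, so they can be fast, but only what was copied into the index during the first step can be found.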