Body odors offer a unique window into the physiological and psychological profile of the emitter. This information, broadcast in nonverbal communication, significantly shapes social interactions. However, effectively digitizing body odors requires a precise framework for perceptual operationalization. Previous research has used a very limited number of verbal terms, such as pleasant, intense, or attractive, which fails to adequately capture qualitative differences. To address this gap, we elicited body odor descriptions from 2,607 participants across 17 countries and 13 languages. All these descriptions are presented here in one dataset, together with a condensed list of 25 body odor words (BOW). Those terms reliably differentiated between body states, and were validated in a separate study with a different group of 155 perceivers. The dataset, available as a web application, provides a novel operationalization of body odor impressions, which is a precondition for studying olfaction in human nonverbal communication, for perception-based digitization of body odors and for comparative studies.
- MeSH
- Smell MeSH
- Language MeSH
- Humans MeSH
- Nonverbal Communication * MeSH
- Odorants * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
This paper introduces the new Czech Political Candidate Dataset (CPCD), which compiles comprehensive data on all candidates who have run in any municipal, regional, national, and/or European Parliament election in the Czech Republic since 1993. For each candidate, the CPCD includes first name, last name, age, gender, place of residence, university degree, party membership, party affiliation, ballot position, and election results for both candidates and parties. We match candidates across elections using algorithms that rely on their personal information. We also add information on donations made to political parties, sourced from the Czech Political Donation Dataset (CPDD), our other newly built dataset, which compiles records of individual donations to 12 leading political parties from official records for the period from 2017 to 2023. The CPDD is publicly available along with the CPCD.
- MeSH
- Humans MeSH
- Politics * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
- Geographic Names
- Czech Republic MeSH
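The CPCD abstract above says only that candidates are matched across elections by algorithms relying on personal information; it does not specify the algorithm. The following is a minimal sketch of one plausible rule, assuming a match on diacritic-insensitive names plus an approximate birth year derived from age and election year. The record layout, field names, and the one-year tolerance are all assumptions made for illustration, not the authors' method.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateRecord:
    # Hypothetical record layout; the real CPCD column names may differ.
    first_name: str
    last_name: str
    age: int            # age at the time of the election
    election_year: int


def normalize(text: str) -> str:
    """Lowercase and strip diacritics, so 'Novák' and 'novak' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.strip().lower()


def approx_birth_year(record: CandidateRecord) -> int:
    """Birth year implied by age and election year (uncertain by about a year)."""
    return record.election_year - record.age


def same_person(a: CandidateRecord, b: CandidateRecord,
                year_tolerance: int = 1) -> bool:
    """Assumed matching rule: equal normalized names and a close birth year."""
    names_match = (
        normalize(a.first_name) == normalize(b.first_name)
        and normalize(a.last_name) == normalize(b.last_name)
    )
    years_match = (
        abs(approx_birth_year(a) - approx_birth_year(b)) <= year_tolerance
    )
    return names_match and years_match
```

A rule this simple would merge distinct people who share a common name and birth year, so a real pipeline would likely also compare further fields listed in the abstract, such as place of residence or party affiliation.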
Today, MALDI-ToF MS is an established technique to characterize and identify pathogenic bacteria. The technique is increasingly applied by clinical microbiological laboratories that use commercially available complete solutions, including spectra databases covering clinically relevant bacteria. Such databases are validated for clinical or research applications, but are often less comprehensive concerning highly pathogenic bacteria (HPB). To improve MALDI-ToF MS diagnostics of HPB, many years ago we initiated a program to develop protocols for reliable, MALDI-compatible microbial inactivation and to acquire mass spectra of the inactivated bacteria. As a result of this project, databases covering HPB, closely related bacteria, and bacteria of clinical relevance have been made publicly available on platforms such as ZENODO. This publication describes in detail the most recent version of this database. The dataset contains a total of 11,055 spectra from 1,601 microbial strains and 264 species and is primarily intended to improve the diagnosis of HPB. We hope that our MALDI-ToF MS data may also be a valuable resource for developing machine learning-based bacterial identification and classification methods.
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting structure features and functional annotations at the residue level from standard publication text. Our approach integrates data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, and uses LitSuggest and HuggingFace models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we used to train a starting PubmedBert model from HuggingFace. Using a human-in-the-loop annotation system, we iteratively developed the best model, which achieved 0.90 precision, 0.92 recall, and 0.91 F1-measure. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
- MeSH
- Databases, Protein MeSH
- Humans MeSH
- Proteins * chemistry MeSH
- Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
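The three metrics reported in the protein annotation abstract above are linked: the F1-measure is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported 0.90 and 0.92. The function names are illustrative, and the counting helper assumes entity-level true positives, false positives, and false negatives, which the abstract does not detail.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Precision and recall from raw counts (assumed entity-level counting)."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 0.90, recall 0.92.
reported_f1 = round(f1_measure(0.90, 0.92), 2)  # 0.91
```

The recomputed value agrees with the 0.91 F1-measure stated in the abstract.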
Well-documented sleep datasets from healthy adults are important for sleep pattern analysis and comparison with a wide range of neuropsychiatric disorders. Currently available sleep datasets from healthy adults are acquired using low-density arrays with a minimum of four electrodes in a typical sleep montage. This low spatial resolution prohibits analysis of the spatial structure of sleep. Here we introduce an open-access sleep dataset from 29 healthy adults (13 female, aged 32.17 ± 6.30 years) acquired at the Montreal Neurological Institute. The dataset includes overnight polysomnograms with high-density scalp electroencephalograms incorporating 83 electrodes, electrocardiogram, electromyogram, electrooculogram, an average of electrode positions obtained using manual co-registrations, and sleep scoring annotations. Data characteristics and group-level analysis of sleep properties were assessed. The database can be accessed at https://doi.org/10.17605/OSF.IO/R26FH. This is the first open high-density electroencephalogram sleep database from healthy adults, allowing researchers to investigate sleep physiology at high spatial resolution. We expect that this database will serve as a valuable resource for studying sleep physiology and for benchmarking sleep pathology.
- MeSH
- Databases, Factual MeSH
- Adult MeSH
- Electroencephalography * MeSH
- Humans MeSH
- Polysomnography * MeSH
- Scalp * MeSH
- Sleep * MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
Retinopathy of prematurity (ROP) is a vasoproliferative disease, occurring mainly in newborns and infants, that can damage vision. Despite recent advances in neonatal care and medical guidelines, ROP remains one of the leading causes of childhood blindness worldwide. The paper presents a unique dataset of 6,004 retinal images of 188 newborns, most of whom are premature infants. The dataset is accompanied by anonymized patient information from the ROP screening acquired at the University Hospital Ostrava, Czech Republic. Three digital retinal imaging camera systems were used in the study: Clarity RetCam 3, Natus RetCam Envision, and Phoenix ICON. The study is enriched by the software tool ReLeSeT, which is aimed at automatic segmentation and extraction of retinal lesions from retinal images. This tool in turn enables the computation of geometric and intensity features of retinal lesions. We also publish a set of pre-processing tools for feature boosting of retinal lesions and retinal blood vessels, intended for building classification and segmentation models in ROP analysis.
- MeSH
- Humans MeSH
- Infant, Premature * MeSH
- Infant, Newborn MeSH
- Image Processing, Computer-Assisted MeSH
- Retina * diagnostic imaging MeSH
- Retinopathy of Prematurity * diagnostic imaging MeSH
- Check Tag
- Humans MeSH
- Infant, Newborn MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
- Geographic Names
- Czech Republic MeSH
The chondrocranium provides the key initial support for the fetal brain, jaws and cranial sensory organs in all vertebrates. The patterns of shaping and growth of the chondrocranium set up species-specific development of the entire craniofacial complex. The 3D development of the chondrocranium has been studied primarily in animal model organisms, such as mice or zebrafish. In comparison, very little is known about the full 3D human chondrocranium, except for drawings made by anatomists many decades ago. Knowledge of human-specific aspects of chondrocranial development is essential for understanding congenital craniofacial defects and human evolution. Here, advanced contrast-enhanced microCT scanning was used to generate the first 3D atlas of the human fetal chondrocranium during the middle trimester (13 to 19 weeks). In addition, since cartilage and bone are both visible with the techniques used, the endochondral ossification of the cranial base was mapped, because this region is critical for brain and jaw growth. The human 3D models are published as a scientific resource for human development.
- MeSH
- Cartilage diagnostic imaging embryology MeSH
- Skull diagnostic imaging embryology MeSH
- Humans MeSH
- Fetus diagnostic imaging MeSH
- X-Ray Microtomography MeSH
- Pregnancy MeSH
- Imaging, Three-Dimensional * MeSH
- Check Tag
- Humans MeSH
- Pregnancy MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
Microscopic examination plays a significant role in the initial screening for a variety of hematological, as well as non-hematological, diagnoses. Microscopic blood smear examination, which is considered a key diagnostic technique, is still performed manually in current clinical practice, which is not only time-consuming but can also lead to human error. Although automated and semi-automated systems have been developed in recent years, their high purchasing and maintenance costs make them unaffordable for many medical institutions. Even though much research has been conducted lately to explore more accurate and feasible solutions, most researchers have had to deal with a lack of medical data. To address the lack of large-scale databases in this field, we created a high-resolution dataset containing a total of 16,027 annotated white blood cells. Moreover, the dataset covers 9 types of white blood cells, including clinically significant pathological findings. Since we used high-quality acquisition equipment, the dataset provides some of the highest-quality images of blood cells, achieving an approximate resolution of 42 pixels per 1 μm.
- MeSH
- Leukocytes * cytology pathology MeSH
- Humans MeSH
- Microscopy MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
- Research Support, Non-U.S. Gov't MeSH
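To put the resolution quoted in the white blood cell abstract above into more familiar units, the sketch below converts 42 pixels per 1 μm into a pixel size and into the pixel span of a structure of a given physical length. The 12 μm cell diameter used in the example is an assumed, illustrative value and does not come from the dataset.

```python
# Approximate resolution stated in the abstract.
PIXELS_PER_MICROMETRE = 42.0


def pixel_size_nm(pixels_per_um: float = PIXELS_PER_MICROMETRE) -> float:
    """Edge length of one pixel in nanometres (1 um = 1000 nm)."""
    return 1000.0 / pixels_per_um


def span_in_pixels(length_um: float,
                   pixels_per_um: float = PIXELS_PER_MICROMETRE) -> float:
    """Number of pixels covered by a structure of the given length."""
    return length_um * pixels_per_um


# Assumed example: a cell roughly 12 um across (illustrative value only).
example_diameter_px = span_in_pixels(12.0)    # 504.0 pixels
example_pixel_nm = round(pixel_size_nm(), 1)  # 23.8 nm per pixel
```

At this sampling density, a single cell of that assumed size would span about 500 pixels across.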
The recent human monkeypox outbreak underlined the importance of studying the basic biology of orthopoxviruses. However, the transcriptome of its causative agent has not previously been investigated with either short- or long-read sequencing approaches. This Oxford Nanopore long-read RNA-Sequencing dataset fills this gap. It will enable the in-depth characterization of the transcriptomic architecture of the monkeypox virus, and may even make it possible to annotate novel host transcripts. Moreover, our direct cDNA and native RNA sequencing reads will allow the estimation of gene expression changes of both the virus and the host cells during the infection. Overall, our study will lead to a deeper understanding of the alterations caused by the viral infection at the transcriptome level.
- MeSH
- DNA, Complementary MeSH
- Humans MeSH
- Nanopore Sequencing * MeSH
- Mpox (monkeypox) * MeSH
- Gene Expression Profiling MeSH
- Transcriptome MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Dataset MeSH
The human brain is a complex computational system whose function and structure may be measured using various neuroimaging techniques, each focusing on separate properties of brain tissue and activity. We capture the organization of white matter fibers, acquired by diffusion-weighted imaging, using probabilistic diffusion tractography. By segmenting the results of tractography into larger anatomical units, it is possible to draw inferences about the structural relationships between these parts of the system. This pipeline results in a structural connectivity matrix, which contains an estimate of connection strength between all pairs of regions. However, raw data processing is complex, computationally intensive, and requires expert quality control, which may discourage researchers with less experience in the field. We therefore provide brain structural connectivity matrices in a form ready for modelling and analysis, usable by a wide community of scientists. The presented dataset contains brain structural connectivity matrices together with the underlying raw diffusion and structural data, as well as basic demographic data of 88 healthy subjects.
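As a rough illustration of what a structural connectivity matrix encodes, the sketch below builds one from tractography streamlines that have already been assigned to anatomical regions. It assumes each streamline is reduced to the pair of region indices at its two ends and that connection strength is the streamline count scaled by the largest count. This is a simplification for illustration; the abstract does not specify the dataset's actual processing pipeline or normalization.

```python
def connectivity_matrix(endpoint_pairs: list[tuple[int, int]],
                        n_regions: int) -> list[list[float]]:
    """Symmetric region-by-region matrix of normalized streamline counts.

    endpoint_pairs: hypothetical input, one (region_a, region_b) index pair
    per streamline, giving the regions in which its two ends terminate.
    """
    counts = [[0] * n_regions for _ in range(n_regions)]
    for a, b in endpoint_pairs:
        if a == b:
            continue  # skip streamlines that start and end in the same region
        counts[a][b] += 1
        counts[b][a] += 1  # connectivity is treated as undirected

    max_count = max(max(row) for row in counts) if n_regions else 0
    if max_count == 0:
        return [[0.0] * n_regions for _ in range(n_regions)]
    # Scale so that the strongest connection has strength 1.0 (assumed choice).
    return [[c / max_count for c in row] for row in counts]


# Toy example with three regions and four streamlines.
toy_matrix = connectivity_matrix([(0, 1), (1, 0), (0, 2), (1, 1)], n_regions=3)
```

In the toy example, regions 0 and 1 are linked by two streamlines and regions 0 and 2 by one, so the corresponding entries are 1.0 and 0.5, and the diagonal stays at zero.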