Registration of laser scans, or point clouds in general, is a crucial step in localization and mapping with mobile robots and in object modeling pipelines. A coarse alignment of the point clouds is generally needed before applying local methods such as the Iterative Closest Point (ICP) algorithm. We propose a feature-based approach to point cloud registration and evaluate the proposed method and its individual components on challenging real-world datasets. For a moderate overlap between the laser scans, the method provides superior registration accuracy compared to state-of-the-art methods, including Generalized ICP, 3D Normal-Distribution Transform, Fast Point-Feature Histograms, and 4-Points Congruent Sets. Compared to surface normals, using points as the underlying features yields higher performance in both keypoint detection and establishing local reference frames. Moreover, sign disambiguation of the basis vectors proves to be an important aspect of creating repeatable local reference frames. A novel method for sign disambiguation is proposed which yields highly repeatable reference frames.
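The sign-disambiguation step mentioned in the abstract can be illustrated with a small sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' actual algorithm: it builds a local reference frame at a keypoint from the eigenvectors of the neighborhood covariance and flips the axis signs so that each axis points toward the majority of neighboring points, which is one common way to make such frames repeatable.

```python
import numpy as np

def local_reference_frame(keypoint, neighbors):
    """Return a 3x3 right-handed local reference frame at `keypoint`.

    Minimal illustrative sketch (not the paper's exact method): the axes are
    eigenvectors of the neighborhood covariance; the x and z signs are
    disambiguated by a majority vote over neighbor directions, and y is
    completed as z x x to keep the frame right-handed.
    """
    diffs = neighbors - keypoint                  # (N, 3) offsets to neighbors
    cov = diffs.T @ diffs / len(neighbors)        # 3x3 covariance matrix
    _, eigvecs = np.linalg.eigh(cov)              # columns sorted by ascending eigenvalue

    x_axis = eigvecs[:, 2]                        # largest-variance direction
    z_axis = eigvecs[:, 0]                        # smallest-variance direction (normal)

    # Sign disambiguation: point each axis toward the majority of neighbors.
    if np.sum(diffs @ x_axis >= 0) < len(neighbors) / 2:
        x_axis = -x_axis
    if np.sum(diffs @ z_axis >= 0) < len(neighbors) / 2:
        z_axis = -z_axis

    y_axis = np.cross(z_axis, x_axis)             # complete a right-handed frame
    return np.stack([x_axis, y_axis, z_axis])
```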
- MeSH
- Algorithms MeSH
- Datasets as Topic * MeSH
- Robotics MeSH
- Publication type
- Journal Article MeSH
The unicellular trypanosomatids belong to the phylum Euglenozoa, and all known species are obligate parasites. Distinct lineages infect plants, invertebrates, and vertebrates, including humans. Genome data for marine diplonemids, together with freshwater euglenids and free-living kinetoplastids, the closest known nonparasitic relatives of trypanosomatids, recently became available. Robust phylogenetic reconstructions across Euglenozoa are now possible and place the results of parasite-focused studies into an evolutionary context. Here we discuss recent advances in identifying the factors shaping the evolution of Euglenozoa, focusing on ancestral features generally considered parasite-specific. Remarkably, most of these predate the transition(s) to parasitism, suggesting that the presence of certain preconditions makes a significant lifestyle change more likely.
- MeSH
- Biological Evolution * MeSH
- Datasets as Topic MeSH
- Euglenozoa / classification / genetics MeSH
- Phylogeny MeSH
- Genome / genetics MeSH
- Euglenozoa Infections / parasitology MeSH
- Humans MeSH
- Parasites / classification / genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
Functional connectivity analysis of resting-state fMRI data has recently become one of the most common approaches to characterizing individual brain function. It has been widely suggested that the functional connectivity matrix is a useful approximate representation of the brain's connectivity, potentially providing behaviorally or clinically relevant markers. However, functional connectivity estimates are known to be detrimentally affected by various artifacts, including those due to in-scanner head motion. Moreover, as individual functional connections generally covary only very weakly with head motion estimates, the influence of motion is difficult to quantify robustly and is prone to being neglected in practice. Although the use of individual head motion estimates, or of group-level correlations between motion and functional connectivity, has been suggested, a sufficiently sensitive measure of individual functional connectivity quality has not yet been established. We propose a new intuitive summary index, Typicality of Functional Connectivity, to capture deviations from standard brain functional connectivity patterns. In a resting-state fMRI dataset of 245 healthy subjects, this measure was significantly correlated with individual head motion metrics. The results were robustly reproduced across atlas granularities, preprocessing options, and other datasets, including 1,081 subjects from the Human Connectome Project. In principle, Typicality of Functional Connectivity should also be sensitive to other types of artifacts, processing errors, and possibly brain pathology, allowing extensive use in data quality screening and quantification in functional connectivity studies as well as in methodological investigations.
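The abstract does not give the formula for the index, but one natural operationalization (an assumption on our part, not necessarily the exact definition used in the study) is the correlation of an individual's vectorized functional connectivity matrix with the group-average matrix. The sketch below computes such a typicality score with NumPy.

```python
import numpy as np

def typicality_scores(fc_matrices):
    """Compute a per-subject typicality score from FC matrices.

    `fc_matrices` has shape (subjects, regions, regions). Each subject's
    score is the Pearson correlation between the upper triangle of their
    FC matrix and the group-average FC matrix. This is an assumed
    operationalization for illustration only.
    """
    n_regions = fc_matrices.shape[1]
    iu = np.triu_indices(n_regions, k=1)          # upper triangle, no diagonal
    vectors = fc_matrices[:, iu[0], iu[1]]        # (subjects, edges)
    group_mean = vectors.mean(axis=0)
    return np.array([np.corrcoef(v, group_mean)[0, 1] for v in vectors])
```

Low scores would then flag subjects whose connectivity pattern deviates from the group, whether because of motion, other artifacts, or processing errors.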
- MeSH
- Artifacts MeSH
- Atlases as Topic * MeSH
- Datasets as Topic * MeSH
- Adult MeSH
- Head Movements MeSH
- Connectome * / methods / standards MeSH
- Humans MeSH
- Magnetic Resonance Imaging * / methods / standards MeSH
- Young Adult MeSH
- Brain / diagnostic imaging / physiology MeSH
- Image Processing, Computer-Assisted * / methods / standards MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
With the advent of OMICs technologies, both individual research groups and consortia have spearheaded the characterization of human samples of multiple pathophysiologic origins, resulting in thousands of archived genomes and transcriptomes. Although a variety of web tools are now available to extract information from OMICs data, their utility has been limited by the capacity of nonbioinformatician researchers to exploit the information. To address this problem, we have developed CANCERTOOL, a web-based interface that aims to overcome the major limitations of public transcriptomics dataset analysis for highly prevalent types of cancer (breast, prostate, lung, and colorectal). CANCERTOOL provides rapid and comprehensive visualization of gene expression data for the gene(s) of interest in well-annotated cancer datasets. This visualization is accompanied by generation of reports customized to the interests of the researcher (e.g., editable figures, detailed statistical analyses, and access to raw data for reanalysis). It also carries out gene-to-gene correlations in multiple datasets at the same time or using preset patient groups. Finally, this new tool solves the time-consuming task of performing functional enrichment analysis with gene sets of interest using up to 11 different databases at the same time. Collectively, CANCERTOOL represents a simple and freely accessible interface to interrogate well-annotated datasets and obtain publishable representations that can contribute to refinement and guidance of cancer-related investigations at all levels of hypotheses and design. Significance: In order to facilitate access of research groups without bioinformatics support to public transcriptomics data, we have developed a free online tool with an easy-to-use interface that allows researchers to obtain quality information in a readily publishable format. Cancer Res; 78(21); 6320-8. ©2018 AACR.
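As an illustration of the kind of gene-to-gene correlation analysis the tool automates, the following sketch computes a Pearson correlation between two genes across the samples of an expression table. The file name and gene column names are hypothetical placeholders, not CANCERTOOL inputs.

```python
import pandas as pd
from scipy import stats

# Hypothetical expression table: rows = samples, columns = genes (e.g., log2 values).
expression = pd.read_csv("expression_matrix.csv", index_col=0)

# Pearson correlation between two genes of interest across all samples.
r, p_value = stats.pearsonr(expression["GENE_A"], expression["GENE_B"])
print(f"Pearson r = {r:.2f}, p = {p_value:.2e}")
```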
- MeSH
- Algorithms MeSH
- Databases, Factual MeSH
- Databases, Genetic MeSH
- Genomics MeSH
- Internet MeSH
- Medical Oncology MeSH
- Humans MeSH
- Neoplasms / genetics MeSH
- Computer Graphics MeSH
- Proteomics MeSH
- Workflow MeSH
- Software MeSH
- Transcriptome MeSH
- User-Computer Interface MeSH
- Computational Biology / methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Recently, deep neural networks have been successfully applied in many biological fields. In 2020, the deep learning model AlphaFold won the protein folding competition with predicted structures within the error tolerance of experimental methods. However, this solution to the most prominent bioinformatic challenge of the past 50 years was possible only thanks to a carefully curated benchmark of experimentally determined protein structures. In genomics, we face similar challenges (annotation of genomes and identification of functional elements), but we currently lack benchmarks comparable to the protein folding competition. RESULTS: Here we present a collection of curated and easily accessible sequence classification datasets in the field of genomics. The proposed collection is based on a combination of novel datasets constructed by mining publicly available databases and existing datasets obtained from published articles. The collection currently contains nine datasets that focus on regulatory elements (promoters, enhancers, open chromatin regions) from three model organisms: human, mouse, and roundworm. A simple convolutional neural network is also included in the repository and can be used as a baseline model. The benchmarks and the baseline model are distributed as the Python package 'genomic-benchmarks', and the code is available at https://github.com/ML-Bioinfo-CEITEC/genomic_benchmarks . CONCLUSIONS: Deep learning techniques have revolutionized many biological fields, largely thanks to carefully curated benchmarks. For the field of genomics, we propose a collection of benchmark datasets for the classification of genomic sequences with an interface for the most commonly used deep learning libraries, an implementation of a simple neural network, and a training framework that can be used as a starting point for future research. The main aim of this effort is to create a repository of shared datasets that will make machine learning for genomics more comparable and reproducible while reducing the overhead for researchers who want to enter the field, leading to healthy competition and new discoveries.
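A baseline along the lines of the one included in the repository can be sketched as follows. This is a generic PyTorch example for one-hot-encoded DNA sequences, not the package's actual API; the architecture and hyperparameters are illustrative assumptions, and the real baseline and data loaders live in the genomic_benchmarks repository linked above.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Generic 1D-CNN baseline for fixed-length DNA sequence classification.

    Illustrative sketch only; inputs are one-hot encoded sequences with
    four channels (A/C/G/T) and a fixed length.
    """
    def __init__(self, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(4, 32, kernel_size=8),   # 4 input channels: one-hot A/C/G/T
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Flatten(),
            nn.LazyLinear(n_classes),          # infers the flattened feature size
        )

    def forward(self, x):                      # x: (batch, 4, seq_len)
        return self.net(x)

model = SimpleCNN()
dummy = torch.randn(8, 4, 200)                 # batch of 8 dummy "sequences" of length 200
print(model(dummy).shape)                      # -> torch.Size([8, 2])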
BACKGROUND: The event-related potentials technique is widely used in cognitive neuroscience research. The P300 waveform has been explored in many research articles because of its wide applications, such as lie detection or brain-computer interfaces (BCI). However, very few datasets are publicly available, so most researchers use only their private datasets for their analyses. This leads to poorly comparable results, particularly in brain-computer interface research. Here we present electroencephalography/event-related potentials (EEG/ERP) data. The data were obtained from 20 healthy subjects and acquired using an odd-ball hardware stimulator. The visual stimulation was based on a three-stimulus paradigm and included target, non-target, and distractor stimuli. The data and collected metadata are shared in the EEG/ERP Portal. FINDINGS: The paper also describes the process and results of validating the presented data. The data were validated using two different methods. The first method evaluated the data by measuring the percentage of artifacts. The second method tested whether the expected experimental results were obtained (i.e., whether the target trials contained the P300 component). The validation showed that most datasets were suitable for subsequent analysis. CONCLUSIONS: The presented datasets, together with their metadata, provide researchers with an opportunity to study the P300 component from different perspectives. Furthermore, they can be used for BCI research.
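The second validation step (checking that target trials contain a P300) can be illustrated with a small sketch: average the target and non-target epochs and compare mean amplitudes in a typical P300 window, roughly 250-500 ms post-stimulus. The sampling rate, baseline, and window below are illustrative assumptions, not the validation code used for the dataset.

```python
import numpy as np

def p300_check(target_epochs, nontarget_epochs, sfreq=1000.0, tmin=-0.1):
    """Rudimentary P300 presence check on a single channel (e.g., Pz).

    Epoch arrays have shape (trials, samples). Returns the difference of
    mean amplitudes (target minus non-target) in a 250-500 ms window;
    a clearly positive value is consistent with a P300. Window, sampling
    rate, and baseline are illustrative assumptions.
    """
    start = int((0.25 - tmin) * sfreq)            # sample index of 250 ms
    stop = int((0.50 - tmin) * sfreq)             # sample index of 500 ms
    target_erp = target_epochs.mean(axis=0)       # average over trials
    nontarget_erp = nontarget_epochs.mean(axis=0)
    return target_erp[start:stop].mean() - nontarget_erp[start:stop].mean()
```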
- Publication type
- Journal Article MeSH
Omics-based methods are increasingly used in current ecotoxicology. As a result, a large number of observations for various toxic substances and organisms are available and may be used for identifying modes of action, adverse outcome pathways, or novel biomarkers. For these purposes, sound statistical analysis of toxicogenomic data is vital. In contrast to established ecotoxicological techniques, concentration-response modeling is rarely used for large datasets; instead, statistical hypothesis testing is prevalent, which provides only a limited scope for inference. The present study therefore applied automated concentration-response modeling to 3 different ecotoxicotranscriptomic and ecotoxicometabolomic datasets. The modeling process was performed by simultaneously applying 9 different regression models, representing distinct mechanistic, toxicological, and statistical ideas that result in different curve shapes. The best-fitting models were selected using Akaike's information criterion. The linear and exponential models provided the best description of the data for more than 50% of responses. Models generating U-shaped curves were frequently selected for transcriptomic signals (30%), and sigmoid models were identified as the best fit for many metabolomic signals (21%). Thus, selecting the models from an array of different types seems appropriate, because concentration-response functions may vary with the observed response type and also depend on the compound, the organism, and the investigated concentration and exposure duration range. The application of concentration-response models can help to further tap the potential of omics data and is a necessary step for quantitative mixture effect assessment at the molecular response level.
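The model-selection procedure described above can be sketched as fitting several candidate curve shapes to each response and keeping the one with the lowest AIC. The three models below (linear, exponential, logistic sigmoid) are a reduced, illustrative subset of the nine used in the study, and the fitting details are assumptions for the sketch.

```python
import numpy as np
from scipy.optimize import curve_fit

# Three illustrative concentration-response shapes (the study used nine).
models = {
    "linear":      lambda c, a, b: a + b * c,
    "exponential": lambda c, a, b: a * np.exp(b * c),
    "sigmoid":     lambda c, top, ec50, slope: top / (1 + np.exp(slope * (c - ec50))),
}

def best_model_by_aic(conc, response):
    """Fit each candidate model and return the name of the lowest-AIC fit."""
    results = {}
    for name, fn in models.items():
        try:
            params, _ = curve_fit(fn, conc, response, maxfev=10000)
        except RuntimeError:
            continue                                  # skip models that fail to converge
        rss = np.sum((response - fn(conc, *params)) ** 2)
        n, k = len(response), len(params)
        results[name] = n * np.log(rss / n) + 2 * k   # AIC up to an additive constant
    return min(results, key=results.get)
```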
- MeSH
- Zebrafish / growth & development / metabolism MeSH
- Ecosystem * MeSH
- Embryo, Nonmammalian / drug effects / metabolism MeSH
- Genomics * MeSH
- Environmental Pollutants / toxicity MeSH
- Linear Models MeSH
- Metabolomics * MeSH
- High-Throughput Screening Assays MeSH
- Oligonucleotide Array Sequence Analysis MeSH
- Tetrachloroethylene / toxicity MeSH
- Transcriptome / drug effects MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
At the time of the COVID-19 pandemic, providing access to data (properly optimised with regard to personal data protection) plays a crucial role in keeping the general public and media up to date. Open datasets also represent one of the means for evaluating the pandemic on a global level. The primary aim of this paper is to describe the methodological and technical framework for publishing datasets describing characteristics related to the COVID-19 epidemic in the Czech Republic (epidemiology, hospital-based care, vaccination), including the use of these datasets in practice. Practical aspects and experience with data sharing are discussed. As a reaction to the epidemic situation, a new portal, COVID-19: Current Situation in the Czech Republic (https://onemocneni-aktualne.mzcr.cz/covid-19), was developed and launched in March 2020 to provide a fully fledged and trustworthy source of information for the public and media. The portal also contains (i) a section publishing public open datasets available for download in CSV and JSON formats and (ii) an authorised-access-only section where authorised persons can (through an online generated token) safely visualise or download regional datasets with data aggregated at the level of individual municipalities and regions. The data are also provided to the local open data catalogue (covering only open data on healthcare, provided by the Ministry of Health) and to the National Catalogue of Open Data (covering all open datasets provided by various authorities/publishers and harvesting all data from the local catalogues). The datasets have been published under various authentication regimes and are widely used by the general public, scientists, public authorities, and decision-makers. The total number of API calls between the launch in March 2020 and 15 December 2020 exceeded 13 million. The datasets have been adopted as an official and guaranteed source for outputs of third parties, including public authorities, non-governmental organisations, scientists, and online news portals. Datasets currently published as open data meet the 3-star open data requirements, which makes them machine-readable and facilitates their further usage without restrictions. This is essential for making the data more easily understandable and usable for data consumers. In line with the Ministry of Health's data-opening strategy, additional datasets meeting the already implemented standards will also be released, on both COVID-19-related and unrelated topics.
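Consuming the published open datasets is straightforward with standard tooling; the sketch below loads one CSV dataset into pandas. The concrete dataset path used here is an assumption for illustration only — the list of available CSV/JSON endpoints is published on the portal itself.

```python
import pandas as pd

# Illustrative only: the dataset path below is an assumption; the available
# CSV/JSON datasets are listed on the portal
# (https://onemocneni-aktualne.mzcr.cz/covid-19).
URL = "https://onemocneni-aktualne.mzcr.cz/api/v2/covid-19/nakazeni-vyleceni-umrti-testy.csv"

df = pd.read_csv(URL)
print(df.tail())          # most recent daily records
```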
- MeSH
- COVID-19 * / epidemiology MeSH
- Humans MeSH
- Pandemics / prevention & control MeSH
- SARS-CoV-2 MeSH
- Information Dissemination MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Czech Republic MeSH
Feature selection is a significant part of many machine learning applications dealing with small-sample, high-dimensional data. Choosing the most important features is an essential step in knowledge discovery in many areas of biomedical informatics. The increased popularity of feature selection methods and their frequent use raise challenging new questions about the interpretability and stability of feature selection techniques. In this study, we compared the behaviour of ten state-of-the-art filter methods for feature selection in terms of their stability, similarity, and influence on prediction performance. All of the experiments were conducted on eight two-class datasets from biomedical areas. While entropy-based feature selection appears to be the most stable, the feature selection techniques yielding the highest prediction performance are the minimum redundancy maximum relevance (mRMR) method and feature selection based on the Bhattacharyya distance. In general, univariate feature selection techniques perform similarly to, or even better than, more complex multivariate feature selection techniques on high-dimensional datasets. However, on more complex and smaller datasets, multivariate methods slightly outperform univariate techniques.
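Selection stability can be quantified, for example, by how consistently a filter picks the same features across resampled versions of the data. The sketch below measures the average pairwise Jaccard similarity of the top-k features chosen by a univariate ANOVA F-test filter over bootstrap samples; the particular filter and stability index here are illustrative assumptions and may differ from those used in the study.

```python
import numpy as np
from itertools import combinations
from sklearn.feature_selection import SelectKBest, f_classif

def selection_stability(X, y, k=20, n_boot=30, seed=0):
    """Average pairwise Jaccard similarity of top-k feature sets.

    Features are selected with a univariate ANOVA F-test filter on bootstrap
    resamples; higher values mean more stable selection. Illustrative
    sketch only — the stability measure in the study may be defined differently.
    """
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))        # bootstrap sample
        selector = SelectKBest(f_classif, k=k).fit(X[idx], y[idx])
        subsets.append(set(np.flatnonzero(selector.get_support())))
    sims = [len(a & b) / len(a | b) for a, b in combinations(subsets, 2)]
    return float(np.mean(sims))
```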
- MeSH
- Algorithms MeSH
- Databases, Factual MeSH
- Humans MeSH
- Multivariate Analysis MeSH
- Parkinson Disease / diagnosis MeSH
- Oligonucleotide Array Sequence Analysis / methods MeSH
- Software MeSH
- Models, Statistical MeSH
- Computational Biology / methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Comparative Study MeSH