database pattern analysis
Background: Administrative health care databases are increasingly used as research tools, but they generally contain only health insurance claims data, which on their own are insufficient for epidemiological research. Creating a dataset appropriate for a specific analysis requires technical expertise and familiarity with data analysis. The aim of our research is to develop a data warehouse (DW) accessible to epidemiology researchers who lack this expertise. Methods: First, we added attributes commonly used in epidemiology to the National Database of Health Insurance Claims of Japan (NDB) to construct a Research Question Oriented DB. Second, we developed a versatile analysis unit schema by which the Research Question Oriented DW was reconstructed as per-patient units covering demographics such as sex and age group. Third, we proposed a pattern relational calculus by which research-specific attributes can be added without expert knowledge of SQL. Finally, we applied the DW in two epidemiological studies. Results: In both studies, the coverage of attributes constructed by the versatile analysis unit schema alone was limited: it covered 12% (3/25) of the attributes used in one study and 15% (3/20) in the other. The pattern relational calculus we proposed, by contrast, covered all remaining attributes that the researchers used in their studies. Conclusion: Because the versatile analysis unit schema and the pattern relational calculus together covered all attributes used in the two epidemiological studies, our method allows researchers with little knowledge of SQL to carry out their epidemiological studies, at least within this limited scope. Abbreviations and Terminologies: NDB-SD: NDB Sampling Data set; DW: Data Warehouse; Schema: design of attributes in relations in the relational model theory; Relation: table with no duplicate tuple; Attribute: column name or variable name in relations; Primary key: one or more attributes that uniquely identify each tuple in a relation; Tuple: combination of attributes in a relation, almost the same meaning as row; Tuple relational calculus: logical expression used in the relational model theory; SQL: database language based on the relational model theory.
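The abstract does not define the versatile analysis unit schema or the pattern relational calculus, so the following is only a minimal sketch of the general idea: claim rows are collapsed into one record per patient, and a research-specific attribute is then added as a simple pattern over that patient's claim codes. The row layout, the codes `DX_A` and `RX_B`, and the function names are all hypothetical and are not taken from the NDB or from the paper.

```python
# Hypothetical claim rows: (patient_id, sex, age_group, claim_code).
# Neither the layout nor the codes come from the NDB; they are illustrative.
CLAIMS = [
    ("p1", "F", "40-49", "DX_A"),
    ("p1", "F", "40-49", "RX_B"),
    ("p2", "M", "60-69", "RX_B"),
]


def per_patient_units(claims):
    """Collapse claim rows into one record per patient (the analysis unit)."""
    units = {}
    for patient_id, sex, age_group, code in claims:
        unit = units.setdefault(
            patient_id, {"sex": sex, "age_group": age_group, "codes": set()}
        )
        unit["codes"].add(code)
    return units


def add_attribute(units, name, required_codes):
    """Add a boolean attribute that is True when a patient has every required code."""
    required = set(required_codes)
    for unit in units.values():
        unit[name] = required <= unit["codes"]
    return units


units = per_patient_units(CLAIMS)
add_attribute(units, "dx_a_with_rx_b", ["DX_A", "RX_B"])
```

In this sketch a researcher only names the codes that make up the pattern; the grouping and set logic that would otherwise be written in SQL stays inside the helper functions.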
- MeSH
- data analysis * MeSH
- big data MeSH
- databases as topic MeSH
- epidemiologic studies * MeSH
- humans MeSH
- delivery of health care MeSH
- universal health insurance MeSH
- Check Tag
- humans MeSH
- Geographic names
- Japan MeSH
Diamond-Blackfan Anemia (DBA) is characterized by a defect of erythroid progenitors and, clinically, by anemia and malformations. DBA exhibits an autosomal dominant pattern of inheritance with incomplete penetrance. Currently nine genes, all encoding ribosomal proteins (RP), have been found mutated in approximately 50% of patients. Experimental evidence supports the hypothesis that DBA is primarily the result of defective ribosome synthesis. By means of a large collaboration among six centers, we report here a mutation update that includes nine genes and 220 distinct mutations, 56 of which are new. The DBA Mutation Database now includes data from 355 patients. Of those in whom inheritance has been examined, 125 patients carry a de novo mutation and 72 an inherited mutation. Mutagenesis may be ascribed to slippage in 65.5% of indels, whereas CpG dinucleotides are involved in 23% of transitions. Using bioinformatic tools, we show that gene conversion is not a common mechanism of mutagenesis in RP genes, notwithstanding the abundance of RP pseudogenes. Genotype-phenotype analysis reveals that malformations are more frequently associated with mutations in RPL5 and RPL11 than in the other genes. All currently reported DBA mutations, together with their functional and clinical data, are included in the DBA Mutation Database. © 2010 Wiley-Liss, Inc.
- MeSH
- Diamond-Blackfan anemia diagnosis genetics MeSH
- genetic association studies MeSH
- genetic databases * MeSH
- humans MeSH
- molecular sequence data MeSH
- mutation * genetics MeSH
- mutagenesis genetics MeSH
- ribosomal proteins genetics MeSH
- ribosomes * genetics MeSH
- base sequence MeSH
- Check Tag
- humans MeSH
- Publication type
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
Silene vulgaris possesses ecotype-specific tolerance to high levels of copper in the soil. Although this was reported a few decades ago, little is known about this trait at the molecular level. The aim of this study was to analyze the transcriptional response to elevated copper concentrations in two S. vulgaris ecotypes originating from soil types with contrasting copper content: copper-tolerant Lubietova and copper-sensitive Stranska skala. To reveal whether the plants are transcriptionally affected, we first analyzed the HMA7 gene, a known key player in copper metabolism. Based on BAC library screening, we identified a BAC clone containing a SvHMA7 sequence with all the structural properties specific for plant copper-transporting ATPases. The functionality of the gene was tested using heterologous complementation in yeast mutants. Analyses of SvHMA7 transcription patterns showed that both ecotypes studied up-regulated SvHMA7 transcription after the copper treatment. Our data are supported by analysis of appropriate reference genes based on RNA-Seq databases. To identify genes specifically involved in the copper response in the studied ecotypes, we analyzed transcription profiles of genes encoding Cu-transporting proteins and genes involved in the prevention of copper-induced oxidative stress in both ecotypes. Our data show that three genes (APx, POD and COPT5) differ in their transcription pattern between the ecotypes, with constitutively increased transcription in Lubietova. Taken together, we have identified transcription differences between metalliferous and non-metalliferous ecotypes of S. vulgaris, and we have suggested candidate genes participating in metal tolerance in this species.
- MeSH
- adenosine triphosphatases genetics metabolism MeSH
- nucleic acid databases MeSH
- ecotype MeSH
- gene library MeSH
- plant roots drug effects genetics growth and development physiology MeSH
- copper metabolism pharmacology MeSH
- organ specificity MeSH
- cation transport proteins genetics metabolism MeSH
- plant gene expression regulation * MeSH
- plant RNA chemistry genetics MeSH
- plant proteins genetics metabolism MeSH
- RNA sequence analysis MeSH
- Silene drug effects genetics growth and development physiology MeSH
- transcriptome * MeSH
- plant shoots drug effects genetics growth and development physiology MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
... Contents -- Contributors vii -- Acknowledgements ix -- 1 GIS and spatial analysis: introduction and overview ... ... -- 2 A review of statistical spatial analysis in geographical information systems 13 -- Trevor C. ... ... Haining -- 4 Spatial analysis and GIS 65 -- Morton E. ?’ ... ... pattern analysers relevant to GIS 83 -- Stan ? ... ... Ralston -- PART III GIS AND SPATIAL ANALYSIS: APPLICATIONS 187 -- 10 Urban analysis in a GIS environment ...
[1st ed.] 281 p.
- Keywords
- geographic information system
- Conspectus
- Medical sciences. Medicine
- NLK Subject Areas
- environmental sciences
- medical informatics
One of the aims of high-throughput gene/protein profiling experiments is the identification of biological processes altered between two or more conditions. Pathway analysis is an umbrella term for a multitude of computational approaches used for this purpose. While in the beginning pathway analysis relied on enrichment-based approaches, a newer generation of methods is now available, exploiting pathway topologies in addition to gene/protein expression levels. However, little effort has been invested in their critical assessment with respect to their performance in different experimental setups. Here, we assessed the performance of seven representative methods identifying differentially expressed pathways between two groups of interest based on gene expression data with prior knowledge of pathway topologies: SPIA, PRS, CePa, TAPPA, TopologyGSA, Clipper and DEGraph. We performed a number of controlled experiments that investigated their sensitivity to sample and pathway size, threshold-based filtering of differentially expressed genes, ability to detect target pathways, ability to exploit the topological information and the sensitivity to different pre-processing strategies. We also verified type I error rates and described the influence of overexpression of single genes, gene sets and topological motifs of various sizes on the detection of a pathway as differentially expressed. The results of our experiments demonstrate a wide variability of the tested methods. We provide a set of recommendations for an informed selection of the proper method for a given data analysis task.
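For context, the older enrichment-based style of pathway analysis mentioned above is often implemented as an over-representation test. The sketch below is not one of the seven topology-based methods assessed in the paper; it only illustrates the baseline idea with a one-sided hypergeometric test. The function name and argument names are assumptions chosen for this example.

```python
from math import comb


def enrichment_p_value(n_genes, n_pathway, n_de, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    n_genes   -- total number of genes measured
    n_pathway -- number of measured genes that belong to the pathway
    n_de      -- number of differentially expressed (DE) genes
    n_overlap -- number of DE genes that belong to the pathway
    Returns P(X >= n_overlap) when n_de genes are drawn without replacement.
    """
    upper = min(n_pathway, n_de)
    total = comb(n_genes, n_de)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(n_pathway, k) * comb(n_genes - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

A test of this kind uses only gene membership, which is why the topology-based methods discussed in the abstract were developed to exploit the pathway structure as well.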
- MeSH
- datasets as topic MeSH
- genetic databases MeSH
- humans MeSH
- metabolic networks and pathways * MeSH
- gene expression profiling methods MeSH
- high-throughput nucleotide sequencing MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- comparative study MeSH
This paper is devoted to the automatic localization of objects (eyes, mouth) in two-dimensional (2D) grey-scale images of faces. The work is motivated by a practical problem in human genetics: the output of the localization in the given database of images is needed for further tasks in genetic research. A robust filter is applied to the images to remove noise. Templates are used as the main method. The mouth and both eyes are localized jointly using the weighted Pearson product-moment correlation coefficient or its robust analogue based on robust regression methods. In the database of 212 images of faces, the method locates the mouth and eyes correctly in 100% of cases. The robust correlation coefficient based on least weighted squares regression also localizes the mouth and both eyes in 100% of the images in the database. The robustness of the method is examined with respect to rotation, noise, occlusion and asymmetry in the image. The joint localization of the mouth and both eyes makes the method invariant to rotation of the face by any angle. This work is tailored to the given images, with the methods expected to be used in genetic applications.
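As an illustration of the similarity measure named in the abstract, the weighted Pearson product-moment correlation between a template and an image patch can be sketched as follows. The abstract does not state how the weights are chosen, so `weights` is left as a free, hypothetical input, and the pixel values are assumed to be flattened into plain lists. The robust variant based on least weighted squares is not shown.

```python
from math import sqrt


def weighted_pearson(x, y, weights):
    """Weighted Pearson correlation between two equal-length sequences."""
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    # Weighted covariance and variances; the common 1/total factor cancels.
    cov = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    var_x = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    var_y = sum(w * (yi - mean_y) ** 2 for w, yi in zip(weights, y))
    return cov / sqrt(var_x * var_y)


# Hypothetical flattened grey values of a template and a candidate patch.
template = [10.0, 20.0, 30.0, 40.0]
patch = [12.0, 22.0, 32.0, 42.0]
weights = [1.0, 2.0, 2.0, 1.0]
score = weighted_pearson(template, patch, weights)
```

In template matching, a score of this kind would be computed for each candidate position, and the position with the highest correlation taken as the location of the object.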
- Keywords
- object localization, templates, eye and mouth detection, robust correlation analysis, noise reduction
- MeSH
- biometry methods MeSH
- contrast sensitivity physiology MeSH
- databases as topic standards MeSH
- photography methods MeSH
- genetic research MeSH
- computer-assisted image interpretation methods MeSH
- humans MeSH
- face MeSH
- eye MeSH
- computer-assisted image processing methods MeSH
- regression analysis MeSH
- reproducibility of results MeSH
- physiological pattern recognition physiology MeSH
- subtraction technique standards MeSH
- mouth MeSH
- image enhancement methods MeSH
- Check Tag
- humans MeSH
This paper discusses the impact of the classification method and feature selection on the accuracy of speech emotion recognition. Selecting the correct parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is especially necessary for systems that will be deployed in real-time applications. Speech emotion recognition systems are being developed and improved because of their wide applicability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. The classification accuracy of artificial neural networks, k-nearest neighbours, and a Gaussian mixture model is measured with respect to the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of a speech emotion recognition system with regard to its accuracy and efficiency.
- MeSH
- algorithms * MeSH
- emotions physiology MeSH
- factual databases MeSH
- voice quality MeSH
- humans MeSH
- neural networks MeSH
- computer-assisted signal processing instrumentation MeSH
- speech physiology MeSH
- ROC curve MeSH
- automated pattern recognition * MeSH
- physiological pattern recognition physiology MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
The epidemiological cross-sectional study conducted in 1992 comprised 722 parturient women from the 1st Gynaecological and Obstetric Clinic of the Faculty Hospital in Košice. The database of the group was created in the software SOR (Správa o rodičke, report on the parturient woman) and SON (Správa o novorodencovi, report on the newborn). Analysis of selected social factors revealed their significant influence on some reproductive indicators of the women. The mother's education significantly affected the number of deliveries. Among women with three or more deliveries, highly statistically significant differences were found between those with university education and those with elementary education, and the overall chi-square analysis of the group also indicates the significance of the mothers' education. No influence of the mother's education was demonstrated on the number of spontaneous and induced abortions. The marital status of the mother had a statistically significant effect on gestational age: a gestational age of 38 weeks or more was recorded in 90.4% of married mothers, but in only 77.2% of mothers without a partner (p < 0.001).
BACKGROUND: Seasonality at the clinical onset of type 1 diabetes (T1D) has been suggested by different studies, but the results are conflicting. This study aimed to evaluate the presence of seasonality at clinical onset of T1D based on the SWEET database comprising data from 32 different countries. METHODS: The study cohort included 23 603 patients (52% males) recorded in the international multicenter SWEET database (48 centers), with T1D onset ≤20 years, year of onset between 1980 and 2015, and gender, year and month of birth and T1D diagnosis documented. Data were stratified according to four age groups (<5, 5-<10, 10-<15, 15-20 years) at T1D onset, the latitude of the European center (Northern Europe ≥50°N and Southern Europe <50°N) and the year of onset (≤2009 or >2009). RESULTS: Analysis by month revealed significant seasonality, with January being the month with the highest and June the month with the lowest percentage of incident cases (P < .001). Winter, early spring and late autumn months had a higher percentage of incident cases than late spring and summer months. Stratification by age showed similar seasonality patterns in all four age groups (P ≤ .003 each), but not in children <24 months of age. There was no gender or latitude effect on the seasonality pattern; however, the pattern differed by the year of onset (P < .001). Seasonality of diagnosis conformed to a sinusoidal model for all cases, females and males, age groups, and northern and southern European countries. CONCLUSIONS: Seasonality at T1D clinical onset is documented by the large SWEET database, with no gender or latitude (Europe only) effect, although the pattern varied with the year of manifestation.
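The abstract does not specify which sinusoidal model was fitted, so the following is only a sketch of one common choice: a single-harmonic model with a 12-month period, fitted by least squares to monthly percentages. The input values are synthetic and are not data from the SWEET database; index 0 is assumed to stand for January, and the function name is hypothetical.

```python
from math import atan2, cos, hypot, pi, sin


def fit_sinusoid(values):
    """Least-squares fit of y_t = mean + amplitude * cos(omega * (t - peak)).

    Assumes the values are equally spaced over exactly one full period
    (for example 12 monthly values), in which case the least-squares
    estimates have the closed form used below.
    """
    n = len(values)
    omega = 2 * pi / n
    mean = sum(values) / n
    a = 2 / n * sum(y * cos(omega * t) for t, y in enumerate(values))
    b = 2 / n * sum(y * sin(omega * t) for t, y in enumerate(values))
    amplitude = hypot(a, b)
    peak = (atan2(b, a) / omega) % n  # position of the maximum, in months
    return mean, amplitude, peak


# Synthetic monthly percentages with a maximum at index 3 (not study data).
synthetic = [8.0 + 1.5 * cos(2 * pi * (t - 3) / 12) for t in range(12)]
mean, amplitude, peak = fit_sinusoid(synthetic)
```

The fitted amplitude relative to the mean gives a rough measure of how pronounced the seasonal pattern is, and the peak estimate indicates the month around which incident cases concentrate.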
- MeSH
- type 1 diabetes mellitus epidemiology MeSH
- child MeSH
- cohort studies MeSH
- infant MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- preschool child MeSH
- seasons * MeSH
- Check Tag
- child MeSH
- infant MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- male MeSH
- preschool child MeSH
- female MeSH
- Publication type
- journal article MeSH
- multicenter study MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Europe epidemiology MeSH
Although humic acids (HA) are involved in many biological processes in soils and their ecological importance has therefore received much attention, the degradative pathways and corresponding catalytic genes underlying HA degradation by bacteria remain unclear. To resolve these uncertainties, we analyzed transcriptomes extracted from Pseudomonas sp. PAMC 26793 cells time-dependently induced in the presence of HA in a lab flask. Out of 6288 genes, 299 (microarray) and 585 (RNA-seq) were up-regulated by > 2.0-fold in HA-induced cells compared with controls. A significant portion (9.7% in microarray and 24.1% in RNA-seq) of these genes are predicted to function in the transport and metabolism of small-molecule compounds, which could result from microbial HA degradation. To further identify lignin (a surrogate for HA)-degradative genes, 6288 protein sequences were analyzed against the carbohydrate-active enzyme database and a self-curated list of putative lignin-degradative genes. Out of 19 genes predicted to function in lignin degradation, several genes encoding laccase, dye-decolorizing peroxidase, vanillate O-demethylase oxygenase and reductase, and biphenyl 2,3-dioxygenase were up-regulated > 2.0-fold in RNA-seq. This induction was further confirmed by qRT-PCR, validating the likely involvement of these genes in the degradation of HA.
- MeSH
- bacterial genes MeSH
- biodegradation MeSH
- protein databases MeSH
- humic substances microbiology MeSH
- lignin metabolism MeSH
- metabolic networks and pathways * MeSH
- Pseudomonas genetics metabolism MeSH
- soil microbiology * MeSH
- bacterial gene expression regulation MeSH
- gene expression profiling * MeSH
- tundra * MeSH
- Publication type
- journal article MeSH