Annotation
Secondary structure elements (SSEs) are inherent parts of protein structures, and their arrangement is characteristic of each protein family. Annotation of SSEs can therefore facilitate orientation in the vast number of homologous structures that are now available for many protein families. It also provides a way to identify and annotate key regions, such as active sites and channels, and subsequently to answer key research questions, such as understanding molecular function and its variability. This chapter introduces the concept of SSE annotation and describes the workflow for obtaining SSE annotations for the members of a selected protein family using the program SecStrAnnotator.
Objectives: This work aims to create a portable tool with decision support that provides relevant methods for evaluating physical activity in real time. Methods: We used an accelerometer-equipped ez430Chronos watch in conjunction with a preconfigured RaspberryPi-based setup. Accelerometer data are transmitted wirelessly via the WebSocket protocol to a running instance of a web application, which serves as the user frontend. Decision support is based on a Weka classifier. Results: The proposed framework is ready to be used for the annotation and basic evaluation of physical activity data in Wi-Fi-covered areas. Minor issues relate to the occasional instability of data transmission, which still has to be handled. Conclusions: We found the overall framework architecture robust enough to serve its purpose. The next development steps will expand the outlined functionality.
Recent technological advances have made next-generation sequencing (NGS) a popular and financially accessible technique that allows a broad range of analyses to be done simultaneously. The huge amount of newly generated NGS data, however, requires advanced software support both for analyzing the data and for interpreting the results biologically. In this article, we describe SATrans (Software for Annotation of Transcriptome), a software package providing fast and robust functional annotation of novel sequences obtained from transcriptome sequencing. Moreover, it performs advanced gene ontology analysis of differentially expressed genes, thereby helping to interpret, biologically and in a user-friendly form, the quantitative changes in gene expression. The software is freely available and can process thousands of sequences on a standard personal computer or notebook running the Linux operating system.
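The abstract does not say which statistical test SATrans uses for its gene ontology analysis. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, and the sketch below illustrates only that general technique. The function name and the toy numbers are hypothetical and are not taken from SATrans.

```python
from math import comb

def go_enrichment_pvalue(population, annotated, selected, overlap):
    """One-sided hypergeometric test (illustrative, not SATrans code).

    population: all genes considered
    annotated:  genes in the population carrying the GO term
    selected:   differentially expressed genes
    overlap:    differentially expressed genes carrying the GO term
    Returns P(X >= overlap) when `selected` genes are drawn at random.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total

# Toy numbers: 10 genes, 3 with the term, 4 differentially expressed, 2 shared.
p_value = go_enrichment_pvalue(population=10, annotated=3, selected=4, overlap=2)
print(p_value)
```

In practice such a test is run once per GO term, so the resulting p-values would normally be corrected for multiple testing.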
Annotation of multiple regions of interest across the whole mouse brain is an indispensable process for quantitative evaluation of a multitude of study endpoints in neuroscience digital pathology. Prior experience and domain expert knowledge are key to the quality and consistency of image annotation. At present, image annotation is often performed manually by certified pathologists or trained technicians, which limits the total throughput of studies performed at neuroscience digital pathology labs. It may also mean that simpler and quicker methods of examining tissue samples are used by non-pathologists, especially in the early stages of research and preclinical studies. To address these limitations and to meet the growing demand for image analysis in a pharmaceutical setting, we developed AnNoBrainer, an open-source software tool that leverages deep learning, image registration, and standard cortical brain templates to automatically annotate individual brain regions on 2D pathology slides. Application of AnNoBrainer to a published set of pathology slides from transgenic mouse models of synucleinopathy revealed comparable accuracy, increased reproducibility, and a significant reduction (~50%) in time spent on brain annotation, quality control and labelling compared to trained scientists in pathology. Taken together, AnNoBrainer offers rapid, accurate, and reproducible automated annotation of mouse brain images that largely meets the experts' histopathological assessment standards (>85% of cases) and enables high-throughput image analysis workflows in digital pathology labs.
OBJECTIVES: Our main objective is to design a method of, and supporting software for, interactive correction and semantic annotation of narrative clinical reports, which would allow for their easier and less error-prone processing outside their original context: first, by physicians unfamiliar with the original language (and possibly also the source specialty), and second, by tools requiring structured information, such as decision-support systems. Our additional goal is to gain insights into the process of narrative report creation, including the errors and ambiguities arising therein, and also into the process of annotating reports with clinical terms. Finally, we also aim to provide a dataset of ground-truth transformations (specific to Czech as the source language), set up by expert physicians, which can be reused in the future for subsequent analytical studies and for training automated transformation procedures. METHODS: A three-phase preprocessing method has been developed to support secondary use of narrative clinical reports in electronic health records. Narrative clinical reports are narrative texts of healthcare documentation often stored in electronic health records. In the first phase, a narrative clinical report is tokenized. In the second phase, the tokenized clinical report is normalized. The normalized clinical report is easily readable for health professionals who know the language used in the narrative clinical report. In the third phase, the normalized clinical report is enriched with extracted structured information. The final result of the third phase is a semi-structured normalized clinical report in which the extracted clinical terms are matched to codebook terms. Software tools for interactive correction, expansion and semantic annotation of narrative clinical reports have been developed, and the three-phase preprocessing method has been validated in the field of cardiology.
RESULTS: The three-phase preprocessing method was validated on 49 anonymous Czech narrative clinical reports in the field of cardiology. Descriptive statistics from the database of completed transformations have been calculated. Two cardiologists participated in the annotation phase. The first cardiologist annotated 1500 clinical terms found in the 49 narrative clinical reports with codebook terms using the classification systems ICD-10, SNOMED CT, LOINC and LEKY. The second cardiologist validated the annotations of the first cardiologist. The correct clinical terms and the codebook terms have been stored in a database. CONCLUSIONS: We extracted structured information from Czech narrative clinical reports using the proposed three-phase preprocessing method and linked it to electronic health records. The software tool, although generic, is tailored to Czech as the specific language of the electronic health record pool under study. This will provide a potential reference standard for porting this approach to dozens of other less-spoken languages. Structured information can support medical decision making, quality assurance tasks and further medical research.
- MeSH
- electronic health records standards MeSH
- international classification of diseases MeSH
- writing standards MeSH
- vocabulary, controlled * MeSH
- semantics * MeSH
- guidelines as topic MeSH
- meaningful use standards MeSH
- software MeSH
- data accuracy MeSH
- machine learning * MeSH
- user-computer interface MeSH
- natural language processing * MeSH
- word processing standards MeSH
- Publication type
- journal article MeSH
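The three-phase preprocessing described in the clinical-report abstract above (tokenize, normalize, annotate against codebooks) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' software: the abbreviation table, the two-entry codebook, the English sample text, and all function names are hypothetical, and the real tool is interactive and works on Czech reports.

```python
import re

# Hypothetical toy resources; the actual abbreviation lists and codebooks
# used in the study are not given in the abstract.
ABBREVIATIONS = {"pt": "patient", "htn": "hypertension", "af": "atrial fibrillation"}
CODEBOOK = {
    "hypertension": ("ICD-10", "I10"),
    "atrial fibrillation": ("ICD-10", "I48"),
}

def tokenize(report):
    """Phase 1: split the raw report into lowercase word tokens."""
    return re.findall(r"\w+", report.lower())

def normalize(tokens):
    """Phase 2: expand known abbreviations into readable text."""
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def annotate(text):
    """Phase 3: list the codebook terms that occur in the normalized text."""
    return [
        (term, system, code)
        for term, (system, code) in CODEBOOK.items()
        if re.search(rf"\b{re.escape(term)}\b", text)
    ]

report = "Pt with HTN and AF."
normalized = normalize(tokenize(report))
annotations = annotate(normalized)
print(normalized)
print(annotations)
```

Keeping the phases as separate functions mirrors the abstract's description, in which the normalized text is itself a useful intermediate product for human readers before any codes are attached.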
Accurate annotation of genomic variants in human diseases is essential to allow personalized medicine. Assessment of somatic and germline TP53 alterations has now reached the clinic and is required in several circumstances such as the identification of the most effective cancer therapy for patients with chronic lymphocytic leukemia (CLL). Here, we present Seshat, a Web service for annotating TP53 information derived from sequencing data. A flexible framework allows the use of standard file formats such as Mutation Annotation Format (MAF) or Variant Call Format (VCF), as well as common TXT files. Seshat performs accurate variant annotations using the Human Genome Variation Society (HGVS) nomenclature and the stable TP53 genomic reference provided by the Locus Reference Genomic (LRG). In addition, using the 2017 release of the UMD_TP53 database, Seshat provides multiple statistical information for each TP53 variant including database frequency, functional activity, or pathogenicity. The information is delivered in standardized output tables that minimize errors and facilitate comparison of mutational data across studies. Seshat is a beneficial tool to interpret the ever-growing TP53 sequencing data generated by multiple sequencing platforms and it is freely available via the TP53 Website, http://p53.fr or directly at http://vps338341.ovh.net/.
- MeSH
- molecular sequence annotation MeSH
- databases, genetic * MeSH
- genetic variation genetics MeSH
- genomics trends MeSH
- internet MeSH
- humans MeSH
- mutation MeSH
- tumor suppressor protein p53 genetics MeSH
- software * MeSH
- computational biology trends MeSH
- high-throughput nucleotide sequencing MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
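To make the relationship between a VCF record and HGVS nomenclature in the Seshat abstract above more concrete, the sketch below turns a single-nucleotide substitution from one VCF data line into an HGVS-style genomic description of the form `reference:g.positionREF>ALT`. It is only an illustration, not Seshat's implementation: the function name, the placeholder reference identifier `REFSEQ`, and the toy position are assumptions, and real annotation also has to handle indels, multi-allelic records, and coordinate mapping onto the chosen reference.

```python
def vcf_snv_to_hgvs_g(vcf_line, reference):
    """Format a single-nucleotide VCF record as an HGVS-style g. description.

    `reference` is the identifier of the reference sequence the position
    refers to (a hypothetical placeholder in the example below).
    """
    # The first five fixed VCF columns are CHROM, POS, ID, REF, ALT.
    _chrom, pos, _variant_id, ref, alt = vcf_line.rstrip("\n").split("\t")[:5]
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("only single-nucleotide substitutions are handled here")
    return f"{reference}:g.{int(pos)}{ref}>{alt}"

# Toy record with a made-up position; not a real TP53 variant.
line = "17\t12345\t.\tC\tT\t.\tPASS\t."
description = vcf_snv_to_hgvs_g(line, reference="REFSEQ")
print(description)
```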
- MeSH
- stress, physiological MeSH
- hypertrophy etiology MeSH
- humans MeSH
- heart ventricles pathology MeSH
- Check Tag
- humans MeSH
Genome sequencing of the human parasite Schistosoma mansoni revealed an interesting gene superfamily, called micro-exon genes (meg), that encodes secreted MEG proteins. The genes are composed of short exons (3-81 base pairs) regularly interspersed with long introns (up to 5 kbp). This article compiles 35 S. mansoni-specific meg genes that are distributed over 7 autosomes and one pair of sex chromosomes and that code for at least 87 verified MEG proteins. We used various bioinformatics tools to produce an optimal alignment and to propose a phylogenetic analysis. This work highlighted intriguing conserved patterns/motifs in the sequences of the highly variable MEG proteins. Based on the analyses, we were able to classify the verified MEG proteins into two subfamilies and to hypothesize about their duplication and colonization of all the chromosomes. Together with motif identification, we also propose revisiting the common names and annotation of MEGs in order to avoid duplication, to support the reproducibility of research results and to avoid possible misunderstandings.
- MeSH
- exons genetics MeSH
- phylogeny MeSH
- humans MeSH
- chromosome mapping MeSH
- reproducibility of results MeSH
- Schistosoma mansoni * genetics MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH