BACKGROUND: Recently, deep neural networks have been successfully applied in many biological fields. In 2020, the deep learning model AlphaFold won the protein folding competition with predicted structures within the error tolerance of experimental methods. However, this solution to the most prominent bioinformatic challenge of the past 50 years was possible only thanks to a carefully curated benchmark of experimentally determined protein structures. In genomics, we face similar challenges (annotation of genomes and identification of functional elements), but we currently lack benchmarks comparable to the protein folding competition. RESULTS: Here we present a collection of curated and easily accessible sequence classification datasets in the field of genomics. The proposed collection combines novel datasets constructed by mining publicly available databases with existing datasets obtained from published articles. The collection currently contains nine datasets that focus on regulatory elements (promoters, enhancers, open chromatin regions) from three model organisms: human, mouse, and roundworm. A simple convolutional neural network is also included in the repository and can be used as a baseline model. Benchmarks and the baseline model are distributed as the Python package 'genomic-benchmarks', and the code is available at https://github.com/ML-Bioinfo-CEITEC/genomic_benchmarks . CONCLUSIONS: Deep learning techniques have revolutionized many biological fields, but mainly thanks to carefully curated benchmarks. For the field of genomics, we propose a collection of benchmark datasets for the classification of genomic sequences with an interface for the most commonly used deep learning libraries, an implementation of a simple neural network, and a training framework that can be used as a starting point for future research.
The main aim of this effort is to create a repository for shared datasets that will make machine learning for genomics more comparable and reproducible while reducing the overhead of researchers who want to enter the field, leading to healthy competition and new discoveries.
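To make the sequence-classification setting concrete, the sketch below shows one common way a DNA sequence might be prepared as input for a convolutional network: one-hot encoding over A, C, G, T. It is a plain-Python illustration and does not use the genomic-benchmarks package; the function name `one_hot_encode` and the choice to encode unknown bases as all-zero rows are assumptions made for this example.

```python
# Illustrative sketch only: not part of the genomic-benchmarks package.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base, in A, C, G, T column order.

    Bases outside the alphabet (for example N) become all-zero rows.
    That convention is an assumption of this sketch.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == letter else 0 for letter in ALPHABET])
    return rows
```

Calling `one_hot_encode("ACGTN")` yields five rows, the last of which is all zeros; a real pipeline would typically stack such rows into a tensor for the chosen deep learning library.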
Hybrid sterility contributes to speciation by preventing gene flow between related taxa. Prdm9, the first and only hybrid male sterility gene known in vertebrates, predetermines the sites of recombination between homologous chromosomes and their synapsis in early meiotic prophase. The asymmetric binding of PRDM9 to heterosubspecific homologs of Mus musculus musculus × Mus musculus domesticus F1 hybrids and the increase of PRDM9-independent DNA double-strand break hotspots result in difficult-to-repair double-strand breaks, incomplete synapsis of homologous chromosomes, and meiotic arrest at the first meiotic prophase. Here, we show that Prdm9 behaves as a major hybrid male sterility gene in mice outside the Mus musculus musculus × Mus musculus domesticus F1 hybrids, in the genomes composed of Mus musculus castaneus and Mus musculus musculus chromosomes segregating on the Mus musculus domesticus background. The Prdm9cst/dom2 (castaneus/domesticus) allelic combination secures meiotic synapsis, testes weight, and sperm count within physiological limits, while the Prdm9msc1/dom2 (musculus/domesticus) males show a range of fertility impairment. Out of 5 quantitative trait loci contributing to the Prdm9msc1/dom2-related infertility, 4 control either meiotic synapsis or fertility phenotypes and 1 controls both synapsis and fertility. Whole-genome genotyping of individual chromosomes showed preferential involvement of nonrecombinant musculus chromosomes in asynapsis, in accordance with the chromosomal character of hybrid male sterility. Moreover, we show that the overall asynapsis rate can be estimated solely from the genotype of individual males by scoring the effect of nonrecombinant musculus chromosomes. Prdm9-controlled hybrid male sterility represents an example of genetic architecture of hybrid male sterility consisting of genic and chromosomal components.
- MeSH
- Chromosomes MeSH
- Histone-Lysine N-Methyltransferase genetics metabolism MeSH
- Meiosis * genetics MeSH
- Infertility, Male * genetics MeSH
- Mice MeSH
- Semen metabolism MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Mice MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
During meiosis, the recombination-initiating DNA double-strand breaks (DSBs) are repaired by crossovers or noncrossovers (gene conversions). While crossovers are easily detectable, noncrossover identification is hampered by the small size of their converted tracts and the necessity of sequence polymorphism. We report identification and characterization of a mouse chromosome-wide set of noncrossovers by next-generation sequencing of 10 mouse intersubspecific chromosome substitution strains. Based on 94 identified noncrossovers, we determined the mean length of a conversion tract to be 32 bp. The spatial chromosome-wide distribution of noncrossovers and crossovers significantly differed, although both sets overlapped the known hotspots of PRDM9-directed histone methylation and DNA DSBs, thus supporting their origin in the standard DSB repair pathway. A significant deficit of noncrossovers descending from asymmetric DSBs proved their proposed adverse effect on meiotic recombination and pointed to sister chromatids as an alternative template for their repair. The finding has implications for the molecular mechanism of hybrid sterility in mice from crosses between closely related Mus musculus musculus and Mus musculus domesticus subspecies.
- MeSH
- Chromosomes genetics MeSH
- DNA Breaks, Double-Stranded MeSH
- Genetic Fitness MeSH
- Gene Conversion * MeSH
- Histone-Lysine N-Methyltransferase genetics metabolism MeSH
- Histone Code MeSH
- Hybridization, Genetic * MeSH
- Meiosis * MeSH
- Mice, Inbred C57BL MeSH
- Mice MeSH
- Animals MeSH
- Check Tag
- Mice MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
G-quadruplexes (G4s) are a class of stable nucleic acid secondary structures that are known to play a role in a wide spectrum of genomic functions, such as DNA replication and transcription. The classical understanding of G4 structure points to four variable-length guanine strands joined by variable-length nucleotide stretches. Experiments using G4 immunoprecipitation and sequencing have produced a large number of highly probable G4-forming genomic sequences. The expense and technical difficulty of experimental techniques highlight the need for computational approaches to G4 identification. Here, we present PENGUINN, a machine learning method based on convolutional neural networks that learns the characteristics of G4 sequences and accurately predicts G4s, outperforming state-of-the-art methods. We provide both a standalone implementation of the trained model and a web application that can be used to evaluate sequences for their G4 potential.
- Publication Type
- Journal Article MeSH
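The classical description of G4 structure in the abstract above (four guanine runs joined by variable-length loops) can be illustrated with a simple rule-based scan. The sketch below is not PENGUINN and is not taken from the source; it uses a widely cited canonical pattern, and the specific bounds (runs of at least 3 guanines, loops of 1 to 7 nucleotides) as well as the function name `find_g4_motifs` are assumptions of this example. It scans only the strand it is given.

```python
import re

# Canonical rule-based motif (an assumption of this sketch):
# four runs of >= 3 G separated by three loops of 1-7 nucleotides.
G4_MOTIF = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_motifs(sequence):
    """Return (start, end, matched_text) for non-overlapping motif hits.

    Coordinates are 0-based with an exclusive end, as in Python slicing.
    """
    sequence = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(sequence)]
```

A pattern match of this kind is only a baseline: it misses non-canonical G4s, which is part of the motivation for learned predictors such as the one described in the abstract.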
STUDY OBJECTIVES: This study describes high-throughput phenotyping strategies for sleep and circadian behavior in mice, including examinations of robustness, reliability, and heritability among Diversity Outbred (DO) mice and their eight founder strains. METHODS: We performed high-throughput sleep and circadian phenotyping in male mice from the DO population (n = 338) and their eight founder strains: A/J (n = 6), C57BL/6J (n = 14), 129S1/SvlmJ (n = 6), NOD/LtJ (n = 6), NZO/H1LtJ (n = 6), CAST/EiJ (n = 8), PWK/PhJ (n = 8), and WSB/EiJ (n = 6). Using infrared beam break systems, we defined sleep as at least 40 s of continuous inactivity and quantified sleep-wake amounts and bout characteristics. We developed assays to measure sleep latency in a new environment and during a modified Murine Multiple Sleep Latency Test, and estimated circadian period from wheel-running experiments. For each trait, broad-sense heritability (proportion of variability explained by all genetic factors) was derived in founder strains, while narrow-sense heritability (proportion of variability explained by additive genetic effects) was calculated in DO mice. RESULTS: Phenotypes were robust to different inactivity durations to define sleep. Differences across founder strains and moderate/high broad-sense heritability were observed for most traits. There was large phenotypic variability among DO mice, and phenotypes were reliable, although estimates of heritability were lower than in founder mice. This likely reflects important nonadditive genetic effects. CONCLUSIONS: A high-throughput phenotyping strategy in mice, based primarily on monitoring of activity patterns, provides reliable and heritable estimates of sleep and circadian traits. This approach is suitable for discovery analyses in DO mice, where genetic factors explain some proportion of phenotypic variation.
- MeSH
- Collaborative Cross Mice * MeSH
- Phenotype MeSH
- Mice, Inbred Strains MeSH
- Mice, Inbred C57BL MeSH
- Mice, Inbred NOD MeSH
- Mice MeSH
- Reproducibility of Results MeSH
- Sleep * genetics MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Mice MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, N.I.H., Extramural MeSH
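The sleep definition used in the abstract above (at least 40 s of continuous inactivity) can be sketched as a simple run-length rule over binned beam-break counts. This is a minimal illustration, not the authors' pipeline: the 10 s bin width, the function name `sleep_bouts`, and the input layout (one beam-break count per bin) are assumptions of this example.

```python
def sleep_bouts(beam_breaks, bin_seconds=10, min_sleep_seconds=40):
    """Return (start_s, end_s) for each inactivity run long enough to count as sleep.

    beam_breaks: sequence of beam-break counts, one per time bin.
    bin_seconds: assumed bin width; the source does not state the sampling interval.
    """
    bouts = []
    run_start = None
    # A trailing non-zero sentinel closes a run that reaches the end of the record.
    for i, count in enumerate(list(beam_breaks) + [1]):
        if count == 0:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration = (i - run_start) * bin_seconds
            if duration >= min_sleep_seconds:
                bouts.append((run_start * bin_seconds, i * bin_seconds))
            run_start = None
    return bouts
```

Total sleep time and bout counts follow directly from the returned list, and changing `min_sleep_seconds` is one way to probe how sensitive the phenotypes are to the inactivity threshold.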
Genetic reference panels are widely used to map complex, quantitative traits in model organisms. We have generated new high-resolution genetic maps of 259 mouse inbred strains from recombinant inbred strain panels (C57BL/6J × DBA/2J, ILS/IbgTejJ × ISS/IbgTejJ, and C57BL/6J × A/J) and chromosome substitution strain panels (C57BL/6J-Chr#, C57BL/6J-Chr#
- MeSH
- Genotype MeSH
- Mice, Inbred Strains genetics MeSH
- Chromosome Mapping MeSH
- DNA Copy Number Variations MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
Hybrid sterility (HS) belongs to reproductive isolation barriers that safeguard the integrity of species in statu nascendi. Although hybrid sterility occurs almost universally among animal and plant species, most of our current knowledge comes from the classical genetic studies on Drosophila interspecific crosses or introgressions. With the house mouse subspecies Mus m. musculus and Mus m. domesticus as a model, new research tools have become available for studies of the molecular mechanisms and genetic networks underlying HS. Here we used QTL analysis and intersubspecific chromosome substitution strains to identify a 4.7 Mb critical region on Chromosome X (Chr X) harboring the Hstx2 HS locus, which causes asymmetrical spermatogenic arrest in reciprocal intersubspecific F1 hybrids. Subsequently, we mapped autosomal loci on Chrs 3, 9 and 13 that can abolish this asymmetry. Combination of immunofluorescent visualization of the proteins of synaptonemal complexes with whole-chromosome DNA FISH on pachytene spreads revealed that heterosubspecific, unlike consubspecific, homologous chromosomes are predisposed to asynapsis in F1 hybrid male and female meiosis. The asynapsis is under the trans-control of Hstx2 and Hst1/Prdm9 hybrid sterility genes in pachynemas of male but not female hybrids. The finding concurred with the fertility of intersubspecific F1 hybrid females homozygous for the Hstx2(Mmm) allele and resolved the apparent conflict with the dominance theory of Haldane's rule. We propose that meiotic asynapsis in intersubspecific hybrids is a consequence of cis-acting mismatch between homologous chromosomes modulated by the trans-acting Hstx2 and Prdm9 hybrid male sterility genes.
- MeSH
- X Chromosome genetics MeSH
- Genetic Loci genetics MeSH
- Histone-Lysine N-Methyltransferase genetics MeSH
- Hybridization, Genetic MeSH
- Humans MeSH
- Quantitative Trait Loci genetics MeSH
- Meiosis MeSH
- Infertility, Male genetics MeSH
- Mice MeSH
- Chromosome Pairing genetics MeSH
- Reproductive Isolation MeSH
- Synaptonemal Complex genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Mice MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Two important questions in bioacoustics are whether vocal repertoires of animals are graded or discrete and how the vocal expressions are linked to the context of emission. Here we address these questions in an ungulate species. The vocal repertoire of young domestic pigs, Sus scrofa, was quantitatively described based on 1513 calls recorded in 11 situations. We described the acoustic quality of calls with 8 acoustic parameters. Based on these parameters, k-means clustering indicated that either two or five clusters can be distinguished, although the call types are blurred rather than strictly discrete. The division of the vocal repertoire of piglets into two call types has previously been used in many experimental studies of pig acoustic communication, and the five call types correspond well to previously published partial repertoires in specific situations. Clear links exist between the type of situation, its putative valence, and the vocal expression in that situation. These links can be described adequately both with a set of quantitative acoustic variables and through categorisation into call types. The information about the situation of emission of the calls is encoded through five call types almost as accurately as through the full quantitative description.
- MeSH
- Acoustics MeSH
- Animals, Newborn MeSH
- Sus scrofa MeSH
- Vocalization, Animal * MeSH
- Animals MeSH
- Sound Spectrography MeSH
- Check Tag
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
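The k-means step mentioned in the abstract above can be sketched with Lloyd's algorithm in plain Python. This is an illustrative implementation, not the authors' analysis: the function name `kmeans`, the explicit initial centroids, and the idea of passing each call as a tuple of numeric acoustic parameters are assumptions of this example. In practice the parameters would usually be standardized first so that no single one dominates the distance.

```python
def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: return (centroids, labels) after convergence.

    points: list of equal-length numeric tuples (one per call).
    initial_centroids: starting centroids, one per desired cluster.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

Running the function with two and then five initial centroids on the same data is one simple way to compare the coarse and fine partitions discussed in the abstract.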
According to the Dobzhansky-Muller model, hybrid sterility is a consequence of the independent evolution of related taxa resulting in incompatible genomic interactions of their hybrids. The model implies that the incompatibilities evolve randomly, unless a particular gene or nongenic sequence diverges much faster than the rest of the genome. Here we propose that asynapsis of heterospecific chromosomes in meiotic prophase provides a recurrently evolving trigger for the meiotic arrest of interspecific F1 hybrids. We observed extensive asynapsis of chromosomes and disturbance of the sex body in >95% of pachynemas of Mus m. musculus × Mus m. domesticus sterile F1 males. Asynapsis was not preceded by a failure of double-strand break induction, and the rate of meiotic crossing over was not affected in synapsed chromosomes. DNA double-strand break repair was delayed or failed in unsynapsed autosomes, and misexpression of chromosome X and chromosome Y genes was detected in single pachynemas and by genome-wide expression profiling. Oocytes of F1 hybrid females showed the same kind of synaptic problems but with the incidence reduced to half. Most of the oocytes with pachytene asynapsis were eliminated before birth. We propose the heterospecific pairing of homologous chromosomes as a preexisting condition of asynapsis in interspecific hybrids. The asynapsis may represent a universal mechanistic basis of F1 hybrid sterility manifested by pachytene arrest. It is tempting to speculate that a fast-evolving subset of the noncoding genomic sequence important for chromosome pairing and synapsis may be the culprit.
- MeSH
- Apoptosis genetics MeSH
- Biological Evolution MeSH
- Models, Biological MeSH
- Species Specificity MeSH
- DNA Breaks, Double-Stranded MeSH
- Mice, Inbred Strains classification genetics physiology MeSH
- Infertility genetics pathology physiopathology MeSH
- Crosses, Genetic MeSH
- Meiosis genetics MeSH
- Mice, Inbred BALB C MeSH
- Mice, Inbred C57BL MeSH
- Mice MeSH
- Oocytes pathology MeSH
- Chromosome Pairing genetics MeSH
- Recombination, Genetic MeSH
- Spermatocytes pathology MeSH
- Spermatogenesis genetics MeSH
- Pregnancy MeSH
- Transcriptome MeSH
- Genetic Speciation MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Mice MeSH
- Pregnancy MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The Dobzhansky-Muller model of incompatibilities explains reproductive isolation between species by incorrect epistatic interactions. Although the mechanisms of speciation are of great interest, no incompatibility has been characterized at the gene level in mammals. The Hybrid sterility 1 gene (Hst1) participates in the arrest of meiosis in F(1) males of certain strains from two Mus musculus subspecies, e.g., PWD from M. m. musculus and C57BL/6J (henceforth B6) from M. m. domesticus. Hst1 has been identified as a meiotic PR-domain gene (Prdm9) encoding histone 3 methyltransferase in the male offspring of PWD females and B6 males, (PWD×B6)F(1). To characterize the incompatibilities underlying hybrid sterility, we phenotyped reproductive and meiotic markers in males with altered copy numbers of Prdm9. A partial rescue of fertility was observed upon removal of the B6 allele of Prdm9 from the azoospermic (PWD×B6)F(1) hybrids, whereas removing one of the two Prdm9 copies in PWD or B6 background had no effect on male reproduction. Incompatibility(ies) not involving Prdm9(B6) also acts in the (PWD×B6)F(1) hybrids, since the correction of hybrid sterility by Prdm9(B6) deletion was not complete. Additions and subtractions of Prdm9 copies, as well as allelic replacements, improved meiotic progression and fecundity also in the progeny-producing reciprocal (B6×PWD)F(1) males. Moreover, an increased dosage of Prdm9 and reciprocal cross enhanced fertility of other sperm-carrying male hybrids, (PWD×B6-C3H.Prdm9)F(1), harboring another Prdm9 allele of M. m. domesticus origin. The levels of Prdm9 mRNA isoforms were similar in the prepubertal testes of all types of F(1) hybrids of PWD with B6 and B6-C3H.Prdm9 despite their different prospective fertility, but decreased to 53% after removal of Prdm9(B6). Therefore, the Prdm9(B6) allele probably takes part in posttranscriptional dominant-negative hybrid interaction(s) absent in the parental strains.
- MeSH
- Alleles MeSH
- Chimera * genetics physiology MeSH
- Fertility genetics MeSH
- Epistasis, Genetic * MeSH
- Histone-Lysine N-Methyltransferase genetics MeSH
- Hybridization, Genetic MeSH
- Chromosome Mapping MeSH
- Meiosis MeSH
- Infertility, Male genetics MeSH
- Mice MeSH
- Reproductive Isolation MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Mice MeSH
- Female MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH