Query: advanced reference assembly

13 hits in PubMed

Article

An advanced reference genome of Trifolium subterraneum L. reveals genes related to agronomic performance

We reported a draft genome assembly of the subterranean clover TSUd_r1.1. ...

Kaur, Parwinder
Centre for Plant Genetics and Breeding and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
Bayer, Philipp E
School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
Milec, Zbyněk
Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
Vrána, Jan
Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
Yuan, Yuxuan
School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
Appels, Rudi
Murdoch University, Murdoch, WA, Australia
Edwards, David
School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
Batley, Jacqueline
School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
Nichols, Phillip
School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia; Department of Agriculture and Food Western Australia, South Perth, WA, Australia
Erskine, William
Centre for Plant Genetics and Breeding and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia

Plant biotechnology journal. 2017 Aug ; 15 (8) : 1034-1046. [epub] 20170323

Plant Biotechnol J
ISSN 1467-7652 | 1467-7644

Subterranean clover is an important annual forage legume, whose diploidy and inbreeding nature make it an ideal model for genomic analysis in Trifolium. We reported a draft genome assembly of the subterranean clover TSUd_r1.1. Here we evaluate genome mapping on nanochannel arrays and generation of a transcriptome atlas across tissues to advance the assembly and gene annotation. Using a BioNano-based assembly spanning 512 Mb (93% genome coverage), we validated the draft assembly, anchored unplaced contigs and resolved misassemblies. Multiple contigs (264) from the draft assembly coalesced into 97 super-scaffolds (43% of genome). Sequences longer than 1 Mb increased from 40 to 189 Mb, giving a 1.4-fold increase in N50, and the proportion of the genome in pseudomolecules improved from 73% to 80%. The advanced assembly was re-annotated using transcriptome atlas data to contain 31 272 protein-coding genes capturing >96% of the gene content. Functional characterization and GO enrichment confirmed gene expression for response to water deprivation, flavonoid biosynthesis and embryo development ending in seed dormancy, reflecting adaptation to the harsh Mediterranean environment. Comparative analyses across Papilionoideae identified 24 893 Trifolium-specific and 6325 subterranean-clover-specific genes that could be mined further for traits such as geocarpy and grazing tolerance. Eight key traits, including persistence and improved livestock health through isoflavonoid production, in addition to important agro-morphological traits, were fine-mapped on the high-density SNP linkage map anchored to the assembly. This new genomic information is crucial for identifying loci governing traits, enabling marker-assisted breeding, comparative mapping and identification of tissue-specific gene promoters for biotechnological improvement of forage legumes.

Keywords
BioNano, Legume comparative genomics, advanced reference assembly, forage legumes, gene expression, transcriptome
MeSH
Genome, Plant / genetics
Genomics / methods
Sequence Analysis, DNA / methods
Trifolium / genetics
Publication type
Journal Article

Article

Complete analysis of G-quadruplex forming sequences in the gapless assembly of human chromosome Y

Recent advancements have finally delivered a complete human genome assembly, including the elusive Y ...

Biochimie. 2025 Feb ; 229 () : 49-57. [epub] 20241009

ISSN 1638-6183 | 0300-9084

Recent advancements have finally delivered a complete human genome assembly, including the elusive Y chromosome. This accomplishment closes a significant knowledge gap. Prior efforts were hampered by challenges in sequencing repetitive DNA structures such as direct and inverted repeats. We used the G4Hunter algorithm to analyze the presence of G-quadruplex forming sequences (G4s) within the current human reference genome (GRCh38) and the new telomere-to-telomere (T2T) Y chromosome assemblies. This analysis served a dual purpose: identifying the location of potential G4s within the genomes and exploring their association with functionally annotated sequences. Compared to GRCh38, the T2T assembly exhibited a significantly higher prevalence of G-quadruplex forming sequences. Notably, these repeats were abundantly located around precursor RNA, exons, genes, and within protein binding sites. This remarkable co-occurrence of G4-forming sequences with these critical regulatory regions suggests their role in fundamental DNA regulation processes. Our findings indicate that the current human reference genome significantly underestimated the number of G4s, potentially overlooking their functional importance.
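The abstract above names the G4Hunter algorithm but does not describe it. The following is a minimal sketch of the scoring idea as it is commonly summarized: each G in a run of n consecutive Gs scores min(n, 4), each C scores the negative equivalent, A and T score 0, and the per-base scores are averaged over a sliding window. The function names, the default window of 25 and the sample sequence are illustrative assumptions, not details taken from the study, and this is not the authors' implementation.

```python
def g4hunter_base_scores(seq):
    """Per-base scores: G runs score +min(run, 4), C runs -min(run, 4), others 0."""
    seq = seq.upper()
    scores = [0] * len(seq)
    i = 0
    while i < len(seq):
        base = seq[i]
        if base in "GC":
            # Find the end of the current run of identical G or C bases.
            j = i
            while j < len(seq) and seq[j] == base:
                j += 1
            run = min(j - i, 4)  # run length is capped at 4
            value = run if base == "G" else -run
            for k in range(i, j):
                scores[k] = value
            i = j
        else:
            i += 1
    return scores


def window_scores(seq, window=25):
    """Mean per-base score for every window of the given size (assumed default 25)."""
    scores = g4hunter_base_scores(seq)
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    means = [total / window]
    for start in range(1, len(scores) - window + 1):
        # Slide the window by one base: add the new base, drop the old one.
        total += scores[start + window - 1] - scores[start - 1]
        means.append(total / window)
    return means


if __name__ == "__main__":
    example = "GGGTTAGGGTTAGGGTTAGGG"  # illustrative telomere-like repeat
    print(window_scores(example, window=len(example)))
```

Windows whose absolute mean score exceeds a chosen threshold would then be reported as candidate G4-forming sequences; the threshold itself is a user setting and is not specified in the abstract.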

Keywords
Chromosome Y, G-quadruplex, Gapless assembly, Genome analysis
MeSH
Algorithms
G-Quadruplexes *
Genome, Human *
Humans
Chromosomes, Human, Y * / genetics
Telomere / genetics
Check Tag
Humans
Publication type
Journal Article

Article

Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods

BACKGROUND: Recent advances in genomics indicate functional significance of a majority of genome sequences ...

BMC genomics. 2016 Mar 16 ; 17 () : 243. [epub] 20160316

BMC Genomics
ISSN 1471-2164

BACKGROUND: Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve the reference genome assembly of banana (Musa acuminata). RESULTS: We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools offer an expert mode in which the user can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%. CONCLUSION: The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.
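The abstract above reports N50 values together with scaffold counts. As a minimal sketch of how these statistics are conventionally computed (the function name and the toy lengths are illustrative assumptions, not data from the study): scaffold lengths are sorted from longest to shortest and accumulated until at least half of the total assembly length is covered; the length reached at that point is the N50, and the number of scaffolds needed is often called the L50.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold or contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:
            return length, count
    raise ValueError("lengths must contain at least one positive value")


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical scaffold lengths
    print(n50_l50(toy_lengths))
```

Under this convention, a rising N50 accompanied by a falling scaffold count, as in the figures quoted in the abstract, indicates that fewer and longer scaffolds cover half of the assembly.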

Keywords
Bioinformatics tool, GBS, Genome assembly, Genome map, Musa acuminata, Paired-end sequences
MeSH
Molecular Sequence Annotation
Musa / genetics
Genetic Markers
Genome, Plant *
Contig Mapping
Sequence Analysis, DNA
Computational Biology / methods
High-Throughput Nucleotide Sequencing
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Names of Substances
Genetic Markers

Article

New telomere to telomere assembly of human chromosome 8 reveals a previous underestimation of G-quadruplex forming sequences and inverted repeats

evolving and improving sequencing methods, human chromosome 8 is now available as a gapless, end-to-end assembly ...

Gene. 2022 Feb 05 ; 810 () : 146058. [epub] 20211101

ISSN 1879-0038 | 0378-1119

Taking advantage of evolving and improving sequencing methods, human chromosome 8 is now available as a gapless, end-to-end assembly. Thanks to advances in long-read sequencing technologies, its centromere, telomeres, duplicated gene families and repeat-rich regions are now fully sequenced. We were interested to assess if the new assembly altered our understanding of the potential impact of non-B DNA structures within this completed chromosome sequence. It has been shown that non-B secondary structures, such as G-quadruplexes, hairpins and cruciforms, have important regulatory functions and potential as targeted therapeutics. Therefore, we analysed the presence of putative G-quadruplex forming sequences and inverted repeats in the current human reference genome (GRCh38) and in the new end-to-end assembly of chromosome 8. The comparison revealed that the new assembly contains significantly more inverted repeats and G-quadruplex forming sequences compared to the current reference sequence. This observation can be explained by improved accuracy of the new sequencing methods, particularly in regions that contain extensive repeats of bases, as is preferred by many non-B DNA structures. These results show a significant underestimation of the prevalence of non-B DNA secondary structure in previous assembly versions of the human genome and point to their importance being not fully appreciated. We anticipate that similar observations will occur as the improved sequencing technologies fill in gaps across the genomes of humans and other organisms.

Keywords
G-quadruplex, Genome sequence of human chromosome 8, Inverted repeat, Non-B DNA structures
MeSH
G-Quadruplexes *
Genome, Human
Sequence Inversion *
Humans
Chromosomes, Human, Pair 8 *
Sequence Analysis, DNA
Telomere *
Check Tag
Humans
Publication type
Journal Article

Article

Deleterious phenotypes in wild Arabidopsis arenosa populations are common and linked to runs of homozygosity

transcriptome profiling, linkage mapping and genome-wide runs of homozygosity patterns using a newly assembled ...

G3 (Bethesda, Md.). 2024 Mar 06 ; 14 (3) : .

G3 (Bethesda)
ISSN 2160-1836

In this study, we aimed to systematically assess the frequency at which potentially deleterious phenotypes appear in natural populations of the outcrossing model plant Arabidopsis arenosa, and to establish their underlying genetics. For this purpose, we collected seeds from wild A. arenosa populations and screened over 2,500 plants for unusual phenotypes in the greenhouse. We repeatedly found plants with obvious phenotypic defects, such as small stature and necrotic or chlorotic leaves, among first-generation progeny of wild A. arenosa plants. Such abnormal plants were present in about 10% of maternal sibships, with multiple plants with similar phenotypes in each of these sibships, pointing to a genetic basis of the observed defects. A combination of transcriptome profiling, linkage mapping and genome-wide runs of homozygosity patterns using a newly assembled reference genome indicated a range of underlying genetic architectures associated with phenotypic abnormalities. This included evidence for homozygosity of certain genomic regions, consistent with alleles that are identical by descent being responsible for these defects. Our observations suggest that deleterious alleles with different genetic architectures are segregating at appreciable frequencies in wild A. arenosa populations.

Keywords
Arabidopsis arenosa, abnormal phenotypes, reference genome, runs of homozygosity, wild populations
MeSH
Arabidopsis * / genetics
Phenotype
Chromosome Mapping
Seeds
Publication type
Journal Article
Research Support, Non-U.S. Gov't

Article

The complex polyploid genome architecture of sugarcane

Thus, modern sugarcane hybrids are the last remaining major crop without a reference-quality genome. ...

Nature. 2024 Apr ; 628 (8009) : 804-810. [epub] 20240327

ISSN 1476-4687 | 0028-0836

Sugarcane, the world's most harvested crop by tonnage, has shaped global history, trade and geopolitics, and is currently responsible for 80% of sugar production worldwide [1]. While traditional sugarcane breeding methods have effectively generated cultivars adapted to new environments and pathogens, sugar yield improvements have recently plateaued [2]. The cessation of yield gains may be due to limited genetic diversity within breeding populations, long breeding cycles and the complexity of its genome, the latter preventing breeders from taking advantage of the recent explosion of whole-genome sequencing that has benefited many other crops. Thus, modern sugarcane hybrids are the last remaining major crop without a reference-quality genome. Here we take a major step towards advancing sugarcane biotechnology by generating a polyploid reference genome for R570, a typical modern cultivar derived from interspecific hybridization between the domesticated species (Saccharum officinarum) and the wild species (Saccharum spontaneum). In contrast to the existing single haplotype ('monoploid') representation of R570, our 8.7 billion base assembly contains a complete representation of unique DNA sequences across the approximately 12 chromosome copies in this polyploid genome. Using this highly contiguous genome assembly, we filled a previously unsized gap within an R570 physical genetic map to describe the likely causal genes underlying the single-copy Bru1 brown rust resistance locus. This polyploid genome assembly with fine-grain descriptions of genome architecture and molecular targets for biotechnology will help accelerate molecular and transgenic breeding and adaptation of sugarcane to future environmental conditions.

MeSH
Biotechnology
Chromosomes, Plant / genetics
DNA, Plant / genetics
Genome, Plant * / genetics
Haplotypes / genetics
Hybridization, Genetic / genetics
Polyploidy *
Reference Standards
Saccharum * / classification, genetics
Plant Breeding
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Names of Substances
DNA, Plant

Article

Teaching transposon classification as a means to crowd source the curation of repeat annotation - a tardigrade perspective

BACKGROUND: The advancement of sequencing technologies results in the rapid release of hundreds of new ...

Mobile DNA. 2024 May 06 ; 15 (1) : 10. [epub] 20240506

Mob DNA
ISSN 1759-8753

BACKGROUND: The advancement of sequencing technologies results in the rapid release of hundreds of new genome assemblies a year providing unprecedented resources for the study of genome evolution. Within this context, the significance of in-depth analyses of repetitive elements, transposable elements (TEs) in particular, is increasingly recognized in understanding genome evolution. Despite the plethora of available bioinformatic tools for identifying and annotating TEs, the phylogenetic distance of the target species from a curated and classified database of repetitive element sequences constrains any automated annotation effort. Moreover, manual curation of raw repeat libraries is deemed essential due to the frequent incompleteness of automatically generated consensus sequences. RESULTS: Here, we present an example of a crowd-sourcing effort aimed at curating and annotating TE libraries of two non-model species built around a collaborative, peer-reviewed teaching process. Manual curation and classification are time-consuming processes that offer limited short-term academic rewards and are typically confined to a few research groups where methods are taught through hands-on experience. Crowd-sourcing efforts could therefore offer a significant opportunity to bridge the gap between learning the methods of curation effectively and empowering the scientific community with high-quality, reusable repeat libraries. CONCLUSIONS: The collaborative manual curation of TEs from two tardigrade species, for which there were no TE libraries available, resulted in the successful characterization of hundreds of new and diverse TEs in a reasonable time frame. Our crowd-sourcing setting can be used as a teaching reference guide for similar projects: A hidden treasure awaits discovery within non-model organisms.

Keywords
Annotation, Genome assembly, Library, Manual curation, Non-model organism, Transposable elements
Publication type
Journal Article

Article

Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics

With the advancement of sequencing techniques, the main focus shifted to the whole metagenome shotgun ...

Computational and structural biotechnology journal. 2017 ; 15 () : 48-55. [epub] 20161205

Comput Struct Biotechnol J
ISSN 2001-0370

One of the main steps in a study of microbial communities is resolving their composition, diversity and function. In the past, these issues were mostly addressed by amplicon sequencing of a target gene because of its reasonable price and the easier computational postprocessing of the resulting data. With the advancement of sequencing techniques, the main focus shifted to whole metagenome shotgun sequencing, which allows much more detailed analysis of the metagenomic data, including reconstruction of novel microbial genomes and insight into the genetic potential and metabolic capacities of whole environments. On the other hand, the output of whole metagenome shotgun sequencing is a mixture of short DNA fragments belonging to various genomes; therefore this approach requires more sophisticated computational algorithms for clustering of related sequences, commonly referred to as sequence binning. There are currently two types of binning methods: taxonomy dependent and taxonomy independent. The first type classifies the DNA fragments by performing a standard homology inference against a reference database, while the latter performs reference-free binning by applying clustering techniques to features extracted from the sequences. In this review, we describe the strategies within the second approach. Although these strategies do not require prior knowledge, they place higher demands on the length of sequences. Besides their basic principle, an overview of particular methods and tools is provided. Furthermore, the review covers the utilization of the methods in relation to the length of sequences and discusses the need for metagenomic data preprocessing in the form of initial assembly prior to binning.
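The abstract above describes taxonomy-independent binning as clustering on features extracted from the sequences, and the keywords mention genomic signatures. One widely used feature of this kind is the normalized k-mer (for example tetranucleotide) frequency vector. The sketch below is an illustration under that assumption; the function name, the choice of k = 4 and the handling of ambiguous bases are hypothetical choices, not a method taken from the review.

```python
from itertools import product


def kmer_frequencies(seq, k=4):
    """Normalized k-mer frequency vector, ordered lexicographically over A, C, G, T."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        position = index.get(seq[start:start + k])
        if position is not None:  # k-mers containing N or other symbols are skipped
            counts[position] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [count / total for count in counts]


if __name__ == "__main__":
    signature = kmer_frequencies("ACGTACGTTTGACCA")  # illustrative fragment
    print(len(signature), round(sum(signature), 6))
```

Vectors like this can be passed to any general-purpose clustering method. Because short fragments contain few k-mers, their vectors are noisy, which is consistent with the abstract's remark that reference-free strategies place higher demands on sequence length. This sketch does not merge reverse-complement k-mers, which some tools do.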

Keywords
Abundance, Genomic signature, Metagenomics, Sequence binning, Taxonomy independent, Visualization
Publication type
Journal Article
Review

Article

Capturing Wheat Phenotypes at the Genome Level

Recent technological advances in next-generation sequencing (NGS) technologies have dramatically reduced ...

Frontiers in plant science. 2022 ; 13 () : 851079. [epub] 20220704

Front Plant Sci
ISSN 1664-462X

Recent technological advances in next-generation sequencing (NGS) technologies have dramatically reduced the cost of DNA sequencing, allowing species with large and complex genomes to be sequenced. Although bread wheat (Triticum aestivum L.) is one of the world's most important food crops, efficient exploitation of molecular marker-assisted breeding approaches has lagged behind that achieved in other crop species, due to its large polyploid genome. However, an international public-private effort spanning 9 years reported a draft covering over 65% of the bread wheat genome in 2014 and, after more than a decade, culminated in the release of a gold-standard, fully annotated reference wheat-genome assembly in 2018. Shortly thereafter, in 2020, genome assemblies of an additional 15 global wheat accessions were released. As a result, wheat has now entered the pan-genomic era, where basic resources can be efficiently exploited. Wheat genotyping with a few hundred markers has been replaced by genotyping arrays, capable of characterizing hundreds of wheat lines, using thousands of markers, providing fast, relatively inexpensive, and reliable data for exploitation in wheat breeding. These advances have opened up new opportunities for marker-assisted selection (MAS) and genomic selection (GS) in wheat. Herein, we review the advances and perspectives in wheat genetics and genomics, with a focus on key traits, including grain yield, yield-related traits, end-use quality, and resistance to biotic and abiotic stresses. We also focus on reported candidate genes cloned and linked to traits of interest. Furthermore, we report on the improvement in the aforementioned quantitative traits, through the use of (i) clustered regularly interspaced short-palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9)-mediated gene-editing and (ii) positional cloning methods, and of genomic selection. Finally, we examine the utilization of genomics for next-generation wheat breeding, providing a practical example of using in silico bioinformatics tools that are based on the wheat reference-genome sequence.

Keywords
CRISPR/Cas9, QTL cloning, Wheat, abiotic-stress tolerance, disease resistance, genome-wide association, genomic selection, quantitative trait locus mapping
Publication type
Journal Article
Review

Article

Speciation analysis of arsenic by selective hydride generation-cryotrapping-atomic fluorescence spectrometry with flame-in-gas-shield atomizer: achieving extremely low detection limits with inexpensive instrumentation

selective hydride generation-cryotrapping (HG-CT) coupled to an extremely sensitive but simple in-house assembled ...

Analytical chemistry. 2014 Oct 21 ; 86 (20) : 10422-8. [epub] 20141010

Anal Chem
ISSN 1520-6882 | 0003-2700

This work describes a method of selective hydride generation-cryotrapping (HG-CT) coupled to an extremely sensitive but simple in-house assembled and designed atomic fluorescence spectrometry (AFS) instrument for determination of toxicologically important As species. Here, an advanced flame-in-gas-shield atomizer (FIGS) was interfaced to HG-CT and its performance was compared to a standard miniature diffusion flame (MDF) atomizer. A significant improvement in both sensitivity and baseline noise was found, which was reflected in improved (4 times) limits of detection (LODs). The LODs obtained with the FIGS atomizer were 0.44, 0.74, 0.15, 0.17 and 0.67 ng L⁻¹ for arsenite, total inorganic, mono-, dimethylated As and trimethylarsine oxide, respectively. Moreover, the sensitivities with FIGS and MDF were equal for all As species, allowing for the possibility of single species standardization with arsenate standard for accurate quantification of all other As species. The accuracy of HG-CT-AFS with FIGS was verified by speciation analysis in two samples of bottled drinking water and certified reference materials, NRC CASS-5 (nearshore seawater) and SLRS-5 (river water) that contain traces of methylated As species. As speciation was in agreement with results previously reported and sums of all quantified species corresponded with the certified total As. The feasibility of HG-CT-AFS with FIGS was also demonstrated by the speciation analysis in microsamples of exfoliated bladder epithelial cells isolated from human urine. The results for the sums of trivalent and pentavalent As species corresponded well with the reference results obtained by HG-CT-ICPMS (inductively coupled plasma mass spectrometry).

MeSH
Arsenic / analysis, chemistry
Chemistry Techniques, Analytical / economics, instrumentation
Spectrometry, Fluorescence / standards
Limit of Detection
Nebulizers and Vaporizers
Drinking Water / chemistry
Spectrophotometry, Atomic / standards
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural
Names of Substances
Arsenic
Drinking Water
