Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods
G. Martin, FC. Baurens, G. Droc, M. Rouard, A. Cenci, A. Kilian, A. Hastie, J. Doležel, JM. Aury, A. Alberti, F. Carreel, A. D'Hont
Language: English; Country: England, United Kingdom
Document type: journal articles, grant-supported work
NLK
BioMedCentral
from 2000-12-01
BioMedCentral Open Access
from 2000
Directory of Open Access Journals
from 2000
Free Medical Journals
from 2000
PubMed Central
from 2000
Europe PubMed Central
from 2000 to 2020
ProQuest Central
from 2009-01-01
Open Access Digital Library
from 2000-07-01
Open Access Digital Library
from 2000-01-01
Open Access Digital Library
from 2000-01-01
Medline Complete (EBSCOhost)
from 2000-01-01
Health & Medicine (ProQuest)
from 2009-01-01
ROAD: Directory of Open Access Scholarly Resources
from 2000
Springer Nature OA/Free Journals
from 2000-12-01
- MeSH
- Molecular Sequence Annotation
- Musa / genetics
- Genetic Markers
- Genome, Plant *
- Contig Mapping
- Sequence Analysis, DNA
- Computational Biology / methods
- High-Throughput Nucleotide Sequencing
- Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
BACKGROUND: Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata).
RESULTS: We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%.
CONCLUSION: The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.
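The abstract reports its results as scaffold counts and N50 values. For readers unfamiliar with that statistic, the following is a minimal sketch of how N50 and the accompanying scaffold count (often called L50) are conventionally computed, together with a check of the reported scaffold reduction. The function name `n50` and the toy lengths are illustrative assumptions; this is not code from the authors' pipeline.

```python
def n50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly;
    L50 is the number of scaffolds in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Toy scaffold lengths (hypothetical, not taken from the Musa assembly).
toy = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy))  # (70, 2)

# Scaffold-count reduction reported in the abstract: 7513 -> 1532.
reduction = (7513 - 1532) / 7513
print(f"{reduction:.1%}")  # 79.6%
```

Read this way, "3.0 Mb (26 scaffolds)" means that the 26 longest scaffolds cover at least half of the assembly and the shortest of them is about 3.0 Mb long; the reduction of 79.6% matches the rounded 80% in the abstract.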
BioNano Genomics 9640 Towne Centre Drive San Diego CA 92121 USA
Bioversity International Parc Scientifique Agropolis 2 34397 Montpellier Cedex 5 France
CIRAD UMR AGAP TA A 108 03 Avenue Agropolis F 34398 Montpellier cedex 5 France
Commissariat à l'Energie Atomique Genoscope 2 rue Gaston Cremieux BP5706 91057 Evry France
Diversity Arrays Technology Yarralumla Australian Capital Territory 2600 Australia
- 000
- 00000naa a2200000 a 4500
- 001
- bmc17000340
- 003
- CZ-PrNML
- 005
- 20170112123131.0
- 007
- ta
- 008
- 170103s2016 enk f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1186/s12864-016-2579-4 $2 doi
- 035 __
- $a (PubMed)26984673
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a enk
- 100 1_
- $a Martin, Guillaume $u CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France.
- 245 10
- $a Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods / $c G. Martin, FC. Baurens, G. Droc, M. Rouard, A. Cenci, A. Kilian, A. Hastie, J. Doležel, JM. Aury, A. Alberti, F. Carreel, A. D'Hont
- 650 _2
- $a Computational Biology $x methods $7 D019295
- 650 _2
- $a Contig Mapping $7 D020451
- 650 _2
- $a Genetic Markers $7 D005819
- 650 12
- $a Genome, Plant $7 D018745
- 650 _2
- $a High-Throughput Nucleotide Sequencing $7 D059014
- 650 _2
- $a Molecular Sequence Annotation $7 D058977
- 650 _2
- $a Musa $x genetics $7 D028521
- 650 _2
- $a Sequence Analysis, DNA $7 D017422
- 655 _2
- $a Journal Article $7 D016428
- 655 _2
- $a Research Support, Non-U.S. Gov't $7 D013485
- 700 1_
- $a Baurens, Franc-Christophe $u CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France.
- 700 1_
- $a Droc, Gaëtan $u CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France.
- 700 1_
- $a Rouard, Mathieu $u Bioversity International, Parc Scientifique Agropolis II, 34397, Montpellier, Cedex 5, France.
- 700 1_
- $a Cenci, Alberto $u Bioversity International, Parc Scientifique Agropolis II, 34397, Montpellier, Cedex 5, France.
- 700 1_
- $a Kilian, Andrzej $u Diversity Arrays Technology, Yarralumla, Australian Capital Territory, 2600, Australia.
- 700 1_
- $a Hastie, Alex $u BioNano Genomics, 9640 Towne Centre Drive, San Diego, CA, 92121, USA.
- 700 1_
- $a Doležel, Jaroslav $u Institute of Experimental Botany, Centre of the Region Hana for Biotechnological and Agricultural Research, Šlechtitelů 31, CZ-78371, Olomouc, Czech Republic.
- 700 1_
- $a Aury, Jean-Marc $u Commissariat à l'Energie Atomique (CEA), Institut de Genomique (IG), Genoscope, 2 rue Gaston Cremieux, BP5706, 91057, Evry, France. $7 gn_A_00010156
- 700 1_
- $a Alberti, Adriana $u Commissariat à l'Energie Atomique (CEA), Institut de Genomique (IG), Genoscope, 2 rue Gaston Cremieux, BP5706, 91057, Evry, France. $7 gn_A_00003471
- 700 1_
- $a Carreel, Françoise $u CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France.
- 700 1_
- $a D'Hont, Angélique $u CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France. dhont@cirad.fr.
- 773 0_
- $w MED00008181 $t BMC genomics $x 1471-2164 $g Vol. 17, no. - (2016), p. 243
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/26984673 $y PubMed
- 910 __
- $a ABA008 $b sig $c sign $y a $z 0
- 990 __
- $a 20170103 $b ABA008
- 991 __
- $a 20170112123230 $b ABA008
- 999 __
- $a ok $b bmc $g 1179480 $s 960907
- BAS __
- $a 3
- BAS __
- $a PreBMC
- BMC __
- $a 2016 $b 17 $c - $d 243 $e 20160316 $i 1471-2164 $m BMC genomics $n BMC Genomics $x MED00008181
- LZP __
- $a Pubmed-20170103