Abstract

Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high quantities of repetitive elements, which, combined with huge genome sizes, have made accurate assembly of these large and complex genomes intractable thus far. Using two-color genome mapping of tiling bacterial artificial chromosome (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in length. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, to dramatically improve the assembly from 75% to 95% complete.

Keywords

Contig, Genome, Aegilops tauschii, Sequence assembly, Biology, Computational biology, DNA sequencing, Genetics, Whole genome sequencing, Hybrid genome assembly, Comparative genomics, Genome project, Reference genome, Genomics, Gene, Transcriptome

MeSH Terms

Chromosome Mapping; Chromosomes, Artificial, Bacterial; Chromosomes, Plant; Genes, Plant; Genome, Plant; High-Throughput Nucleotide Sequencing; Molecular Sequence Data; Nanotechnology; Sequence Analysis, DNA; Triticum

Publication Info

Year: 2013
Type: article
Volume: 8
Issue: 2
Pages: e55864
Citations: 154
Access: Closed

Citation Metrics

OpenAlex: 154
Influential: 6
CrossRef: 126

Cite This

Alex Hastie, Lingli Dong, Alexis L. Smith et al. (2013). Rapid Genome Mapping in Nanochannel Arrays for Highly Complete and Accurate De Novo Sequence Assembly of the Complex Aegilops tauschii Genome. PLoS ONE, 8(2), e55864. https://doi.org/10.1371/journal.pone.0055864

Identifiers

DOI: 10.1371/journal.pone.0055864
PMID: 23405223
PMCID: PMC3566107
