Abstract
Existing long-read assemblers require thousands of central processing unit hours to assemble a human genome and are being outpaced by sequencing technologies in terms of both throughput and cost. We developed a long-read assembler wtdbg2 (https://github.com/ruanjue/wtdbg2) that is 2–17 times as fast as published tools while achieving comparable contiguity and accuracy. It paves the way for population-scale long-read assembly in future. Wtdbg2 assembles genomes with comparable contiguity and accuracy to existing tools using long-read sequencing data, and is several times faster, especially for large genomes.
Keywords
MeSH Terms
Affiliated Institutions
Related Publications
GAGE: A critical evaluation of genome assemblies and assembly algorithms
New sequencing technology has dramatically altered the landscape of whole-genome sequencing, allowing scientists to initiate numerous projects to decode the genomes of previousl...
Whole-Genome Sequencing and Assembly with High-Throughput, Short-Read Technologies
While recently developed short-read sequencing technologies may dramatically reduce the sequencing cost and eventually achieve the $1000 goal for re-sequencing, their limitation...
Minimap2: pairwise alignment for nucleotide sequences
Abstract Motivation Recent advances in sequencing technologies promise ultra-long reads of ∼100 kb in average, full-length mRNA or cDNA reads in high throughput and genomic cont...
De novo assembly of human genomes with massively parallel short read sequencing
Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length...
Publication Info
- Year
- 2019
- Type
- article
- Volume
- 17
- Issue
- 2
- Pages
- 155-158
- Citations
- 1466
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1038/s41592-019-0669-3
- PMID
- 31819265
- PMCID
- PMC7004874