Abstract

Summary: MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset with 252 Gbps in 44.1 h on a single computing node with a graphics processing unit, and in 99.6 h without one. MEGAHIT assembles the data as a whole, i.e. no pre-processing such as partitioning or normalization was needed. When compared with previous methods on assembling the soil data, MEGAHIT generated a threefold larger assembly, with longer contig N50 and average contig length; furthermore, 55.8% of the reads were aligned to the assembly, a fourfold improvement.

Availability and implementation: The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under the GPLv3 license.

Contact: rb@l3-bioinfo.com or twlam@cs.hku.hk

Supplementary information: Supplementary data are available at Bioinformatics online.
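The summary compares assemblies by contig N50 and average contig length. As a minimal sketch of how these two statistics are commonly computed from a list of contig lengths (the function names and the example lengths below are hypothetical and are not taken from MEGAHIT or from the paper's data):

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downwards, the running total first reaches at least half of
    the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


def mean_contig_length(lengths):
    """Return the arithmetic mean of the contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    return sum(lengths) / len(lengths)


# Hypothetical contig lengths in base pairs, for illustration only.
contig_lengths = [100, 200, 300, 400, 500]
print(n50(contig_lengths))                 # 400
print(mean_contig_length(contig_lengths))  # 300.0
```

In this toy input the total is 1500 bp; the two longest contigs (500 and 400) together cover 900 bp, which is the first point at which half of the total is reached, so the N50 is 400.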

Keywords

Contig, Metagenomics, De Bruijn graph, Sequence assembly, Computer science, De Bruijn sequence, Software, k-mer, Graphics, Graph, Node (physics), Normalization (sociology), Graphics processing unit, Parallel computing, Computational biology, Theoretical computer science, Biology, DNA sequencing, Computer graphics (images), Mathematics, Operating system, Combinatorics, Genome, Engineering, Genetics

Publication Info

Year: 2015
Type: article
Volume: 31
Issue: 10
Pages: 1674-1676
Citations: 8475
Access: Closed

Citation Metrics

8475 (OpenAlex)

Cite This

Dinghua Li, Chi-Man Liu, Ruibang Luo et al. (2015). MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics, 31(10), 1674-1676. https://doi.org/10.1093/bioinformatics/btv033

Identifiers

DOI: 10.1093/bioinformatics/btv033