Abstract

Motivation: Although long-read sequencing technologies can produce highly contiguous genome assemblies, they suffer from high error rates. We therefore developed NextPolish, a tool that efficiently corrects sequence errors in genomes assembled from long reads. The tool consists of two interlinked modules: one scores and counts k-mers from high-quality short reads, and the other polishes genome assemblies that contain large numbers of base errors.

Results: When evaluated for speed and efficiency on a human genome and a plant (Arabidopsis thaliana) genome, NextPolish outperformed Pilon, correcting sequence errors faster and with higher accuracy.

Availability and implementation: NextPolish is implemented in C and Python. The source code is available from https://github.com/Nextomics/NextPolish.

Supplementary information: Supplementary data are available at Bioinformatics online.
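The k-mer counting and scoring idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not NextPolish's implementation: the function names `count_kmers` and `score_candidates`, the toy reads, and the "pick the most frequently observed k-mer" rule are all assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across the reads.

    K-mers containing anything other than A, C, G, or T are skipped.
    """
    valid_bases = set("ACGT")
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid_bases:
                counts[kmer] += 1
    return counts


def score_candidates(candidates, counts):
    """Return the candidate k-mer seen most often in the reads.

    Hypothetical scoring rule: raw read support only.
    """
    return max(candidates, key=lambda kmer: counts[kmer])


# Toy short reads (invented data, not from the paper).
reads = ["ACGTAC", "CGTACG", "ACGTTC"]
counts = count_kmers(reads, 3)

# Two competing versions of the same assembly position.
best = score_candidates(["GTA", "GTT"], counts)
```

Here `best` is the candidate with more support in the short reads. A real polisher would also have to handle read alignment, base qualities, and indels, none of which this sketch attempts.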

Keywords

Python (programming language); Computer science; Genome; Sequence assembly; k-mer; Source code; Software; Contiguity; Computational biology; Biology; Programming language; Genetics; Operating system

Publication Info

Year: 2019
Type: article
Volume: 36
Issue: 7
Pages: 2253-2255
Citations: 1183 (OpenAlex)
Access: Closed

Cite This

Jiang Hu, Junpeng Fan, Zongyi Sun et al. (2019). NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics, 36(7), 2253-2255. https://doi.org/10.1093/bioinformatics/btz891

Identifiers

DOI: 10.1093/bioinformatics/btz891