Abstract

The virulence factor database (VFDB, http://www.mgc.ac.cn/VFs/) is devoted to providing the scientific community with a comprehensive warehouse and online platform for deciphering bacterial pathogenesis. The various combinations, organizations and expressions of virulence factors (VFs) are responsible for the diverse clinical symptoms of pathogen infections. Currently, whole-genome sequencing is widely used to decode potential novel or variant pathogens both in emergent outbreaks and in routine clinical practice. However, the efficient characterization of pathogenomic compositions remains a challenge for microbiologists or physicians with limited bioinformatics skills. Therefore, we introduced to VFDB an integrated and automatic pipeline, VFanalyzer, to systematically identify known/potential VFs in complete/draft bacterial genomes. VFanalyzer first constructs orthologous groups within the query genome and preanalyzed reference genomes from VFDB to avoid potential false positives due to paralogs. Then, it conducts iterative and exhaustive sequence similarity searches among the hierarchical prebuilt datasets of VFDB to accurately identify potential untypical/strain-specific VFs. Finally, via a context-based data refinement process for VFs encoded by gene clusters, VFanalyzer can achieve relatively high specificity and sensitivity without manual curation. In addition, a thoroughly optimized interactive web interface is introduced to present VFanalyzer reports in comparative pathogenomic style for easy online analysis.
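The context-based refinement step described above can be illustrated with a toy sketch. This is not VFanalyzer's implementation, and the abstract does not specify its rules: the `Hit` fields, the identity thresholds, and the `max_gap` neighbourhood are all hypothetical. It shows only the general idea that a weak similarity hit to a cluster-encoded VF gains credibility when a strong hit to the same cluster lies nearby in the genome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit (all fields are illustrative assumptions)."""
    gene_index: int   # position of the query gene along the contig
    cluster: str      # VF gene cluster the hit maps to
    identity: float   # percent identity of the best match


def refine_by_context(hits, strong=80.0, weak=40.0, max_gap=2):
    """Keep strong hits, plus weak hits supported by a nearby strong hit.

    A weak hit (weak <= identity < strong) is kept only if a strong hit
    from the same cluster lies within max_gap genes of it. Hits below
    the weak threshold are always dropped. Thresholds are hypothetical.
    """
    strong_hits = [h for h in hits if h.identity >= strong]
    kept = list(strong_hits)
    for h in hits:
        if weak <= h.identity < strong:
            supported = any(
                s.cluster == h.cluster
                and abs(s.gene_index - h.gene_index) <= max_gap
                for s in strong_hits
            )
            if supported:
                kept.append(h)
    return sorted(kept, key=lambda h: h.gene_index)


hits = [
    Hit(10, "T3SS", 95.0),     # strong hit, always kept
    Hit(11, "T3SS", 45.0),     # weak hit adjacent to a strong T3SS hit
    Hit(12, "capsule", 50.0),  # weak hit with no strong capsule neighbour
    Hit(13, "T3SS", 30.0),     # below the weak threshold
    Hit(200, "T3SS", 45.0),    # weak hit far from any strong hit
]
refined = refine_by_context(hits)
```

In this sketch only the hits at gene positions 10 and 11 survive; the isolated or unsupported weak hits are discarded, which is how genomic context can raise specificity without manual curation.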

Keywords

Biology; Genome; Context (archaeology); Computational biology; False positive paradox; Interface (matter); Pipeline (software); Web application; Computer science; Gene; World Wide Web; Genetics; Artificial intelligence

MeSH Terms

Databases, Genetic; Genome, Bacterial; Genomics; Sensitivity and Specificity; Software; Virulence; Virulence Factors; Whole Genome Sequencing

Publication Info

Year: 2018
Type: article
Volume: 47
Issue: D1
Pages: D687-D692
Citations: 1716
Access: Closed

Citation Metrics

OpenAlex: 1716
Influential: 134
CrossRef: 1537

Cite This

Bo Liu, Dandan Zheng, Qi Jin et al. (2018). VFDB 2019: a comparative pathogenomic platform with an interactive web interface. Nucleic Acids Research, 47(D1), D687-D692. https://doi.org/10.1093/nar/gky1080

Identifiers

DOI: 10.1093/nar/gky1080
PMID: 30395255
PMCID: PMC6324032
