Abstract
The PubMLST.org website (https://pubmlst.org/) hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Although the system was originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data provides a highly scalable means (whole genome sequence data for tens of thousands of isolates) of addressing a wide range of functional questions, including the prediction of antimicrobial resistance, likely cross-reactivity with vaccine antigens, and the functional activities of different variants that lead to key phenotypes. There are no limitations on the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question. In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.
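The gene-by-gene approach described in the abstract can be illustrated with a minimal sketch. It is not the BIGSdb implementation: the locus names, sequences, profiles and function names below are all hypothetical toy values, and matching is by exact sequence identity only. Each locus keeps a catalogue of known allelic variants, an isolate is characterised by the allele number found at each locus, and a scheme (a combination of loci) maps the resulting profile to a sequence type.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy catalogue: locus -> {sequence: allele number}.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ATGCA": 1, "ATGCG": 2},
    "locusB": {"GGTTA": 1, "GGTTC": 2},
}

# Hypothetical scheme: an ordered combination of loci, plus the
# allelic profiles that have already been assigned a sequence type.
SCHEME_LOCI: Tuple[str, ...] = ("locusA", "locusB")
PROFILES: Dict[Tuple[int, ...], int] = {(1, 1): 1, (2, 1): 2}


def call_allele(locus: str, sequence: str) -> Optional[int]:
    """Return the allele number for an exact sequence match, else None."""
    return ALLELES.get(locus, {}).get(sequence.upper())


def define_allele(locus: str, sequence: str) -> int:
    """Add a new variant to the catalogue (if unseen) and return its number."""
    known = ALLELES.setdefault(locus, {})
    seq = sequence.upper()
    if seq not in known:
        known[seq] = max(known.values(), default=0) + 1
    return known[seq]


def sequence_type(isolate: Dict[str, str]) -> Optional[int]:
    """Look up the sequence type for an isolate's per-locus sequences.

    Returns None if any locus has no known allele or the profile is new.
    """
    profile = tuple(call_allele(locus, isolate.get(locus, ""))
                    for locus in SCHEME_LOCI)
    if None in profile:
        return None
    return PROFILES.get(profile)


if __name__ == "__main__":
    isolate = {"locusA": "ATGCG", "locusB": "ggtta"}
    print(sequence_type(isolate))
```

Because alleles and profiles are plain lookups keyed by sequence and by allele number, the catalogue in this sketch can grow without any fixed limit on loci, variants or schemes, which is the property the abstract emphasises.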
Publication Info
- Year: 2018
- Type: preprint
- Volume: 3
- Pages: 124-124
- Citations: 3001
- Access: Closed
Identifiers
- DOI: 10.12688/wellcomeopenres.14826.1