Abstract

Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/ or http://www.kegg.jp/) is a database resource that integrates genomic, chemical and systemic functional information. In particular, gene catalogs from completely sequenced genomes are linked to higher-level systemic functions of the cell, the organism and the ecosystem. Major efforts have been undertaken to manually create a knowledge base for such systemic functions by capturing and organizing experimental knowledge in computable forms; namely, in the forms of KEGG pathway maps, BRITE functional hierarchies and KEGG modules. Continuous efforts have also been made to develop and improve the cross-species annotation procedure for linking genomes to the molecular networks through the KEGG Orthology system. Here we report KEGG Mapper, a collection of tools for KEGG PATHWAY, BRITE and MODULE mapping, enabling integration and interpretation of large-scale data sets. We also report a variant of the KEGG mapping procedure to extend the knowledge base, where different types of data and knowledge, such as disease genes and drug targets, are integrated as part of the KEGG molecular networks. Finally, we describe recent enhancements to the KEGG content, especially the incorporation of disease and drug information used in practice and in society, to support translational bioinformatics.

Keywords

KEGG; Encyclopedia; Genome; Biology; Annotation; Computational biology; Gene Annotation; Computer science; Bioinformatics; Gene; Genetics; Transcriptome

MeSH Terms

Computational Biology; Databases, Factual; Disease; Genomics; Humans; Knowledge Bases; Molecular Sequence Annotation; Pharmacological Phenomena; Software; Systems Integration

Publication Info

Year: 2011
Type: article
Volume: 40
Issue: D1
Pages: D109-D114
Citations: 4866
Access: Closed

Citation Metrics

OpenAlex: 4866
Influential: 485
CrossRef: 4088

Cite This

Minoru Kanehisa, Susumu Goto, Yoko Sato et al. (2011). KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Research, 40(D1), D109-D114. https://doi.org/10.1093/nar/gkr988

Identifiers

DOI: 10.1093/nar/gkr988
PMID: 22080510
PMCID: PMC3245020

Data Quality

Data completeness: 86%