InterPro in 2019: improving coverage, classification and access to protein sequence annotations

2018 Nucleic Acids Research 1,449 citations

Abstract

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.

Keywords

UniProt, Biology, Computational biology, Flexibility (engineering), Bioinformatics, Genetics

MeSH Terms

Animals; Databases, Genetic; Databases, Protein; Gene Ontology; Humans; Internet; Molecular Sequence Annotation; Multigene Family; Protein Domains; Sequence Homology, Amino Acid; Software; User-Computer Interface

Publication Info

Year: 2018
Type: article
Volume: 47
Issue: D1
Pages: D351-D360
Citations: 1449
Access: Closed

Citation Metrics

OpenAlex: 1449
Influential: 75
CrossRef: 1244

Cite This

Alex Mitchell, Teresa K. Attwood, Patricia C. Babbitt et al. (2018). InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Research, 47(D1), D351-D360. https://doi.org/10.1093/nar/gky1100

Identifiers

DOI: 10.1093/nar/gky1100
PMID: 30398656
PMCID: PMC6323941

Data Quality

Data completeness: 90%