Abstract

Domains are the building blocks of all globular proteins, and are units of compact three-dimensional structure as well as evolutionary units. There is a limited repertoire of domain families, and these families are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products, and the processes that produce new genes are duplication and recombination as well as gene fusion and fission. We attempt to gain an overview of these processes by studying the structural domains in the proteins of seven genomes from the three kingdoms of life: Eubacteria, Archaea and Eukaryota. We use the domain and superfamily definitions in the Structural Classification of Proteins (SCOP) database to map pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 624 of the 764 superfamilies in SCOP in these genomes, and these 624 superfamilies occur in 585 pairwise combinations. Most superfamilies are observed in combination with one or two other superfamilies, while a few are very versatile in their combinatorial behaviour. This type of pattern can be described by a scale-free network. Finally, we study domain repeats, compare the set of domain combinations in the genomes to those in PDB, and discuss the implications for structural genomics.

Contact: apic@mrc-lmb.cam.ac.uk
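The counting step described in the abstract (mapping adjacent domains to superfamily pairs, then asking how many partners each superfamily has) can be illustrated with a minimal sketch. This is not the authors' pipeline: the input format, the function names, the `sf_*` labels and the toy proteins are all hypothetical, and the choices to treat pairs as unordered and to skip adjacent repeats of the same superfamily are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def adjacent_pairs(domains):
    """Yield unordered pairs of neighbouring superfamilies in one protein."""
    for left, right in zip(domains, domains[1:]):
        yield frozenset((left, right))


def combination_stats(proteins):
    """Return (distinct pairwise combinations, partner count per superfamily)."""
    combinations = set()
    partners = defaultdict(set)
    for domains in proteins.values():
        for pair in adjacent_pairs(domains):
            # Assumption: a superfamily next to itself is a repeat, not a
            # combination, so it is skipped here.
            if len(pair) == 1:
                continue
            combinations.add(pair)
            a, b = pair
            partners[a].add(b)
            partners[b].add(a)
    partner_counts = {sf: len(p) for sf, p in partners.items()}
    return combinations, partner_counts


# Hypothetical toy input: protein name -> superfamily labels in sequence order.
toy_proteins = {
    "protein_1": ["sf_A", "sf_B"],
    "protein_2": ["sf_A", "sf_C", "sf_C"],
    "protein_3": ["sf_D", "sf_A"],
    "protein_4": ["sf_B", "sf_A"],
}

combinations, partner_counts = combination_stats(toy_proteins)

# Histogram: how many superfamilies have k distinct partners.
degree_histogram = Counter(partner_counts.values())
```

In this toy set, one superfamily pairs with several others while the rest each have a single partner. That is the shape of distribution the abstract describes, although a real test of scale-free behaviour would need genome-scale assignments and a fit of the partner-count distribution.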

Keywords

Genome, Structural Classification of Proteins database, Structural genomics, Biology, Gene duplication, Computational biology, Domain (mathematical analysis), Protein domain, Genomics, Genetics, Gene, Protein structure

Publication Info

Year: 2001
Type: article
Volume: 17
Issue: suppl_1
Pages: S83-S89
Citations: 124
Access: Closed

Cite This

Gordana Apic, Julian Gough, Sarah A. Teichmann (2001). An insight into domain combinations. Bioinformatics, 17(suppl_1), S83-S89. https://doi.org/10.1093/bioinformatics/17.suppl_1.s83

Identifiers

DOI: 10.1093/bioinformatics/17.suppl_1.s83