Abstract
By combining the pairwise interactions between proteins, as predicted by the conserved co-occurrence of their genes in operons, we obtain protein interaction networks. Here we study the properties of such networks to identify functional modules: sets of proteins that together are involved in a biological process. The complete network contains 3,033 orthologous groups of proteins in 38 genomes. It consists of one giant component, containing 1,611 orthologous groups, and of 516 small disjointed clusters that, on average, contain only 2.7 orthologous groups. These small clusters have a homogeneous functional composition and thus represent functional modules in themselves. Analysis of the giant component reveals that it is a scale-free, small-world network with a high degree of local clustering (C = 0.6). It consists of locally highly connected subclusters that are connected to each other by linker proteins. The linker proteins tend to have multiple functions, or are involved in multiple processes and have an above average probability of being essential. By splitting up the giant component at these linker proteins, we identify 265 subclusters that tend to have a homogeneous functional composition. The rare functional inhomogeneities in our subclusters reflect the mixing of different types of (molecular) functions in a single cellular process, exemplified by subclusters containing both metabolic enzymes as well as the transcription factors that regulate them. Comparative genome analysis, thus, allows identification of a level of functional interaction between that of pairwise interactions, and of the complete genome.
Keywords
Affiliated Institutions
Related Publications
A Genomic Perspective on Protein Families
In order to extract the maximum amount of information from the rapidly accumulating genome sequences, all conserved genes need to be classified according to their homologous rel...
The COG database: new developments in phylogenetic classification of proteins from complete genomes
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, curr...
Specificity and Stability in Topology of Protein Networks
Molecular networks guide the biochemistry of a living cell on multiple levels: Its metabolic and signaling pathways are shaped by the network of interacting proteins, whose prod...
KEGG: Kyoto Encyclopedia of Genes and Genomes
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component...
An analysis of protein domain linkers: their classification and role in protein folding
Recent advances in protein engineering have come from creating multi-functional chimeric proteins containing modules from various proteins. These modules are typically joined vi...
Publication Info
- Year
- 2002
- Type
- article
- Volume
- 99
- Issue
- 9
- Pages
- 5890-5895
- Citations
- 250
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1073/pnas.092632599