Abstract
Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in the absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes.
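To make the Type III idea (co-occurrence of genes across genomes) concrete, the sketch below compares toy phylogenetic profiles. It is an illustration only: the gene and genome names, the Jaccard score, and the 0.75 threshold are all assumptions made for this example and are not the scoring procedure used in the article.

```python
from itertools import combinations

# Hypothetical toy data: gene -> set of genomes in which an ortholog is present.
profiles = {
    "geneA": {"g1", "g2", "g3", "g5"},
    "geneB": {"g1", "g2", "g3", "g5"},
    "geneC": {"g2", "g4"},
    "geneD": {"g1", "g2", "g3", "g4", "g5"},
}


def jaccard(a, b):
    """Genomes containing both genes divided by genomes containing either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_occurring_pairs(profiles, threshold=0.75):
    """Return (gene1, gene2, score) for pairs whose profiles overlap strongly.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            pairs.append((g1, g2, score))
    return pairs


if __name__ == "__main__":
    for g1, g2, score in co_occurring_pairs(profiles):
        print(f"{g1} - {g2}: {score:.2f}")
```

In this toy set, geneD is present in every genome, so it scores well against almost any widespread gene. A profile like that carries little information, which is one reason a practical method would need a significance measure rather than a raw overlap score.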
Publication Info
- Year: 2000
- Type: article
- Volume: 10
- Issue: 8
- Pages: 1204-1210
- Citations: 509
- Access: Closed
Identifiers
- DOI: 10.1101/gr.10.8.1204
- PMID: 10958638
- PMCID: PMC310926