Abstract
Comparisons of DNA and protein sequences between humans and model organisms, including the yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, and the fruit fly Drosophila melanogaster, are a significant source of information about the function of human genes and proteins in both normal and disease states. Important questions regarding cross-species sequence comparison remain unanswered, including (1) the fraction of the metabolic, signaling, and regulatory pathways that is shared by humans and the various model organisms; and (2) the validity of functional inferences based on sequence homology. We addressed these questions by analyzing the available fractions of human, fly, nematode, and yeast genomes for orthologous protein-coding genes, applying strict criteria to distinguish between candidate orthologous and paralogous proteins. Forty-two quartets of proteins could be identified as candidate orthologs. Twenty-four Drosophila protein sequences were more similar to their human orthologs than were the corresponding nematode proteins. Analysis of sequence substitutions and evolutionary distances in this data set revealed that most C. elegans genes are evolving more rapidly than Drosophila genes, suggesting that unequal evolutionary rates may contribute to the differences in similarity to human protein sequences. The available fraction of Drosophila proteins appears to lack representatives of many protein families and domains, reflecting the relative paucity of genomic data from this species.
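The abstract does not spell out the strict criteria used to separate candidate orthologs from paralogs. As an illustration only, the sketch below uses one common criterion, the reciprocal best hit, and keeps a four-species quartet only when every cross-species pair in it is a mutual best match. The function names, the species labels, and the similarity scores are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Assumed species labels; each protein is a (species, name) tuple.
SPECIES = ("human", "fly", "worm", "yeast")


def best_hits(scores):
    """For each (query protein, target species), keep the top-scoring target.

    `scores` maps a pair of proteins to a similarity score. Ties keep the
    first pair encountered.
    """
    best = {}
    for (a, b), score in scores.items():
        for query, target in ((a, b), (b, a)):
            key = (query, target[0])
            if key not in best or score > best[key][1]:
                best[key] = (target, score)
    return {key: hit for key, (hit, _) in best.items()}


def candidate_quartets(scores):
    """Return quartets in which every cross-species pair is a reciprocal best hit."""
    best = best_hits(scores)
    humans = sorted({p for pair in scores for p in pair if p[0] == "human"})
    quartets = []
    for human in humans:
        members = [human]
        for species in SPECIES[1:]:
            hit = best.get((human, species))
            if hit is None:
                break  # no match at all in this species
            members.append(hit)
        else:
            # Require mutual best hits for all six pairs in the quartet.
            if all(
                best.get((x, y[0])) == y and best.get((y, x[0])) == x
                for x, y in combinations(members, 2)
            ):
                quartets.append(tuple(members))
    return quartets


# Made-up scores: H1/F1/W1/Y1 form a consistent quartet; H2 and W2 act as paralogs.
H1, H2 = ("human", "H1"), ("human", "H2")
F1, Y1 = ("fly", "F1"), ("yeast", "Y1")
W1, W2 = ("worm", "W1"), ("worm", "W2")
toy_scores = {
    (H1, F1): 900, (H1, W1): 700, (H1, Y1): 500,
    (F1, W1): 650, (F1, Y1): 480, (W1, Y1): 450,
    (H1, W2): 300, (H2, F1): 400, (H2, W2): 350, (H2, Y1): 200,
}
quartets = candidate_quartets(toy_scores)
```

With these toy scores, only the H1, F1, W1, Y1 quartet survives: H2 finds a best hit in every species, but the fly and yeast proteins both prefer H1, so the reciprocity check rejects it. Real analyses typically add further filters, such as alignment coverage or significance thresholds, which this sketch leaves out.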
Publication Info
- Year: 1998
- Type: article
- Volume: 8
- Issue: 6
- Pages: 590-598
- Citations: 162
- Access: Closed
Identifiers
- DOI: 10.1101/gr.8.6.590