Abstract

A computer program called BLASTX was previously shown to be effective in identifying and assigning putative function to likely protein coding regions by detecting significant similarity between a conceptually translated nucleotide query sequence and members of a protein sequence database. We present and assess the sensitivity of a new option to this software tool, herein called BLASTC, which employs information obtained from biases in codon utilization, along with the information obtained from sequence similarity. A rationale for combining these diverse information sources was derived, and analyses of the information available from codon utilization in several species were performed, with wide variation seen. Codon bias information was found on average to improve the sensitivity of detection of short coding regions of human origin by about a factor of 5. The implications of combining information sources on the interpretation of positive findings are discussed.
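One simple way to picture how two such information sources might be combined is to express each as a log-odds score in bits and add them. The sketch below is illustrative only and is not taken from the paper: the function names, the uniform 1/64 background, the toy codon frequencies, and the assumption that the two scores are independent (and therefore additive) are all hypothetical.

```python
import math

def codon_log_odds_bits(seq, coding_freqs, background_freqs=None):
    """Sum log2(P_coding / P_background) over in-frame codons of seq.

    coding_freqs: dict mapping codon -> frequency in known coding regions
                  (hypothetical table supplied by the caller).
    background_freqs: optional dict of background codon frequencies;
                  a uniform 1/64 is assumed when it is omitted.
    """
    total = 0.0
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = seq[i:i + 3].upper()
        p = coding_freqs.get(codon)
        if p is None:
            continue  # skip codons with no frequency (e.g. ambiguous bases)
        q = background_freqs.get(codon, 1 / 64) if background_freqs else 1 / 64
        total += math.log2(p / q)
    return total

def combined_score_bits(similarity_bits, codon_bits):
    """Add the two scores, assuming they are independent evidence (an assumption)."""
    return similarity_bits + codon_bits

# Toy frequencies chosen only to make the arithmetic easy to follow.
toy_freqs = {"GCC": 1 / 16, "GCG": 1 / 128}
favoured = codon_log_odds_bits("GCCGCC", toy_freqs)   # two over-represented codons
disfavoured = codon_log_odds_bits("GCG", toy_freqs)   # one under-represented codon
total = combined_score_bits(10.0, favoured)
```

Under this additive assumption, a region whose codons are over-represented relative to background gains score, so a weak similarity hit can cross a significance threshold that it would miss on similarity alone; under-represented codons subtract score in the same way.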

Keywords

Coding (social sciences); Similarity (geometry); Coding region; Sequence (biology); Computer science; Codon usage bias; Computational biology; Data mining; Genetics; Biology; Artificial intelligence; Mathematics; Gene; Genome; Statistics

Publication Info

Year: 1994
Type: article
Volume: 1
Issue: 1
Pages: 39-50
Citations: 149
Access: Closed

Cite This

David J. States, Warren Gish (1994). QGB: Combined Use of Sequence Similarity and Codon Bias for Coding Region Identification. Journal of Computational Biology, 1(1), 39-50. https://doi.org/10.1089/cmb.1994.1.39

Identifiers

DOI: 10.1089/cmb.1994.1.39