Abstract

The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution's parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described 'island' method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.
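As a rough illustration of the extreme-value description above, the sketch below uses the standard Gumbel form for local alignment scores, in which the expected number of alignments scoring at least x between sequences of lengths m and n is E = K·m·n·exp(-λx) and the P-value is 1 - exp(-E). The fitting routine is a simple method-of-moments estimate, used here as a stand-in for curve fitting; it is not the island method, and it ignores the finite-length edge effects and score discreteness that the paper addresses. All function names are hypothetical, and the numbers in the usage note are placeholders, not values from the paper.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel distribution


def evalue(score, m, n, lam, K):
    """Expected number of local alignments scoring >= score: K*m*n*exp(-lam*score)."""
    return K * m * n * math.exp(-lam * score)


def pvalue(score, m, n, lam, K):
    """P(S >= score) = 1 - exp(-E) under the extreme-value approximation."""
    return -math.expm1(-evalue(score, m, n, lam, K))


def sample_gumbel_scores(count, m, n, lam, K, rng):
    """Draw continuous scores from the Gumbel law with location ln(K*m*n)/lam.

    This only simulates the limiting distribution; it does not align sequences.
    """
    location = math.log(K * m * n) / lam
    scores = []
    for _ in range(count):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() never returns 1.0
            u = rng.random()
        scores.append(location - math.log(-math.log(u)) / lam)
    return scores


def fit_gumbel_moments(scores, m, n):
    """Method-of-moments estimates of (lam, K) from optimal scores.

    Uses Var = pi^2 / (6 lam^2) and Mean = location + gamma / lam.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    lam = math.pi / (math.sqrt(6.0) * sd)
    location = mean - EULER_GAMMA / lam
    K = math.exp(lam * location) / (m * n)
    return lam, K
```

For example, `fit_gumbel_moments(sample_gumbel_scores(20000, 300, 300, 0.27, 0.04, random.Random(0)), 300, 300)` should return estimates near the placeholder inputs λ = 0.27 and K = 0.04. The estimate of K is noticeably less stable than that of λ, because K depends exponentially on the fitted location.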

Keywords

Biology; Distribution (mathematics); Sequence (biology); Statistics; Estimation theory; Statistical model; Statistical parameter; Statistical analysis; Statistical physics; Algorithm; Mathematics; Biological system; Mathematical analysis; Physics; Genetics

MeSH Terms

Algorithms; Computational Biology; Likelihood Functions; Sequence Alignment; Sequence Analysis, Protein; Statistical Distributions

Publication Info

Year: 2001
Type: article
Volume: 29
Issue: 2
Pages: 351-361
Citations: 182
Access: Closed

Citation Metrics

OpenAlex: 182
Influential: 20
CrossRef: 133

Cite This

Stephen F. Altschul (2001). The estimation of statistical parameters for local alignment score distributions. Nucleic Acids Research, 29(2), 351-361. https://doi.org/10.1093/nar/29.2.351

Identifiers

DOI: 10.1093/nar/29.2.351
PMID: 11139604
PMCID: PMC29669
