Abstract

A crucial issue in triphone-based continuous speech recognition is the large number of models to be estimated against the limited availability of training data. This problem can be alleviated by composing a triphone model from less context-dependent models. This paper introduces a new statistical framework, derived from the Bayesian principle, to perform such a composition. The potential power of this new framework is explored, both algorithmically and experimentally, by an implementation with hidden Markov modeling techniques. This implementation is applied to the recognition of the 39-phone set on the TIMIT database. The new model achieves 74.4% and 75.6% accuracy, respectively, on the core and complete test sets.
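
The abstract does not state the composition rule itself. As a hedged illustration only, the sketch below shows one way a triphone likelihood could be composed from less context-dependent models using Bayes' rule. The symbols (x for an acoustic observation; l, c, r for the left-context, centre and right-context phones) and the conditional-independence assumption are introduced here for illustration and are not taken from the paper.

```latex
% Bayes' rule applied to the context pair (l, r), given the centre phone c:
\[
  p(x \mid l, c, r) = \frac{p(l, r \mid x, c)\, p(x \mid c)}{p(l, r \mid c)}
\]
% Assumption (illustrative): l and r are conditionally independent given (x, c):
\[
  p(l, r \mid x, c) = p(l \mid x, c)\, p(r \mid x, c)
\]
% Apply Bayes' rule to each factor:
\[
  p(l \mid x, c) = \frac{p(x \mid l, c)\, p(l \mid c)}{p(x \mid c)}, \qquad
  p(r \mid x, c) = \frac{p(x \mid c, r)\, p(r \mid c)}{p(x \mid c)}
\]
% Substitute back to obtain the composed triphone likelihood:
\[
  p(x \mid l, c, r) =
  \frac{p(x \mid l, c)\, p(x \mid c, r)}{p(x \mid c)}
  \cdot \frac{p(l \mid c)\, p(r \mid c)}{p(l, r \mid c)}
\]
```

Under this assumption the last factor does not depend on x, so the composed score would need only left-biphone, right-biphone and monophone likelihoods, each of which has more training data per model than a full triphone.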

Keywords

Computer science, Hidden Markov model, TIMIT, Phone, Bayesian probability, Context (archaeology), Speech recognition, Artificial intelligence, Set (abstract data type), Context model, Markov model, Machine learning, Pattern recognition (psychology), Markov chain

Publication Info

Year
2002
Type
article
Volume
1
Pages
409-412
Citations
45
Access
Closed

Cite This

Ming Jiang, F.J. Smith (2002). Improved phone recognition using Bayesian triphone models. Vol. 1, pp. 409-412. https://doi.org/10.1109/icassp.1998.674454

Identifiers

DOI
10.1109/icassp.1998.674454