Improved phone recognition using Bayesian triphone models

Abstract

A crucial issue in triphone-based continuous speech recognition is the large number of models to be estimated relative to the limited amount of training data available. This problem can be alleviated by composing a triphone model from less context-dependent models. This paper introduces a new statistical framework, derived from the Bayesian principle, to perform such a composition. The potential of this framework is explored, both algorithmically and experimentally, through an implementation based on hidden Markov modeling techniques. This implementation is applied to recognition of the 39-phone set on the TIMIT database. The new model achieves 74.4% and 75.6% accuracy on the core and complete test sets, respectively.
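The abstract does not give the composition formula, so the following is only an illustrative sketch of how a Bayesian composition of this kind could look, not the authors' implementation. It assumes that the left and right contexts are conditionally independent given the observation and the centre phone, which by Bayes' rule yields p(x | l, c, r) proportional to p(x | l, c) p(x | c, r) / p(x | c). The function names, the one-dimensional Gaussian densities, and all numeric parameters are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (hypothetical stand-in for an HMM state density)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def compose_triphone_loglik(log_left, log_right, log_mono):
    """Combine less context-dependent scores into a triphone score.

    Assumed rule (not taken from the paper's text):
        log p(x | l, c, r) ~= log p(x | l, c) + log p(x | c, r) - log p(x | c)
    The result is correct only up to an additive constant that does not depend on x.
    """
    return log_left + log_right - log_mono


# Hypothetical parameters for one observation x.
x = 0.3
log_mono = gaussian_logpdf(x, mean=0.0, var=1.0)   # context-independent model p(x | c)
log_left = gaussian_logpdf(x, mean=0.2, var=0.5)   # left-context model p(x | l, c)
log_right = gaussian_logpdf(x, mean=0.4, var=0.5)  # right-context model p(x | c, r)

score = compose_triphone_loglik(log_left, log_right, log_mono)
print(score)
```

Under this assumed rule, only left-context, right-context, and context-independent models need to be trained, and each of them sees far more data than an individual triphone would. When the context-dependent models carry no extra information (both equal the context-independent model), the composed score reduces to the context-independent score.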

Keywords

Computer science, Hidden Markov model, TIMIT, Phone, Bayesian probability, Context (archaeology), Speech recognition, Artificial intelligence, Set (abstract data type), Context model, Markov model, Machine learning, Pattern recognition (psychology), Markov chain

Affiliated Institutions

Queen's University Belfast GB

Publication Info

Year: 2002
Type: article
Volume: 1
Pages: 409-412
Citations: 45
Access: Closed

Cite This

APA Style

Ming Jiang, F.J. Smith (2002). Improved phone recognition using Bayesian triphone models. 1, 409-412. https://doi.org/10.1109/icassp.1998.674454

Identifiers

DOI: 10.1109/icassp.1998.674454