Abstract

Recently, Deep Belief Networks (DBNs) have been proposed for phone recognition and were found to achieve highly competitive performance. In the original DBNs, only frame-level information was used for training the DBN weights, while it has long been known that sequential or full-sequence information can help improve speech recognition accuracy. In this paper we investigate approaches to optimizing the DBN weights, state-to-state transition parameters, and language model scores using a sequential discriminative training criterion. We describe and analyze the proposed training algorithm and strategy, and discuss practical issues and how they affect the final results. We show that DBNs learned using the sequence-based training criterion outperform those learned with the frame-based criterion for both three-layer and six-layer models, although the optimization procedure for the deeper DBN is more difficult under the sequence-based criterion.
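For orientation, a sequence-level conditional log-likelihood objective of the general kind referred to in the abstract can be sketched as follows; the notation below is illustrative and not taken from the paper. Here phi_theta(l_t, v_t) stands for the DBN's score for label l_t at frame t, gamma collects state-to-state transition parameters, and the normalizer runs over competing label sequences (a language-model weight could be folded into the transition term).

% Illustrative sequence-level conditional log-likelihood (notation is ours, not the paper's):
% phi_theta(l_t, v_t): DBN score for label l_t at frame t
% gamma_{i,j}:         state-to-state transition parameters
% The second term normalizes over all competing label sequences l'_{1:T}.
\begin{align*}
J(\theta, \gamma)
  &= \log p(l_{1:T} \mid v_{1:T}) \\
  &= \sum_{t=1}^{T} \bigl( \phi_\theta(l_t, v_t) + \gamma_{l_{t-1}, l_t} \bigr)
     - \log \sum_{l'_{1:T}} \exp \sum_{t=1}^{T}
       \bigl( \phi_\theta(l'_t, v_t) + \gamma_{l'_{t-1}, l'_t} \bigr).
\end{align*}

Maximizing an objective of this form adjusts the frame-level scores and the transition parameters jointly, which is what distinguishes full-sequence training from purely frame-based training of the DBN weights.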

Keywords

Computer science, Training, Speech recognition, Sequence, Artificial intelligence, Natural language processing


Publication Info

Year
2010
Type
article
Citations
213
Access
Closed


Citation Metrics

213 (OpenAlex)

Cite This

Abdelrahman Mohamed, Dong Yu, Li Deng (2010). Investigation of full-sequence training of deep belief networks for speech recognition. Proc. Interspeech 2010. https://doi.org/10.21437/interspeech.2010-304

Identifiers

DOI
10.21437/interspeech.2010-304