Porting concepts from DNNs back to GMMs
- Author
- Kris Demuynck (UGent) and Fabian Triefenbach (UGent)
- Abstract
- Deep neural networks (DNNs) have been shown to outperform Gaussian Mixture Models (GMM) on a variety of speech recognition benchmarks. In this paper we analyze the differences between the DNN and GMM modeling techniques and port the best ideas from the DNN-based modeling to a GMM-based system. By going both deep (multiple layers) and wide (multiple parallel sub-models) and by sharing model parameters, we are able to close the gap between the two modeling techniques on the TIMIT database. Since the 'deep' GMMs retain the maximum-likelihood trained Gaussians as first layer, advanced techniques such as speaker adaptation and model-based noise robustness can be readily incorporated. Regardless of their similarities, the DNNs and the deep GMMs still show a sufficient amount of complementarity to allow effective system combination.
- Keywords
- Gaussian mixture models, GMM, speech recognition, deep neural networks, DNN, deep structures, uncertainty
Downloads
- demyunck 2013 ASRU.pdf (full text, open access, 179.42 KB)
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-4382368
- MLA
- Demuynck, Kris, and Fabian Triefenbach. “Porting Concepts from DNNs Back to GMMs.” 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), IEEE, 2013, pp. 356–61, doi:10.1109/ASRU.2013.6707756.
- APA
- Demuynck, K., & Triefenbach, F. (2013). Porting concepts from DNNs back to GMMs. 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–361. https://doi.org/10.1109/ASRU.2013.6707756
- Chicago author-date
- Demuynck, Kris, and Fabian Triefenbach. 2013. “Porting Concepts from DNNs Back to GMMs.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–61. New York, NY, USA: IEEE. https://doi.org/10.1109/ASRU.2013.6707756.
- Chicago author-date (all authors)
- Demuynck, Kris, and Fabian Triefenbach. 2013. “Porting Concepts from DNNs Back to GMMs.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–361. New York, NY, USA: IEEE. doi:10.1109/ASRU.2013.6707756.
- Vancouver
- 1. Demuynck K, Triefenbach F. Porting concepts from DNNs back to GMMs. In: 2013 IEEE Workshop on automatic speech recognition and understanding (ASRU). New York, NY, USA: IEEE; 2013. p. 356–61.
- IEEE
- [1] K. Demuynck and F. Triefenbach, “Porting concepts from DNNs back to GMMs,” in 2013 IEEE Workshop on automatic speech recognition and understanding (ASRU), Olomouc, Czech Republic, 2013, pp. 356–361.
@inproceedings{4382368,
  abstract  = {{Deep neural networks (DNNs) have been shown to outperform Gaussian Mixture Models (GMM) on a variety of speech recognition benchmarks. In this paper we analyze the differences between the DNN and GMM modeling techniques and port the best ideas from the DNN-based modeling to a GMM-based system. By going both deep (multiple layers) and wide (multiple parallel sub-models) and by sharing model parameters, we are able to close the gap between the two modeling techniques on the TIMIT database. Since the 'deep' GMMs retain the maximum-likelihood trained Gaussians as first layer, advanced techniques such as speaker adaptation and model-based noise robustness can be readily incorporated. Regardless of their similarities, the DNNs and the deep GMMs still show a sufficient amount of complementarity to allow effective system combination.}},
  author    = {{Demuynck, Kris and Triefenbach, Fabian}},
  booktitle = {{2013 IEEE Workshop on automatic speech recognition and understanding (ASRU)}},
  isbn      = {{9781479927562}},
  keywords  = {{Gaussian mixture models, GMM, speech recognition, deep neural networks, DNN, deep structures, uncertainty}},
  language  = {{eng}},
  location  = {{Olomouc, Czech Republic}},
  pages     = {{356--361}},
  publisher = {{IEEE}},
  title     = {{Porting concepts from DNNs back to GMMs}},
  url       = {{http://doi.org/10.1109/ASRU.2013.6707756}},
  year      = {{2013}},
}