
Porting concepts from DNNs back to GMMs

Kris Demuynck (UGent) and Fabian Triefenbach (UGent)
Abstract
Deep neural networks (DNNs) have been shown to outperform Gaussian Mixture Models (GMM) on a variety of speech recognition benchmarks. In this paper we analyze the differences between the DNN and GMM modeling techniques and port the best ideas from the DNN-based modeling to a GMM-based system. By going both deep (multiple layers) and wide (multiple parallel sub-models) and by sharing model parameters, we are able to close the gap between the two modeling techniques on the TIMIT database. Since the 'deep' GMMs retain the maximum-likelihood trained Gaussians as first layer, advanced techniques such as speaker adaptation and model-based noise robustness can be readily incorporated. Regardless of their similarities, the DNNs and the deep GMMs still show a sufficient amount of complementarity to allow effective system combination.
Keywords
Gaussian mixture models, GMM, deep neural networks, DNN, speech recognition, uncertainty, deep structures

Downloads

  • demyunck 2013 ASRU.pdf — full text | open access | PDF | 179.42 KB

Citation

Please use this URL to cite or link to this publication:

MLA
Demuynck, Kris, and Fabian Triefenbach. “Porting Concepts from DNNs Back to GMMs.” 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), IEEE, 2013, pp. 356–61, doi:10.1109/ASRU.2013.6707756.
APA
Demuynck, K., & Triefenbach, F. (2013). Porting concepts from DNNs back to GMMs. 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–361. https://doi.org/10.1109/ASRU.2013.6707756
Chicago author-date
Demuynck, Kris, and Fabian Triefenbach. 2013. “Porting Concepts from DNNs Back to GMMs.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–61. New York, NY, USA: IEEE. https://doi.org/10.1109/ASRU.2013.6707756.
Chicago author-date (all authors)
Demuynck, Kris, and Fabian Triefenbach. 2013. “Porting Concepts from DNNs Back to GMMs.” In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 356–361. New York, NY, USA: IEEE. doi:10.1109/ASRU.2013.6707756.
Vancouver
1. Demuynck K, Triefenbach F. Porting concepts from DNNs back to GMMs. In: 2013 IEEE Workshop on automatic speech recognition and understanding (ASRU). New York, NY, USA: IEEE; 2013. p. 356–61.
IEEE
[1] K. Demuynck and F. Triefenbach, “Porting concepts from DNNs back to GMMs,” in 2013 IEEE Workshop on automatic speech recognition and understanding (ASRU), Olomouc, Czech Republic, 2013, pp. 356–361.
@inproceedings{4382368,
  abstract     = {{Deep neural networks (DNNs) have been shown to outperform Gaussian Mixture Models (GMM) on a variety of speech recognition benchmarks. In this paper we analyze the differences between the DNN and GMM modeling techniques and port the best ideas from the DNN-based modeling to a GMM-based system. By going both deep (multiple layers) and wide (multiple parallel sub-models) and by sharing model parameters, we are able to close the gap between the two modeling techniques on the TIMIT database. Since the 'deep' GMMs retain the maximum-likelihood trained Gaussians as first layer, advanced techniques such as speaker adaptation and model-based noise robustness can be readily incorporated. Regardless of their similarities, the DNNs and the deep GMMs still show a sufficient amount of complementarity to allow effective system combination.}},
  author       = {{Demuynck, Kris and Triefenbach, Fabian}},
  booktitle    = {{2013 IEEE Workshop on automatic speech recognition and understanding (ASRU)}},
  isbn         = {{9781479927562}},
  keywords     = {{Gaussian mixture models,UNCERTAINTY,GMM,speech recognition,deep neural networks,DNN,SPEECH RECOGNITION,deep structures}},
  language     = {{eng}},
  location     = {{Olomouc, Czech Republic}},
  pages        = {{356--361}},
  publisher    = {{IEEE}},
  title        = {{Porting concepts from DNNs back to GMMs}},
  url          = {{https://doi.org/10.1109/ASRU.2013.6707756}},
  year         = {{2013}},
}
