Ghent University Academic Bibliography

Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition

Fabian Triefenbach, Azarakhsh Jalalvand (UGent), Kris Demuynck (UGent) and Jean-Pierre Martens (UGent) (2013) 14th Annual conference of the International Speech Communication Association, Proceedings. p. 3342-3346
abstract
Reservoir Computing (RC) has recently been introduced as an interesting alternative for acoustic modeling. For phone and continuous digit recognition, the reservoir approach obtained quite promising results. In this work, we further elaborate this concept by porting some well-known techniques used to enhance recognition rates of GMM-based models to Reservoir Computing. In particular, we introduce context-dependent (CD) triphone states to model co-articulation and pronunciation mismatches arising from an imperfect lexicon. We also propose to incorporate two speaker normalization methods in the feature space, namely mean & variance normalization and vocal tract length normalization. The impact of the investigated techniques is studied in the context of phone recognition on the TIMIT corpus. Our CD-RC-HMM hybrid yields a speaker-independent phone error rate (PER) of 22% and a speaker-dependent PER of 20.5%. By combining GMM and RC-based likelihoods at the state level, these scores can be reduced further.
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-3237634
author
Fabian Triefenbach, Azarakhsh Jalalvand, Kris Demuynck and Jean-Pierre Martens
organization
year
2013
type
conference
publication status
published
subject
in
14th Annual conference of the International Speech Communication Association, Proceedings
pages
3342 - 3346
conference name
14th Annual conference of the International Speech Communication Association
conference location
Lyon, France
conference start
2013-08-25
conference end
2013-08-29
ISSN
2308-457X
ISBN
9781629934433
language
English
UGent publication?
yes
classification
C1
copyright statement
I have transferred the copyright for this publication to the publisher
id
3237634
handle
http://hdl.handle.net/1854/LU-3237634
date created
2013-06-07 10:41:50
date last changed
2016-12-19 15:37:32
@inproceedings{3237634,
  abstract     = {Reservoir Computing (RC) has recently been introduced as an interesting alternative for acoustic modeling. For phone and continuous digit recognition, the reservoir approach obtained quite promising results. In this work, we further elaborate this concept by porting some well-known techniques used to enhance recognition rates of GMM-based models to Reservoir Computing. In particular, we introduce context-dependent (CD) triphone states to model co-articulation and pronunciation mismatches arising from an imperfect lexicon. We also propose to incorporate two speaker normalization methods in the feature space, namely mean \& variance normalization and vocal tract length normalization. The impact of the investigated techniques is studied in the context of phone recognition on the TIMIT corpus. Our CD-RC-HMM hybrid yields a speaker-independent phone error rate (PER) of 22\% and a speaker-dependent PER of 20.5\%. By combining GMM and RC-based likelihoods at the state level, these scores can be reduced further.},
  author       = {Triefenbach, Fabian and Jalalvand, Azarakhsh and Demuynck, Kris and Martens, Jean-Pierre},
  booktitle    = {14th Annual conference of the International Speech Communication Association, Proceedings},
  isbn         = {9781629934433},
  issn         = {2308-457X},
  language     = {eng},
  location     = {Lyon, France},
  pages        = {3342--3346},
  title        = {Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition},
  year         = {2013},
}

Chicago
Triefenbach, Fabian, Azarakhsh Jalalvand, Kris Demuynck, and Jean-Pierre Martens. 2013. “Context-dependent Modeling and Speaker Normalization Applied to Reservoir-based Phone Recognition.” In 14th Annual Conference of the International Speech Communication Association, Proceedings, 3342–3346.
APA
Triefenbach, F., Jalalvand, A., Demuynck, K., & Martens, J.-P. (2013). Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition. 14th Annual conference of the International Speech Communication Association, Proceedings (pp. 3342–3346). Presented at the 14th Annual conference of the International Speech Communication Association.
Vancouver
1. Triefenbach F, Jalalvand A, Demuynck K, Martens J-P. Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition. 14th Annual conference of the International Speech Communication Association, Proceedings. 2013. p. 3342–6.
MLA
Triefenbach, Fabian, Azarakhsh Jalalvand, Kris Demuynck, et al. “Context-dependent Modeling and Speaker Normalization Applied to Reservoir-based Phone Recognition.” 14th Annual Conference of the International Speech Communication Association, Proceedings. 2013. 3342–3346. Print.