
How to improve TTS systems for emotional expressivity
- Author
- Antonio Ferreira Rebordao (UGent), Mostafa Al Masum Shaikh, Keikichi Hirose and Nobuaki Minematsu
- Abstract
- Several experiments have been carried out that revealed weaknesses of the current Text-To-Speech (TTS) systems in their emotional expressivity. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications considered, as a pre-processing stage, the use of intelligent text processing to detect affective information that can be used to tailor the parameters needed for emotional expressivity. This paper describes a technique for an automatic prosodic parameterization based on affective clues. This technique recognizes the affective information conveyed in a text and, according to its emotional connotation, assigns appropriate pitch accents and other prosodic parameters by XML-tagging. This pre-processing assists the TTS system in generating synthesized speech that contains emotional clues. The experimental results are encouraging and suggest the possibility of suitable emotional expressivity in speech synthesis.
- Keywords
- speech synthesis, emotional expressivity, TTS, MaryXML, intelligent text processing, affect sensing
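The abstract describes assigning pitch accents and other prosodic parameters by XML-tagging as a pre-processing stage before synthesis (the keywords point to MaryXML). As a rough illustration only, the following minimal Python sketch shows what such a text-to-markup step could look like: the keyword lexicon `EMOTION_KEYWORDS`, the `PROSODY` parameter table, and the SSML/MaryXML-style `<prosody>` attribute values are hypothetical placeholders, not the authors' actual affect-sensing method or parameter settings.

```python
# Illustrative sketch of an affect-to-prosody pre-processor in the spirit of the
# technique described in the abstract. The lexicon, the emotion categories, and
# the prosodic values below are hypothetical stand-ins, not the paper's system.
from xml.sax.saxutils import escape

# Hypothetical keyword lexicon standing in for the affect-sensing stage.
EMOTION_KEYWORDS = {
    "happy": ["wonderful", "great", "delighted"],
    "sad": ["terrible", "lost", "miss"],
}

# Illustrative prosodic settings per emotion (SSML/MaryXML-style attributes).
PROSODY = {
    "happy": {"rate": "+10%", "pitch": "+15%"},
    "sad": {"rate": "-15%", "pitch": "-10%"},
    "neutral": {"rate": "+0%", "pitch": "+0%"},
}

def detect_emotion(sentence: str) -> str:
    """Very rough affect sensing: return the first emotion whose keyword appears."""
    lowered = sentence.lower()
    for emotion, words in EMOTION_KEYWORDS.items():
        if any(word in lowered for word in words):
            return emotion
    return "neutral"

def to_prosody_xml(sentence: str) -> str:
    """Wrap a sentence in prosody markup matching its detected emotion."""
    params = PROSODY[detect_emotion(sentence)]
    return '<prosody rate="{rate}" pitch="{pitch}">{text}</prosody>'.format(
        text=escape(sentence), **params
    )

if __name__ == "__main__":
    print(to_prosody_xml("I am delighted to see you again!"))
    # -> <prosody rate="+10%" pitch="+15%">I am delighted to see you again!</prosody>
```

In a full pipeline the tagged sentence would be embedded in a complete MaryXML document and handed to the TTS engine; the sketch covers only the text-to-markup step, and real affect sensing would go beyond keyword matching.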
Downloads
- InterSpeech2009.pdf (full text, open access, 77.40 KB)
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-817056
- MLA
- Ferreira Rebordao, Antonio, et al. “How to Improve TTS Systems for Emotional Expressivity.” INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, International Speech Communication Association (ISCA), 2009, pp. 520–23.
- APA
- Ferreira Rebordao, A., Shaikh, M. A. M., Hirose, K., & Minematsu, N. (2009). How to improve TTS systems for emotional expressivity. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 520–523. Baixas, France: International Speech Communication Association (ISCA).
- Chicago author-date
- Ferreira Rebordao, Antonio, Mostafa Al Masum Shaikh, Keikichi Hirose, and Nobuaki Minematsu. 2009. “How to Improve TTS Systems for Emotional Expressivity.” In INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 520–23. Baixas, France: International Speech Communication Association (ISCA).
- Chicago author-date (all authors)
- Ferreira Rebordao, Antonio, Mostafa Al Masum Shaikh, Keikichi Hirose, and Nobuaki Minematsu. 2009. “How to Improve TTS Systems for Emotional Expressivity.” In INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 520–523. Baixas, France: International Speech Communication Association (ISCA).
- Vancouver
- 1. Ferreira Rebordao A, Shaikh MAM, Hirose K, Minematsu N. How to improve TTS systems for emotional expressivity. In: INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5. Baixas, France: International Speech Communication Association (ISCA); 2009. p. 520–3.
- IEEE
- [1] A. Ferreira Rebordao, M. A. M. Shaikh, K. Hirose, and N. Minematsu, “How to improve TTS systems for emotional expressivity,” in INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, Brighton, UK, 2009, pp. 520–523.
@inproceedings{817056,
  abstract     = {{Several experiments have been carried out that revealed weaknesses of the current Text-To-Speech (TTS) systems in their emotional expressivity. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications considered, as a pre-processing stage, the use of intelligent text processing to detect affective information that can be used to tailor the parameters needed for emotional expressivity. This paper describes a technique for an automatic prosodic parameterization based on affective clues. This technique recognizes the affective information conveyed in a text and, accordingly to its emotional connotation, assigns appropriate pitch accents and other prosodic parameters by XML-tagging. This pre-processing assists the TTS system to generate synthesized speech that contains emotional clues. The experimental results are encouraging and suggest the possibility of suitable emotional expressivity in speech synthesis.}},
  author       = {{Ferreira Rebordao, Antonio and Shaikh, Mostafa Al Masum and Hirose, Keikichi and Minematsu, Nobuaki}},
  booktitle    = {{INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5}},
  isbn         = {{9781615676927}},
  issn         = {{1990-9772}},
  keywords     = {{speech synthesis, emotional expressivity, TTS, MaryXML, intelligent text processing, affect sensing}},
  language     = {{eng}},
  location     = {{Brighton, UK}},
  pages        = {{520--523}},
  publisher    = {{International Speech Communication Association (ISCA)}},
  title        = {{How to improve TTS systems for emotional expressivity}},
  year         = {{2009}},
}