
No more mumbles: enhancing robot intelligibility through speech adaptation

Qiaoqiao Ren (UGent) , Yuanbo Hou (UGent) , Dick Botteldooren (UGent) and Tony Belpaeme (UGent)
Abstract
Spoken language interaction is at the heart of interpersonal communication, and people flexibly adapt their speech to different individuals and environments. It is surprising that robots, and by extension other digital devices, are not equipped to adapt their speech and instead rely on fixed speech parameters, which often hinder comprehension by the user. We conducted a speech comprehension study involving 39 participants who were exposed to different environmental and contextual conditions. During the experiment, the robot articulated words using different vocal parameters, and the participants were tasked with both recognising the spoken words and rating their subjective impression of the robot's speech. The experiment's primary outcome shows that spaces with good acoustic quality positively correlate with intelligibility and user experience. However, increasing the distance between the user and the robot worsened the user experience, while distracting background sounds significantly reduced speech recognition accuracy and user satisfaction. We next built an adaptive voice for the robot. For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting. We present a prediction model that rates how annoying the ambient acoustic environment is and, consequently, how hard it is to understand someone in this setting. Then, we develop a convolutional neural network model to adapt the robot's speech parameters to different users and spaces, while taking into account the influence of ambient acoustics on intelligibility. Finally, we present an evaluation with 27 users, demonstrating superior intelligibility and user experience with adaptive voice parameters compared to a fixed voice.
Keywords
NEURAL-NETWORKS, NOISE, REVERBERATION, PERCEPTION, PITCH, Robots, Auditory system, User experience, Task analysis, Particle measurements, Noise measurement, Atmospheric measurements, Human-centered robotics, design and human factors, social HRI

Downloads

  • DS765 acc.pdf — full text (Accepted manuscript) | open access | PDF | 1.86 MB
  • (...).pdf — full text (Published version) | UGent only | PDF | 1.01 MB

Citation

Please use this URL to cite or link to this publication:

MLA
Ren, Qiaoqiao, et al. “No More Mumbles : Enhancing Robot Intelligibility through Speech Adaptation.” IEEE ROBOTICS AND AUTOMATION LETTERS, vol. 9, no. 7, 2024, pp. 6162–69, doi:10.1109/LRA.2024.3401117.
APA
Ren, Q., Hou, Y., Botteldooren, D., & Belpaeme, T. (2024). No more mumbles : enhancing robot intelligibility through speech adaptation. IEEE ROBOTICS AND AUTOMATION LETTERS, 9(7), 6162–6169. https://doi.org/10.1109/LRA.2024.3401117
Chicago author-date
Ren, Qiaoqiao, Yuanbo Hou, Dick Botteldooren, and Tony Belpaeme. 2024. “No More Mumbles : Enhancing Robot Intelligibility through Speech Adaptation.” IEEE ROBOTICS AND AUTOMATION LETTERS 9 (7): 6162–69. https://doi.org/10.1109/LRA.2024.3401117.
Chicago author-date (all authors)
Ren, Qiaoqiao, Yuanbo Hou, Dick Botteldooren, and Tony Belpaeme. 2024. “No More Mumbles : Enhancing Robot Intelligibility through Speech Adaptation.” IEEE ROBOTICS AND AUTOMATION LETTERS 9 (7): 6162–6169. doi:10.1109/LRA.2024.3401117.
Vancouver
1.
Ren Q, Hou Y, Botteldooren D, Belpaeme T. No more mumbles : enhancing robot intelligibility through speech adaptation. IEEE ROBOTICS AND AUTOMATION LETTERS. 2024;9(7):6162–9.
IEEE
[1]
Q. Ren, Y. Hou, D. Botteldooren, and T. Belpaeme, “No more mumbles : enhancing robot intelligibility through speech adaptation,” IEEE ROBOTICS AND AUTOMATION LETTERS, vol. 9, no. 7, pp. 6162–6169, 2024.
@article{01J0JSHMSGVJ81R3HF2JXFA6SS,
  abstract     = {{Spoken language interaction is at the heart of interpersonal communication, and people flexibly adapt their speech to different individuals and environments. It is surprising that robots, and by extension other digital devices, are not equipped to adapt their speech and instead rely on fixed speech parameters, which often hinder comprehension by the user. We conducted a speech comprehension study involving 39 participants who were exposed to different environmental and contextual conditions. During the experiment, the robot articulated words using different vocal parameters, and the participants were tasked with both recognising the spoken words and rating their subjective impression of the robot's speech. The experiment's primary outcome shows that spaces with good acoustic quality positively correlate with intelligibility and user experience. However, increasing the distance between the user and the robot worsened the user experience, while distracting background sounds significantly reduced speech recognition accuracy and user satisfaction. We next built an adaptive voice for the robot. For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting. We present a prediction model that rates how annoying the ambient acoustic environment is and, consequently, how hard it is to understand someone in this setting. Then, we develop a convolutional neural network model to adapt the robot's speech parameters to different users and spaces, while taking into account the influence of ambient acoustics on intelligibility. Finally, we present an evaluation with 27 users, demonstrating superior intelligibility and user experience with adaptive voice parameters compared to a fixed voice.}},
  author       = {{Ren, Qiaoqiao and Hou, Yuanbo and Botteldooren, Dick and Belpaeme, Tony}},
  issn         = {{2377-3766}},
  journal      = {{IEEE ROBOTICS AND AUTOMATION LETTERS}},
  keywords     = {{NEURAL-NETWORKS,NOISE,REVERBERATION,PERCEPTION,PITCH,Robots,Auditory system,User experience,Task analysis,Particle measurements,Noise measurement,Atmospheric measurements,Human-centered robotics,design and human factors,social HRI}},
  language     = {{eng}},
  number       = {{7}},
  pages        = {{6162--6169}},
  title        = {{No more mumbles : enhancing robot intelligibility through speech adaptation}},
  url          = {{https://doi.org/10.1109/LRA.2024.3401117}},
  volume       = {{9}},
  year         = {{2024}},
}
