
Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour

Authors: Pieter Wolfert, Gustav Eje Henter, Tony Belpaeme
Abstract
This paper compares three methods for evaluating computer-generated motion behaviour for animated characters: two commonly used direct rating methods and a newly designed questionnaire. The questionnaire is specifically designed to measure the human-likeness, appropriateness, and intelligibility of the generated motion. Furthermore, this study investigates the suitability of these evaluation tools for assessing subtle forms of human behaviour, such as the subdued motion cues shown when listening to someone. This paper reports six user studies, namely studies that directly rate the appropriateness and human-likeness of a computer character’s motion, along with studies that instead rely on a questionnaire to measure the quality of the motion. As test data, we used the motion generated by two generative models and recorded human gestures, which served as a gold standard. Our findings indicate that when evaluating gesturing motion, the direct rating of human-likeness and appropriateness is to be preferred over a questionnaire. However, when assessing the subtle motion of a computer character, even the direct rating method yields less conclusive results. Despite demonstrating high internal consistency, our questionnaire proves to be less sensitive than directly rating the quality of the motion. The results provide insights into the evaluation of human motion behaviour and highlight the complexities involved in capturing subtle nuances in nonverbal communication. These findings have implications for the development and improvement of motion generation models and can guide researchers in selecting appropriate evaluation methodologies for specific aspects of human behaviour.
Keywords
Computer Science Applications, ROBOT, ANTHROPOMORPHISM, GESTURE, human-computer interaction, embodied conversational agents, subjective, evaluations

Downloads

  • applsci-14-01460-v3.pdf — full text (Published version) | open access | PDF | 880.29 KB

Citation


MLA
Wolfert, Pieter, et al. “Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour.” APPLIED SCIENCES-BASEL, vol. 14, no. 4, 2024, doi:10.3390/app14041460.
APA
Wolfert, P., Henter, G. E., & Belpaeme, T. (2024). Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour. APPLIED SCIENCES-BASEL, 14(4). https://doi.org/10.3390/app14041460
Chicago author-date
Wolfert, Pieter, Gustav Eje Henter, and Tony Belpaeme. 2024. “Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour.” APPLIED SCIENCES-BASEL 14 (4). https://doi.org/10.3390/app14041460.
Chicago author-date (all authors)
Wolfert, Pieter, Gustav Eje Henter, and Tony Belpaeme. 2024. “Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour.” APPLIED SCIENCES-BASEL 14 (4). doi:10.3390/app14041460.
Vancouver
1. Wolfert P, Henter GE, Belpaeme T. Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour. APPLIED SCIENCES-BASEL. 2024;14(4).
IEEE
[1] P. Wolfert, G. E. Henter, and T. Belpaeme, “Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour,” APPLIED SCIENCES-BASEL, vol. 14, no. 4, 2024.
@article{01HQQPNB7K6W1M47KZHSNW45P7,
  abstract     = {{This paper compares three methods for evaluating computer-generated motion behaviour for animated characters: two commonly used direct rating methods and a newly designed questionnaire. The questionnaire is specifically designed to measure the human-likeness, appropriateness, and intelligibility of the generated motion. Furthermore, this study investigates the suitability of these evaluation tools for assessing subtle forms of human behaviour, such as the subdued motion cues shown when listening to someone. This paper reports six user studies, namely studies that directly rate the appropriateness and human-likeness of a computer character’s motion, along with studies that instead rely on a questionnaire to measure the quality of the motion. As test data, we used the motion generated by two generative models and recorded human gestures, which served as a gold standard. Our findings indicate that when evaluating gesturing motion, the direct rating of human-likeness and appropriateness is to be preferred over a questionnaire. However, when assessing the subtle motion of a computer character, even the direct rating method yields less conclusive results. Despite demonstrating high internal consistency, our questionnaire proves to be less sensitive than directly rating the quality of the motion. The results provide insights into the evaluation of human motion behaviour and highlight the complexities involved in capturing subtle nuances in nonverbal communication. These findings have implications for the development and improvement of motion generation models and can guide researchers in selecting appropriate evaluation methodologies for specific aspects of human behaviour.}},
  articleno    = {{1460}},
  author       = {{Wolfert, Pieter and Henter, Gustav Eje and Belpaeme, Tony}},
  issn         = {{2076-3417}},
  journal      = {{APPLIED SCIENCES-BASEL}},
  keywords     = {{Computer Science Applications,ROBOT,ANTHROPOMORPHISM,GESTURE,human-computer interaction,embodied conversational agents,subjective,evaluations}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{20}},
  title        = {{Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour}},
  url          = {{https://doi.org/10.3390/app14041460}},
  volume       = {{14}},
  year         = {{2024}},
}
