Sentiment analysis on video transcripts : comparing the value of textual and multimodal annotations
- Author
- Quanqi Du (UGent), Loic De Langhe (UGent), Els Lefever (UGent) and Veronique Hoste (UGent)
- Abstract
- This study explores the differences between textual and multimodal sentiment annotations on videos and their impact on transcript-based sentiment modelling. Using the UniC and CH-SIMS datasets, which are annotated at both the unimodal and multimodal level, we conducted a statistical analysis and sentiment modelling experiments. Results reveal significant differences between the two annotation types, with textual annotations yielding better performance in sentiment modelling and demonstrating superior generalization ability. These findings highlight the challenges of cross-modality generalization and provide insights for advancing sentiment analysis.
- Keywords
- Sentiment analysis, Video transcripts, Textual annotations, Multimodal annotations
Downloads
- 2025.wnut-1.2.pdf | full text (Published version) | open access | 2.53 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-01JSYBNZN7898H8AEHD9H556S8
- MLA
- Du, Quanqi, et al. “Sentiment Analysis on Video Transcripts : Comparing the Value of Textual and Multimodal Annotations.” Proceedings of the 10th Workshop on Noisy and User-Generated Text (W-NUT 2025), edited by JinYeong Bak et al., Association for Computational Linguistics (ACL), 2025, pp. 10–15.
- APA
- Du, Q., De Langhe, L., Lefever, E., & Hoste, V. (2025). Sentiment analysis on video transcripts : comparing the value of textual and multimodal annotations. In J. Bak, R. van der Goot, H. Jang, W. Buaphet, A. Ramponi, W. Xu, & A. Ritter (Eds.), Proceedings of the 10th Workshop on Noisy and User-generated Text (W-NUT 2025) (pp. 10–15). Association for Computational Linguistics (ACL).
- Chicago author-date
- Du, Quanqi, Loic De Langhe, Els Lefever, and Veronique Hoste. 2025. “Sentiment Analysis on Video Transcripts : Comparing the Value of Textual and Multimodal Annotations.” In Proceedings of the 10th Workshop on Noisy and User-Generated Text (W-NUT 2025), edited by JinYeong Bak, Rob van der Goot, Hyeju Jang, Weerayut Buaphet, Alan Ramponi, Wei Xu, and Alan Ritter, 10–15. Association for Computational Linguistics (ACL).
- Chicago author-date (all authors)
- Du, Quanqi, Loic De Langhe, Els Lefever, and Veronique Hoste. 2025. “Sentiment Analysis on Video Transcripts : Comparing the Value of Textual and Multimodal Annotations.” In Proceedings of the 10th Workshop on Noisy and User-Generated Text (W-NUT 2025), edited by JinYeong Bak, Rob van der Goot, Hyeju Jang, Weerayut Buaphet, Alan Ramponi, Wei Xu, and Alan Ritter, 10–15. Association for Computational Linguistics (ACL).
- Vancouver
- 1. Du Q, De Langhe L, Lefever E, Hoste V. Sentiment analysis on video transcripts : comparing the value of textual and multimodal annotations. In: Bak J, Goot R van der, Jang H, Buaphet W, Ramponi A, Xu W, et al., editors. Proceedings of the 10th Workshop on Noisy and User-generated Text (W-NUT 2025). Association for Computational Linguistics (ACL); 2025. p. 10–5.
- IEEE
- [1] Q. Du, L. De Langhe, E. Lefever, and V. Hoste, “Sentiment analysis on video transcripts : comparing the value of textual and multimodal annotations,” in Proceedings of the 10th Workshop on Noisy and User-generated Text (W-NUT 2025), Albuquerque, New Mexico, USA, 2025, pp. 10–15.
@inproceedings{01JSYBNZN7898H8AEHD9H556S8,
abstract = {{This study explores the differences between textual and multimodal sentiment annotations on videos and their impact on transcript-based sentiment modelling. Using the UniC and CH-SIMS datasets which are annotated at both the unimodal and multimodal level, we conducted a statistical analysis and sentiment modelling experiments. Results reveal significant differences between the two annotation types, with textual annotations yielding better performance in sentiment modelling and demonstrating superior generalization ability. These findings highlight the challenges of cross-modality generalization and provide insights for advancing sentiment analysis.}},
author = {{Du, Quanqi and De Langhe, Loic and Lefever, Els and Hoste, Veronique}},
booktitle = {{Proceedings of the 10th Workshop on Noisy and User-generated Text (W-NUT 2025)}},
editor = {{Bak, JinYeong and Goot, Rob van der and Jang, Hyeju and Buaphet, Weerayut and Ramponi, Alan and Xu, Wei and Ritter, Alan}},
isbn = {{9798891762329}},
keywords = {{Sentiment analysis, Video transcripts, Textual annotations, Multimodal annotations}},
language = {{eng}},
location = {{Albuquerque, New Mexico, USA}},
pages = {{10--15}},
publisher = {{Association for Computational Linguistics (ACL)}},
title = {{Sentiment analysis on video transcripts : comparing the value of textual and multimodal annotations}},
url = {{https://aclanthology.org/2025.wnut-1.2/}},
year = {{2025}},
}