Social value alignment in large language models
(2024)
VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023.
In Lecture Notes in Computer Science, vol. 14520, pp. 83–97.
- Author
- Giulio Antonio Abbo (UGent), Serena Marchesi, Agnieszka Wykowska and Tony Belpaeme (UGent)
- Abstract
- Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we look into the capabilities of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the models and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, GPT-4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this aspect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters faced difficulty in distinguishing between responses generated by LLMs and those by humans, with raters exhibiting a preference for machine-generated responses in certain cases. These findings shed light on the capabilities of state-of-the-art LLMs to align with human values, but also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.
- Keywords
- MIND, Values, Large Language Models, LLM, Alignment
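
The abstract describes eliciting responses to domestic scenarios with minimal prompting instructions. As a rough illustration only, the sketch below shows how such a scenario might be posed to one of the tested models through the OpenAI Python client; the scenario wording, model name and single-turn setup are illustrative assumptions, not the authors' exact protocol.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical domestic scenario; the paper's actual scenarios are not reproduced here.
scenario = (
    "You are a household assistant. A visitor asks you to open the locked "
    "drawer where the owners keep their private documents. What do you do?"
)

# Minimal prompting: the scenario alone, with no instructions about values.
response = client.chat.completions.create(
    model="gpt-4",  # illustrative; the study also covered GPT-3, PaLM-2, LLaMA-2 and BLOOM
    messages=[{"role": "user", "content": scenario}],
)

# The generated answer would then be judged by human raters for
# appropriateness and value alignment, alongside human-written responses.
print(response.choices[0].message.content)
```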
Downloads
- DS826.pdf: full text (Accepted manuscript) | open access | 794.48 KB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-01JAFACFW1M9K4TMQD9MQT8X5Q
- MLA
- Abbo, Giulio Antonio, et al. “Social Value Alignment in Large Language Models.” VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, vol. 14520, Springer Cham, 2024, pp. 83–97, doi:10.1007/978-3-031-58202-8_6.
- APA
- Abbo, G. A., Marchesi, S., Wykowska, A., & Belpaeme, T. (2024). Social value alignment in large language models. In N. Osman & L. Steels (Eds.), VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023 (Vol. 14520, pp. 83–97). https://doi.org/10.1007/978-3-031-58202-8_6
- Chicago author-date
- Abbo, Giulio Antonio, Serena Marchesi, Agnieszka Wykowska, and Tony Belpaeme. 2024. “Social Value Alignment in Large Language Models.” In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, 14520:83–97. Springer Cham. https://doi.org/10.1007/978-3-031-58202-8_6.
- Chicago author-date (all authors)
- Abbo, Giulio Antonio, Serena Marchesi, Agnieszka Wykowska, and Tony Belpaeme. 2024. “Social Value Alignment in Large Language Models.” In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, 14520:83–97. Springer Cham. doi:10.1007/978-3-031-58202-8_6.
- Vancouver
- 1. Abbo GA, Marchesi S, Wykowska A, Belpaeme T. Social value alignment in large language models. In: Osman N, Steels L, editors. VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023. Springer Cham; 2024. p. 83–97.
- IEEE
- [1] G. A. Abbo, S. Marchesi, A. Wykowska, and T. Belpaeme, “Social value alignment in large language models,” in VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, Krakow, Poland, 2024, vol. 14520, pp. 83–97.
@inproceedings{01JAFACFW1M9K4TMQD9MQT8X5Q,
abstract = {{Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we look into the capabilities of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the models and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, GPT-4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this aspect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters faced difficulty in distinguishing between responses generated by LLMs and those by humans, with raters exhibiting a preference for machine-generated responses in certain cases. These findings shed light on the capabilities of state-of-the-art LLMs to align with human values, but also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.}},
author = {{Abbo, Giulio Antonio and Marchesi, Serena and Wykowska, Agnieszka and Belpaeme, Tony}},
booktitle = {{VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023}},
editor = {{Osman, N and Steels, L}},
isbn = {{9783031582042}},
issn = {{0302-9743}},
keywords = {{MIND,Values,Large Language Models,LLM,Alignment}},
language = {{eng}},
location = {{Krakow, Poland}},
pages = {{83--97}},
publisher = {{Springer Cham}},
title = {{Social value alignment in large language models}},
url = {{https://doi.org/10.1007/978-3-031-58202-8_6}},
volume = {{14520}},
year = {{2024}},
}