Social value alignment in large language models
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we look into the capabilities of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the model and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, 4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this aspect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters faced difficulty in distinguishing between responses generated by LLMs and those by humans, with raters exhibiting a preference for machine-generated responses in certain cases. These findings shed light on the capabilities of state-of-the-art LLMs to align with human values, but also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.
Keywords
MIND, Values, Large Language Models, LLM, Alignment

Downloads

  • DS826.pdf: full text (Accepted manuscript) | open access | PDF | 794.48 KB

Citation

Please use this URL to cite or link to this publication:

MLA
Abbo, Giulio Antonio, et al. “Social Value Alignment in Large Language Models.” VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, vol. 14520, Springer Cham, 2024, pp. 83–97, doi:10.1007/978-3-031-58202-8_6.
APA
Abbo, G. A., Marchesi, S., Wykowska, A., & Belpaeme, T. (2024). Social value alignment in large language models. In N. Osman & L. Steels (Eds.), VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023 (Vol. 14520, pp. 83–97). https://doi.org/10.1007/978-3-031-58202-8_6
Chicago author-date
Abbo, Giulio Antonio, Serena Marchesi, Agnieszka Wykowska, and Tony Belpaeme. 2024. “Social Value Alignment in Large Language Models.” In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, 14520:83–97. Springer Cham. https://doi.org/10.1007/978-3-031-58202-8_6.
Chicago author-date (all authors)
Abbo, Giulio Antonio, Serena Marchesi, Agnieszka Wykowska, and Tony Belpaeme. 2024. “Social Value Alignment in Large Language Models.” In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, edited by N Osman and L Steels, 14520:83–97. Springer Cham. doi:10.1007/978-3-031-58202-8_6.
Vancouver
1. Abbo GA, Marchesi S, Wykowska A, Belpaeme T. Social value alignment in large language models. In: Osman N, Steels L, editors. VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023. Springer Cham; 2024. p. 83–97.
IEEE
[1] G. A. Abbo, S. Marchesi, A. Wykowska, and T. Belpaeme, “Social value alignment in large language models,” in VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023, Krakow, Poland, 2024, vol. 14520, pp. 83–97.
@inproceedings{01JAFACFW1M9K4TMQD9MQT8X5Q,
  abstract     = {{Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we look into the capabilities of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the model and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, 4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this aspect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters faced difficulty in distinguishing between responses generated by LLMs and those by humans, with raters exhibiting a preference for machine-generated responses in certain cases. These findings shed light on the capabilities of state-of-the-art LLMs to align with human values, but also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.}},
  author       = {{Abbo, Giulio Antonio and Marchesi, Serena and Wykowska, Agnieszka and Belpaeme, Tony}},
  booktitle    = {{VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023}},
  editor       = {{Osman, N and Steels, L}},
  isbn         = {{9783031582042}},
  issn         = {{0302-9743}},
  keywords     = {{MIND,Values,Large Language Models,LLM,Alignment}},
  language     = {{eng}},
  location     = {{Krakow, Poland}},
  pages        = {{83--97}},
  publisher    = {{Springer Cham}},
  title        = {{Social value alignment in large language models}},
  url          = {{http://doi.org/10.1007/978-3-031-58202-8_6}},
  volume       = {{14520}},
  year         = {{2024}},
}
