
A million tweets are worth a few points : tuning transformers for customer service tasks

Amir Hadifar (UGent) , Sofie Labat (UGent) , Veronique Hoste (UGent) , Chris Develder (UGent) and Thomas Demeester (UGent)
Abstract
In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, applying them on 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings.
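The pipeline the abstract describes first continues masked-language-model (MLM) pretraining of a generic multilingual transformer on the in-domain tweet corpus, and only then finetunes on each end task. As a rough illustration of the corruption step at the heart of BERT-style MLM pretraining, here is a minimal plain-Python sketch; the 15% selection rate and the 80/10/10 replacement scheme are the standard BERT defaults, not values taken from this paper, and the token list is a made-up example.

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", vocab=None, mlm_prob=0.15, rng=None):
    """BERT-style dynamic masking: select ~15% of positions; of those,
    80% become [MASK], 10% become a random vocabulary token, and 10%
    stay unchanged. Returns (corrupted, labels), where labels holds the
    original token at each selected position and None elsewhere (only
    selected positions contribute to the MLM loss)."""
    rng = rng or random.Random()
    vocab = vocab or tokens  # fallback vocabulary for random replacement
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mlm_prob:
            labels.append(tok)  # model must reconstruct this token
            r = rng.random()
            if r < 0.8:
                corrupted.append(mask_token)      # 80%: mask
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)             # 10%: keep as-is
        else:
            labels.append(None)  # position not scored in the loss
            corrupted.append(tok)
    return corrupted, labels

tokens = "my train to brussels is delayed again".split()
corrupted, labels = mask_tokens(tokens, rng=random.Random(0))
```

Because the masking is re-sampled on every pass, the model sees different corruptions of the same tweet across epochs, which is what makes continued in-domain pretraining data-efficient even on a modest corpus.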

Downloads

  • Hadifar 2021.naacl-main.21.pdf: full text (Published version), open access, PDF, 268.45 KB

Citation

MLA
Hadifar, Amir, et al. “A Million Tweets Are Worth a Few Points : Tuning Transformers for Customer Service Tasks.” 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), edited by Kristina Toutanova et al., 2021, pp. 220–25, doi:10.18653/v1/2021.naacl-main.21.
APA
Hadifar, A., Labat, S., Hoste, V., Develder, C., & Demeester, T. (2021). A million tweets are worth a few points : tuning transformers for customer service tasks. In K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy, S. Bethard, … Y. Zhou (Eds.), 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021) (pp. 220–225). https://doi.org/10.18653/v1/2021.naacl-main.21
Chicago author-date
Hadifar, Amir, Sofie Labat, Veronique Hoste, Chris Develder, and Thomas Demeester. 2021. “A Million Tweets Are Worth a Few Points : Tuning Transformers for Customer Service Tasks.” In 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), edited by Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou, 220–25. https://doi.org/10.18653/v1/2021.naacl-main.21.
Chicago author-date (all authors)
Hadifar, Amir, Sofie Labat, Veronique Hoste, Chris Develder, and Thomas Demeester. 2021. “A Million Tweets Are Worth a Few Points : Tuning Transformers for Customer Service Tasks.” In 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), edited by Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou, 220–225. doi:10.18653/v1/2021.naacl-main.21.
Vancouver
1. Hadifar A, Labat S, Hoste V, Develder C, Demeester T. A million tweets are worth a few points : tuning transformers for customer service tasks. In: Toutanova K, Rumshisky A, Zettlemoyer L, Hakkani-Tur D, Beltagy I, Bethard S, et al., editors. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021). 2021. p. 220–5.
IEEE
[1] A. Hadifar, S. Labat, V. Hoste, C. Develder, and T. Demeester, “A million tweets are worth a few points : tuning transformers for customer service tasks,” in 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), Online, 2021, pp. 220–225.
@inproceedings{8708982,
  abstract     = {{In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, applying them on 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings.}},
  author       = {{Hadifar, Amir and Labat, Sofie and Hoste, Veronique and Develder, Chris and Demeester, Thomas}},
  booktitle    = {{2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021)}},
  editor       = {{Toutanova, Kristina and Rumshisky, Anna and Zettlemoyer, Luke and Hakkani-Tur, Dilek and Beltagy, Iz and Bethard, Steven and Cotterell, Ryan and Chakraborty, Tanmoy and Zhou, Yichao}},
  isbn         = {{9781954085466}},
  language     = {{eng}},
  location     = {{Online}},
  pages        = {{220--225}},
  title        = {{A million tweets are worth a few points : tuning transformers for customer service tasks}},
  url          = {{https://doi.org/10.18653/v1/2021.naacl-main.21}},
  year         = {{2021}},
}
