
LETS : a label-efficient training scheme for aspect-based sentiment analysis by using a pre-trained language model

(2021) IEEE ACCESS. 9. p.115563-115578
Author
Heereen Shim, Dietwig Lowet, Stijn Luca, and Bart Vanrumste
Abstract
Recently proposed pre-trained language models can be easily fine-tuned to a wide range of downstream tasks. However, fine-tuning requires a large-scale labelled task-specific dataset, which creates a bottleneck in the development of machine learning applications. To speed up development by reducing manual labelling effort, we propose a Label-Efficient Training Scheme (LETS). LETS consists of three elements: (i) task-specific pre-training to exploit unlabelled task-specific corpus data, (ii) label augmentation to maximise the utility of labelled data, and (iii) active learning to label data strategically. In this paper, we apply LETS to a novel aspect-based sentiment analysis (ABSA) use-case: analysing reviews of a health-related program that supports people in improving their sleep quality. We validate LETS on a custom health-related program-reviews dataset and another ABSA benchmark dataset. Experimental results show that LETS can reduce manual labelling effort by a factor of 2-3 compared to labelling with random sampling on both datasets, and that LETS outperforms other state-of-the-art active learning methods. Furthermore, LETS achieves better generalisability on both datasets than the other methods, thanks to the task-specific pre-training and the proposed label augmentation. We expect this work to contribute to the natural language processing (NLP) domain by addressing the high cost of manually labelling data, and to the healthcare domain by introducing a new potential application of NLP techniques.
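Of the three elements listed in the abstract, the active-learning step (strategically choosing which examples to label) is the most procedural. Below is a minimal sketch of one uncertainty-sampling selection round, a common active-learning baseline; the `predict_proba` placeholder, the three-class setup, and the budget value are illustrative assumptions, not the authors' implementation, which fine-tunes a pre-trained language model for aspect-based sentiment analysis.

```python
# Minimal sketch of one active-learning round with uncertainty sampling.
# The scoring model is a random placeholder; in the paper's setting it
# would be a fine-tuned pre-trained language model (assumed, not shown).
import numpy as np


def predict_proba(texts):
    # Placeholder: return per-class probabilities for each review sentence.
    rng = np.random.default_rng(0)
    probs = rng.random((len(texts), 3))  # 3 illustrative sentiment classes
    return probs / probs.sum(axis=1, keepdims=True)


def select_for_labelling(unlabelled, budget=8):
    """Pick the `budget` most uncertain examples (highest predictive entropy)."""
    probs = predict_proba(unlabelled)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    ranked = np.argsort(-entropy)  # most uncertain first
    return [unlabelled[i] for i in ranked[:budget]]


if __name__ == "__main__":
    pool = [f"review sentence {i}" for i in range(100)]  # unlabelled pool
    to_label = select_for_labelling(pool)
    print(to_label)  # hand these to annotators, retrain, then repeat
```

In a full loop, the newly labelled examples would be added to the training set, the model re-fine-tuned, and the selection repeated until the labelling budget is spent.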
Keywords
General Engineering, General Materials Science, General Computer Science, Task analysis, Labeling, Uncertainty, Sentiment analysis, Natural language processing, Data models, Training, Active learning, machine learning, natural language processing, neural networks, sentiment analysis

Downloads

  • Published.pdf: full text (Published version), open access, PDF, 3.33 MB

Citation

Please use this URL to cite or link to this publication:

MLA
Shim, Heereen, et al. “LETS : A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model.” IEEE ACCESS, vol. 9, 2021, pp. 115563–78, doi:10.1109/access.2021.3101867.
APA
Shim, H., Lowet, D., Luca, S., & Vanrumste, B. (2021). LETS : a label-efficient training scheme for aspect-based sentiment analysis by using a pre-trained language model. IEEE ACCESS, 9, 115563–115578. https://doi.org/10.1109/access.2021.3101867
Chicago author-date
Shim, Heereen, Dietwig Lowet, Stijn Luca, and Bart Vanrumste. 2021. “LETS : A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model.” IEEE ACCESS 9: 115563–78. https://doi.org/10.1109/access.2021.3101867.
Chicago author-date (all authors)
Shim, Heereen, Dietwig Lowet, Stijn Luca, and Bart Vanrumste. 2021. “LETS : A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model.” IEEE ACCESS 9: 115563–115578. doi:10.1109/access.2021.3101867.
Vancouver
1. Shim H, Lowet D, Luca S, Vanrumste B. LETS : a label-efficient training scheme for aspect-based sentiment analysis by using a pre-trained language model. IEEE ACCESS. 2021;9:115563–78.
IEEE
[1] H. Shim, D. Lowet, S. Luca, and B. Vanrumste, “LETS : a label-efficient training scheme for aspect-based sentiment analysis by using a pre-trained language model,” IEEE ACCESS, vol. 9, pp. 115563–115578, 2021.
@article{8717226,
  abstract     = {{Recently proposed pre-trained language models can be easily fine-tuned to a wide range of downstream tasks. However, a large-scale labelled task-specific dataset is required for fine-tuning creating a bottleneck in the development process of machine learning applications. To foster a fast development by reducing manual labelling efforts, we propose a Label-Efficient Training Scheme (LETS). The proposed LETS consists of three elements: (i) task-specific pre-training to exploit unlabelled task-specific corpus data, (ii) label augmentation to maximise the utility of labelled data, and (iii) active learning to label data strategically. In this paper, we apply LETS to a novel aspect-based sentiment analysis (ABSA) use-case for analysing the reviews of the health-related program supporting people to improve their sleep quality. We validate the proposed LETS on a custom health-related program-reviews dataset and another ABSA benchmark dataset. Experimental results show that the LETS can reduce manual labelling efforts 2-3 times compared to labelling with random sampling on both datasets. The LETS also outperforms other state-of-the-art active learning methods. Furthermore, the experimental results show that LETS can contribute to better generalisability with both datasets compared to other methods thanks to the task-specific pre-training and the proposed label augmentation. We expect this work could contribute to the natural language processing (NLP) domain by addressing the issue of the high cost of manually labelling data. Also, our work could contribute to the healthcare domain by introducing a new potential application of NLP techniques.}},
  author       = {{Shim, Heereen and Lowet, Dietwig and Luca, Stijn and Vanrumste, Bart}},
  issn         = {{2169-3536}},
  journal      = {{IEEE ACCESS}},
  keywords     = {{General Engineering,General Materials Science,General Computer Science,Task analysis,Labeling,Uncertainty,Sentiment analysis,Natural language processing,Data models,Training,Active learning,machine learning,natural language processing,neural networks,sentiment analysis}},
  language     = {{eng}},
  pages        = {{115563--115578}},
  title        = {{LETS : a label-efficient training scheme for aspect-based sentiment analysis by using a pre-trained language model}},
  url          = {{http://dx.doi.org/10.1109/access.2021.3101867}},
  volume       = {{9}},
  year         = {{2021}},
}
