Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data

Abstract
In the past decade, sentiment analysis research has thrived, especially on social media. While this data genre is suitable to extract opinions and sentiment, it is known to be noisy. Complex normalisation methods have been developed to transform noisy text into its standard form, but their effect on tasks like sentiment analysis remains underinvestigated. Sentiment analysis approaches mostly include spell checking or rule-based normalisation as preprocessing and rarely investigate its impact on the task performance. We present an optimised sentiment classifier and investigate to what extent its performance can be enhanced by integrating SMT-based normalisation as preprocessing. Experiments on a test set comprising a variety of user-generated content genres revealed that normalisation improves sentiment classification performance on tweets and blog posts, showing the model’s ability to generalise to other data genres.
Keywords
LT3

Downloads

  • TAL58-1 NoiseOrMusic VanHee-et-al.pdf
    • full text | open access | PDF | 285.06 KB

Citation

MLA
Van Hee, Cynthia, et al. “Noise or Music? Investigating the Usefulness of Normalisation for Robust Sentiment Analysis on Social Media Data.” TRAITEMENT AUTOMATIQUE DES LANGUES, vol. 58, no. 1, 2017, pp. 63–87.
APA
Van Hee, C., Van de Kauter, M., De Clercq, O., Lefever, E., Desmet, B., & Hoste, V. (2017). Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data. TRAITEMENT AUTOMATIQUE DES LANGUES, 58(1), 63–87.
Chicago author-date
Van Hee, Cynthia, Marjan Van de Kauter, Orphée De Clercq, Els Lefever, Bart Desmet, and Veronique Hoste. 2017. “Noise or Music? Investigating the Usefulness of Normalisation for Robust Sentiment Analysis on Social Media Data.” TRAITEMENT AUTOMATIQUE DES LANGUES 58 (1): 63–87.
Vancouver
1. Van Hee C, Van de Kauter M, De Clercq O, Lefever E, Desmet B, Hoste V. Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data. TRAITEMENT AUTOMATIQUE DES LANGUES. 2017;58(1):63–87.
IEEE
[1] C. Van Hee, M. Van de Kauter, O. De Clercq, E. Lefever, B. Desmet, and V. Hoste, “Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data,” TRAITEMENT AUTOMATIQUE DES LANGUES, vol. 58, no. 1, pp. 63–87, 2017.
BibTeX
@article{8548017,
  abstract     = {{In the past decade, sentiment analysis research has thrived, especially on social media. While this data genre is suitable to extract opinions and sentiment, it is known to be noisy. Complex normalisation methods have been developed to transform noisy text into its standard form, but their effect on tasks like sentiment analysis remains underinvestigated. Sentiment analysis approaches mostly include spell checking or rule-based normalisation as preprocessing and rarely investigate its impact on the task performance. We present an optimised sentiment classifier and investigate to what extent its performance can be enhanced by integrating SMT-based normalisation as preprocessing. Experiments on a test set comprising a variety of user-generated content genres revealed that normalisation improves sentiment classification performance on tweets and blog posts, showing the model’s ability to generalise to other data genres.}},
  author       = {{Van Hee, Cynthia and Van de Kauter, Marjan and De Clercq, Orphée and Lefever, Els and Desmet, Bart and Hoste, Veronique}},
  issn         = {{1248-9433}},
  journal      = {{TRAITEMENT AUTOMATIQUE DES LANGUES}},
  keywords     = {{LT3}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{63--87}},
  title        = {{Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data}},
  volume       = {{58}},
  year         = {{2017}},
}