
Benchmark study for Flemish twitter sentiment analysis

Abstract
Microblogging websites such as Twitter have caused sentiment analysis research to increase in popularity over the last several decades. However, most studies focus on the English language, which leaves other languages underrepresented. Therefore, we conduct a benchmark study on several modeling techniques for sentiment analysis with a new dataset containing Flemish tweets. The key contribution of our paper lies in its innovative experimental design, where we compare different preprocessing techniques and vector representations to find the best-performing combination for a Flemish dataset. We compare models belonging to four different categories: lexicon-based methods, traditional machine learning models, neural networks, and attention-based models. We find that more preprocessing leads to better results, but the best-performing vector representation approach depends on the applied model. Moreover, an immense gap is observed between the performance of the lexicon-based approaches and that of the other models. The traditional machine learning approaches and the neural networks produce similar results, but the attention-based model is the best-performing technique. Nevertheless, a tradeoff should be made between computational expenses and performance gains.
Keywords
Pharmacology (medical)

Citation

MLA
Reusens, Manon, et al. “Benchmark Study for Flemish Twitter Sentiment Analysis.” SOCIAL SCIENCE RESEARCH NETWORK, 2022, doi:10.2139/ssrn.4096559.
APA
Reusens, M., Reusens, M., Callens, M., vanden Broucke, S., & Baesens, B. (2022). Benchmark study for Flemish twitter sentiment analysis. https://doi.org/10.2139/ssrn.4096559
Chicago author-date
Reusens, Manon, Michael Reusens, Marc Callens, Seppe vanden Broucke, and Bart Baesens. 2022. “Benchmark Study for Flemish Twitter Sentiment Analysis.” SOCIAL SCIENCE RESEARCH NETWORK. https://doi.org/10.2139/ssrn.4096559.
Vancouver
1.
Reusens M, Reusens M, Callens M, vanden Broucke S, Baesens B. Benchmark study for Flemish twitter sentiment analysis. SOCIAL SCIENCE RESEARCH NETWORK. 2022.
IEEE
[1]
M. Reusens, M. Reusens, M. Callens, S. vanden Broucke, and B. Baesens, “Benchmark study for Flemish twitter sentiment analysis,” SOCIAL SCIENCE RESEARCH NETWORK. 2022.
@misc{8754364,
  author       = {{Reusens, Manon and Reusens, Michael and Callens, Marc and vanden Broucke, Seppe and Baesens, Bart}},
  issn         = {{1556-5068}},
  keywords     = {{Pharmacology (medical)}},
  language     = {{eng}},
  pages        = {{39}},
  series       = {{SOCIAL SCIENCE RESEARCH NETWORK}},
  title        = {{Benchmark study for Flemish twitter sentiment analysis}},
  url          = {{http://doi.org/10.2139/ssrn.4096559}},
  year         = {{2022}},
}
