
Current limitations in cyberbullying detection: on evaluation criteria, reproducibility, and data scarcity

Abstract
The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field.
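The cross-domain evaluation and data-enrichment ideas summarized above can be illustrated with a minimal sketch: train a text classifier on one cyberbullying corpus, score it on another, and optionally concatenate simulated (crowdsourced) examples with the scarce real training data. The tf-idf plus linear SVM pipeline and all dataset names below are illustrative assumptions only, not the setup of the openly provided framework the abstract refers to.

# Minimal sketch (assumptions, not the authors' published pipeline):
# cross-domain evaluation trains on one corpus and tests on another;
# enrichment adds simulated crowdsourced data to the real training set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def cross_domain_f1(train_texts, train_labels, test_texts, test_labels):
    """Train in one domain, evaluate in another; return positive-class F1."""
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    clf.fit(train_texts, train_labels)
    return f1_score(test_labels, clf.predict(test_texts))

def enriched_f1(real_texts, real_labels, sim_texts, sim_labels,
                test_texts, test_labels):
    """Enrich real training data with simulated data, then evaluate."""
    # Inputs are plain Python lists of strings and binary labels.
    return cross_domain_f1(real_texts + sim_texts, real_labels + sim_labels,
                           test_texts, test_labels)

# Hypothetical usage with two platform-specific corpora:
# f1 = cross_domain_f1(askfm_texts, askfm_labels, twitter_texts, twitter_labels)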
Keywords
Cyberbullying detection, Cross-domain evaluation, Reproducibility, Crowdsourcing, Data enrichment, lt3

Downloads

  • published article.pdf: full text (Published version) | open access | PDF | 911.26 KB

Citation

Please use this URL to cite or link to this publication:

MLA
Emmery, Chris, et al. “Current Limitations in Cyberbullying Detection: On Evaluation Criteria, Reproducibility, and Data Scarcity.” LANGUAGE RESOURCES AND EVALUATION, vol. 55, no. 3, 2021, pp. 597–633, doi:10.1007/s10579-020-09509-1.
APA
Emmery, C., Verhoeven, B., De Pauw, G., Jacobs, G., Van Hee, C., Lefever, E., … Daelemans, W. (2021). Current limitations in cyberbullying detection: on evaluation criteria, reproducibility, and data scarcity. LANGUAGE RESOURCES AND EVALUATION, 55(3), 597–633. https://doi.org/10.1007/s10579-020-09509-1
Chicago author-date
Emmery, Chris, Ben Verhoeven, Guy De Pauw, Gilles Jacobs, Cynthia Van Hee, Els Lefever, Bart Desmet, Veronique Hoste, and Walter Daelemans. 2021. “Current Limitations in Cyberbullying Detection: On Evaluation Criteria, Reproducibility, and Data Scarcity.” LANGUAGE RESOURCES AND EVALUATION 55 (3): 597–633. https://doi.org/10.1007/s10579-020-09509-1.
Chicago author-date (all authors)
Emmery, Chris, Ben Verhoeven, Guy De Pauw, Gilles Jacobs, Cynthia Van Hee, Els Lefever, Bart Desmet, Veronique Hoste, and Walter Daelemans. 2021. “Current Limitations in Cyberbullying Detection: On Evaluation Criteria, Reproducibility, and Data Scarcity.” LANGUAGE RESOURCES AND EVALUATION 55 (3): 597–633. doi:10.1007/s10579-020-09509-1.
Vancouver
1. Emmery C, Verhoeven B, De Pauw G, Jacobs G, Van Hee C, Lefever E, et al. Current limitations in cyberbullying detection: on evaluation criteria, reproducibility, and data scarcity. LANGUAGE RESOURCES AND EVALUATION. 2021;55(3):597–633.
IEEE
[1] C. Emmery et al., “Current limitations in cyberbullying detection: on evaluation criteria, reproducibility, and data scarcity,” LANGUAGE RESOURCES AND EVALUATION, vol. 55, no. 3, pp. 597–633, 2021.
@article{8680974,
  abstract     = {{The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field.}},
  author       = {{Emmery, Chris and Verhoeven, Ben and De Pauw, Guy and Jacobs, Gilles and Van Hee, Cynthia and Lefever, Els and Desmet, Bart and Hoste, Veronique and Daelemans, Walter}},
  issn         = {{1574-020X}},
  journal      = {{LANGUAGE RESOURCES AND EVALUATION}},
  keywords     = {{Cyberbullying detection,Cross-domain evaluation,Reproducibility,Crowdsourcing,Data enrichment,lt3}},
  language     = {{eng}},
  number       = {{3}},
  pages        = {{597--633}},
  title        = {{Current limitations in cyberbullying detection: on evaluation criteria, reproducibility, and data scarcity}},
  url          = {{http://dx.doi.org/10.1007/s10579-020-09509-1}},
  volume       = {{55}},
  year         = {{2021}},
}
