
The GW/LT3 VarDial 2016 shared task system for dialects and similar languages detection

Abstract
This paper describes the GW/LT3 contribution to the 2016 VarDial shared task on the identification of similar languages (task 1) and Arabic dialects (task 2). For both tasks, we experimented with Logistic Regression and Neural Network classifiers in isolation. Additionally, we implemented a cascaded classifier that consists of coarse and fine-grained classifiers (task 1) and a classifier ensemble with majority voting for task 2. The submitted systems obtained state-of-the-art performance and ranked first for the evaluation on social media data (test sets B1 and B2 for task 1), with a maximum weighted F1 score of 91.94%.
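The two combination strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule, and the toy labels below are assumptions made for illustration only.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# A classifier is modelled here as any function from text to a label.
Classifier = Callable[[str], str]


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most frequent label among the ensemble's predictions.

    Assumption: ties are broken in favour of the label that appears
    first in the prediction list.
    """
    if not predictions:
        raise ValueError("no predictions given")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def cascade(text: str, coarse: Classifier,
            fine_by_group: Dict[str, Classifier]) -> str:
    """Predict a coarse group first, then refine within that group.

    Assumption: if no fine-grained classifier is registered for the
    predicted group, the coarse label itself is returned.
    """
    group = coarse(text)
    fine = fine_by_group.get(group)
    return fine(text) if fine is not None else group


# Toy usage with hypothetical labels and rule-based stand-in classifiers.
votes = ["EGY", "GLF", "EGY"]
print(majority_vote(votes))

coarse = lambda t: "group_a" if "je" in t.split() else "other"
fine = {"group_a": lambda t: "variety_1"}
print(cascade("to je dobro", coarse, fine))
```

In practice the stand-in functions would be replaced by trained models, such as the logistic regression and neural network classifiers the abstract mentions.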
Keywords
dialect, language identification, machine learning, natural language processing

Downloads

  • VarDial304.pdf (full text, open access, PDF, 460.39 KB)

Citation

MLA
Zirikly, Ayah, et al. “The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection.” Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), edited by Preslav Nakov et al., The COLING 2016 Organizing Committee, 2016, pp. 33–41.
APA
Zirikly, A., Desmet, B., & Diab, M. (2016). The GW/LT3 VarDial 2016 shared task system for dialects and similar languages detection. In P. Nakov, M. Zampieri, L. Tan, N. Ljubešić, J. Tiedemann, & S. Malmasi (Eds.), Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3) (pp. 33–41). Osaka, Japan: The COLING 2016 Organizing Committee.
Chicago author-date
Zirikly, Ayah, Bart Desmet, and Mona Diab. 2016. “The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection.” In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), edited by Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jørg Tiedemann, and Shervin Malmasi, 33–41. Osaka, Japan: The COLING 2016 Organizing Committee.
Chicago author-date (all authors)
Zirikly, Ayah, Bart Desmet, and Mona Diab. 2016. “The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection.” In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), edited by Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jørg Tiedemann, and Shervin Malmasi, 33–41. Osaka, Japan: The COLING 2016 Organizing Committee.
Vancouver
1. Zirikly A, Desmet B, Diab M. The GW/LT3 VarDial 2016 shared task system for dialects and similar languages detection. In: Nakov P, Zampieri M, Tan L, Ljubešić N, Tiedemann J, Malmasi S, editors. Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3). Osaka, Japan: The COLING 2016 Organizing Committee; 2016. p. 33–41.
IEEE
[1] A. Zirikly, B. Desmet, and M. Diab, “The GW/LT3 VarDial 2016 shared task system for dialects and similar languages detection,” in Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), Osaka, Japan, 2016, pp. 33–41.
BibTeX
@inproceedings{8131951,
  abstract     = {{This paper describes the GW/LT3 contribution to the 2016 VarDial shared task on the identification of similar languages (task 1) and Arabic dialects (task 2). For both tasks, we experimented with Logistic Regression and Neural Network classifiers in isolation. Additionally, we implemented a cascaded classifier that consists of coarse and fine-grained classifiers (task 1) and a classifier ensemble with majority voting for task 2. The submitted systems obtained state-of-the-art performance and ranked first for the evaluation on social media data (test sets B1 and B2 for task 1), with a maximum weighted F1 score of 91.94%.}},
  author       = {{Zirikly, Ayah and Desmet, Bart and Diab, Mona}},
  booktitle    = {{Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)}},
  editor       = {{Nakov, Preslav and Zampieri, Marcos and Tan, Liling and Ljubešić, Nikola and Tiedemann, Jørg and Malmasi, Shervin}},
  isbn         = {{9784879747167}},
  keywords     = {{dialect,language identification,machine learning,natural language processing}},
  language     = {{eng}},
  location     = {{Osaka, Japan}},
  pages        = {{33--41}},
  publisher    = {{The COLING 2016 Organizing Committee}},
  title        = {{The GW/LT3 VarDial 2016 shared task system for dialects and similar languages detection}},
  url          = {{http://aclweb.org/anthology/W16-4804}},
  year         = {{2016}},
}