
Evaluating existing lemmatisers on unedited Byzantine Greek poetry

Colin Swaelens (UGent), Ilse De Vos (UGent) and Els Lefever (UGent)
Abstract
This paper reports on the results of a comparative evaluation of four existing lemmatizers, all pre-trained on Ancient Greek texts, on a novel corpus of unedited, Byzantine Greek texts. The aim of this study is to get insights into the pitfalls of existing lemmatisation approaches as well as the specific challenges of our Byzantine Greek corpus, in order to develop a new lemmatizer that can cope with its peculiarities. The results of the experiment show an accuracy drop of 20% on our corpus, which is further investigated in a qualitative error analysis.
Keywords
Natural Language Processing

Downloads

  • 2023 Anderson-et-alii Proceedings-ALP-RANLP pp111-116.pdf: full text (published version), open access, PDF, 225.47 KB

Citation

MLA
Swaelens, Colin, et al. “Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry.” Proceedings of the Ancient Language Processing Workshop, edited by Adam Anderson et al., 2023, pp. 111–16.
APA
Swaelens, C., De Vos, I., & Lefever, E. (2023). Evaluating existing lemmatisers on unedited Byzantine Greek poetry. In A. Anderson, S. Gordin, S. Klein, B. Li, Y. Liu, & M. Passarotti (Eds.), Proceedings of the Ancient Language Processing Workshop (pp. 111–116).
Chicago author-date
Swaelens, Colin, Ilse De Vos, and Els Lefever. 2023. “Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry.” In Proceedings of the Ancient Language Processing Workshop, edited by Adam Anderson, Shai Gordin, Stav Klein, Bin Li, Yudong Liu, and Marco Passarotti, 111–16.
Chicago author-date (all authors)
Swaelens, Colin, Ilse De Vos, and Els Lefever. 2023. “Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry.” In Proceedings of the Ancient Language Processing Workshop, edited by Adam Anderson, Shai Gordin, Stav Klein, Bin Li, Yudong Liu, and Marco Passarotti, 111–116.
Vancouver
1. Swaelens C, De Vos I, Lefever E. Evaluating existing lemmatisers on unedited Byzantine Greek poetry. In: Anderson A, Gordin S, Klein S, Li B, Liu Y, Passarotti M, editors. Proceedings of the Ancient Language Processing Workshop. 2023. p. 111–6.
IEEE
[1] C. Swaelens, I. De Vos, and E. Lefever, “Evaluating existing lemmatisers on unedited Byzantine Greek poetry,” in Proceedings of the Ancient Language Processing Workshop, Varna, Bulgaria, 2023, pp. 111–116.
@inproceedings{01HCG3FFMQEQP41BP78DADT1AQ,
  abstract     = {{This paper reports on the results of a comparative evaluation of four existing lemmatizers, all pre-trained on Ancient Greek texts, on a novel corpus of unedited, Byzantine Greek texts. The aim of this study is to get insights into the pitfalls of existing lemmatisation approaches as well as the specific challenges of our Byzantine Greek corpus, in order to develop a new lemmatizer that can cope with its peculiarities. The results of the experiment show an accuracy drop of 20% on our corpus, which is further investigated in a qualitative error analysis.}},
  author       = {{Swaelens, Colin and De Vos, Ilse and Lefever, Els}},
  booktitle    = {{Proceedings of the Ancient Language Processing Workshop}},
  editor       = {{Anderson, Adam and Gordin, Shai and Klein, Stav and Li, Bin and Liu, Yudong and Passarotti, Marco}},
  isbn         = {{9789544520878}},
  keywords     = {{Natural Language Processing}},
  language     = {{eng}},
  location     = {{Varna, Bulgaria}},
  pages        = {{111--116}},
  title        = {{Evaluating existing lemmatisers on unedited Byzantine Greek poetry}},
  url          = {{https://zenodo.org/doi/10.5281/zenodo.8337363}},
  year         = {{2023}},
}