Linguistic annotation of Byzantine book epigrams

Colin Swaelens (UGent), Ilse De Vos (UGent) and Els Lefever (UGent)
Abstract
In this paper, we explore the feasibility of developing a part-of-speech tagger for non-normalised Byzantine Greek epigrams. To this end, we compared three different transformer-based models with embedding representations, each fine-tuned on a fine-grained part-of-speech tagging task. To train the language models, we compiled two data sets: the first consisting of Ancient and Byzantine Greek texts, the second of Ancient, Byzantine and Modern Greek texts. This allowed us to ascertain whether Modern Greek contributes to the modelling of Byzantine Greek. For the supervised part-of-speech tagging task, we collected a training set of existing annotated (Ancient) Greek texts. For evaluation, a gold standard of 10,000 tokens of unedited Byzantine Greek poems was manually annotated and validated through an inter-annotator agreement study. The experimental results are promising: the BERT model trained on all Greek data achieves the best performance for fine-grained part-of-speech tagging.
Keywords
Byzantine Greek, Part-of-speech tagging, Morphological analysis, Computational linguistics, Natural language processing, Machine learning, Neural networks, Language models

Downloads

  • (...).pdf: full text (Published version) | UGent only | PDF | 1.28 MB

Citation

MLA
Swaelens, Colin, et al. “Linguistic Annotation of Byzantine Book Epigrams.” LANGUAGE RESOURCES AND EVALUATION, 2024, doi:10.1007/s10579-023-09703-x.
APA
Swaelens, C., De Vos, I., & Lefever, E. (2024). Linguistic annotation of Byzantine book epigrams. LANGUAGE RESOURCES AND EVALUATION. https://doi.org/10.1007/s10579-023-09703-x
Chicago author-date
Swaelens, Colin, Ilse De Vos, and Els Lefever. 2024. “Linguistic Annotation of Byzantine Book Epigrams.” LANGUAGE RESOURCES AND EVALUATION. https://doi.org/10.1007/s10579-023-09703-x.
Chicago author-date (all authors)
Swaelens, Colin, Ilse De Vos, and Els Lefever. 2024. “Linguistic Annotation of Byzantine Book Epigrams.” LANGUAGE RESOURCES AND EVALUATION. doi:10.1007/s10579-023-09703-x.
Vancouver
1. Swaelens C, De Vos I, Lefever E. Linguistic annotation of Byzantine book epigrams. LANGUAGE RESOURCES AND EVALUATION. 2024.
IEEE
[1] C. Swaelens, I. De Vos, and E. Lefever, “Linguistic annotation of Byzantine book epigrams,” LANGUAGE RESOURCES AND EVALUATION, 2024.
BibTeX
@article{01HHJ697YYWMESFKMWT0HDTCZP,
  abstract     = {{In this paper, we explore the feasibility of developing a part-of-speech tagger for not-normalised, Byzantine Greek epigrams. Hence, we compared three different transformer-based models with embedding representations, which are then fine-tuned on a fine-grained part-of-speech tagging task. To train the language models, we compiled two data sets: the first consisting of Ancient and Byzantine Greek texts, the second of Ancient, Byzantine and Modern Greek. This allowed us to ascertain whether Modern Greek contributes to the modelling of Byzantine Greek. For the supervised task of part-of-speech tagging, we collected a training set of existing, annotated (Ancient) Greek texts. For evaluation, a gold standard containing 10,000 tokens of unedited Byzantine Greek poems was manually annotated and validated through an inter-annotator agreement study. The experimental results look very promising, with the BERT model trained on all Greek data achieving the best performance for fine-grained part-of-speech tagging.}},
  author       = {{Swaelens, Colin and De Vos, Ilse and Lefever, Els}},
  issn         = {{1574-020X}},
  journal      = {{LANGUAGE RESOURCES AND EVALUATION}},
  keywords     = {{Byzantine Greek,Part-of-speech tagging,Morphological analysis,Computational linguistics,Natural language processing,Machine learning,Neural networks,Language models}},
  language     = {{eng,gre}},
  pages        = {{26}},
  title        = {{Linguistic annotation of Byzantine book epigrams}},
  url          = {{http://doi.org/10.1007/s10579-023-09703-x}},
  year         = {{2024}},
}
