
GECO-MT : the Ghent Eye-tracking Corpus of Machine Translation

Toon Colman (UGent), Margot Fonteyne (UGent), Joke Daems (UGent), Nicolas Dirix (UGent) and Lieve Macken (UGent)
Abstract
In the present paper, we describe a large corpus of eye movement data, collected during natural reading of a human translation and a machine translation of a full novel. This data set, called GECO-MT (Ghent Eye-tracking Corpus of Machine Translation), expands upon an earlier corpus called GECO (Ghent Eye-tracking Corpus) by Cop et al. (2017). The eye movement data in GECO-MT will be used in future research to investigate the effect of machine translation on the reading process and the effects of various error types on reading. In this article, we describe in detail the materials and data collection procedure of GECO-MT. Extensive information on the language proficiency of our participants is given, as well as a comparison with the participants of the original GECO. We investigate the distribution of a selection of important eye movement variables and explore the possibilities for future analyses of the data. GECO-MT is freely available at https://www.lt3.ugent.be/resources/geco-mt.
Keywords
Machine Translation, quality assessment, reading behaviour, eye-tracking, corpus study, LT3

Downloads

  • 2022.lrec-1.4.pdf
    • full text (Published version) | open access | PDF | 743.36 KB

Citation

MLA
Colman, Toon, et al. “GECO-MT : The Ghent Eye-Tracking Corpus of Machine Translation.” LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, edited by Nicoletta Calzolari et al., European Language Resources Association (ELRA), 2022, pp. 29–38.
APA
Colman, T., Fonteyne, M., Daems, J., Dirix, N., & Macken, L. (2022). GECO-MT : the Ghent Eye-tracking Corpus of Machine Translation. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, … S. Piperidis (Eds.), LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (pp. 29–38). Marseille, France: European Language Resources Association (ELRA).
Chicago author-date
Colman, Toon, Margot Fonteyne, Joke Daems, Nicolas Dirix, and Lieve Macken. 2022. “GECO-MT : The Ghent Eye-Tracking Corpus of Machine Translation.” In LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, edited by Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, et al., 29–38. Marseille, France: European Language Resources Association (ELRA).
Chicago author-date (all authors)
Colman, Toon, Margot Fonteyne, Joke Daems, Nicolas Dirix, and Lieve Macken. 2022. “GECO-MT : The Ghent Eye-Tracking Corpus of Machine Translation.” In LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, edited by Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, and Stelios Piperidis, 29–38. Marseille, France: European Language Resources Association (ELRA).
Vancouver
1. Colman T, Fonteyne M, Daems J, Dirix N, Macken L. GECO-MT : the Ghent Eye-tracking Corpus of Machine Translation. In: Calzolari N, Béchet F, Blache P, Choukri K, Cieri C, Declerck T, et al., editors. LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION. Marseille, France: European Language Resources Association (ELRA); 2022. p. 29–38.
IEEE
[1] T. Colman, M. Fonteyne, J. Daems, N. Dirix, and L. Macken, “GECO-MT : the Ghent Eye-tracking Corpus of Machine Translation,” in LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, Marseille, France, 2022, pp. 29–38.
BibTeX
@inproceedings{8757156,
  abstract     = {{In the present paper, we describe a large corpus of eye movement data, collected during natural reading of a human translation and a machine translation of a full novel. This data set, called GECO-MT (Ghent Eye-tracking Corpus of Machine Translation), expands upon an earlier corpus called GECO (Ghent Eye-tracking Corpus) by Cop et al. (2017). The eye movement data in GECO-MT will be used in future research to investigate the effect of machine translation on the reading process and the effects of various error types on reading. In this article, we describe in detail the materials and data collection procedure of GECO-MT. Extensive information on the language proficiency of our participants is given, as well as a comparison with the participants of the original GECO. We investigate the distribution of a selection of important eye movement variables and explore the possibilities for future analyses of the data. GECO-MT is freely available at https://www.lt3.ugent.be/resources/geco-mt.}},
  author       = {{Colman, Toon and Fonteyne, Margot and Daems, Joke and Dirix, Nicolas and Macken, Lieve}},
  booktitle    = {{LREC 2022 : THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION}},
  editor       = {{Calzolari, Nicoletta and Béchet, Frédéric and Blache, Philippe and Choukri, Khalid and Cieri, Christopher and Declerck, Thierry and Goggi, Sara and Isahara, Hitoshi and Maegaard, Bente and Mariani, Joseph and Mazo, Hélène and Odijk, Jan and Piperidis, Stelios}},
  isbn         = {{9791095546726}},
  issn         = {{2522-2686}},
  keywords     = {{Machine Translation,quality assessment,reading behaviour,eye-tracking,corpus study,LT3}},
  language     = {{eng}},
  location     = {{Marseille, France}},
  pages        = {{29--38}},
  publisher    = {{European Language Resources Association (ELRA)}},
  title        = {{GECO-MT : the Ghent Eye-tracking Corpus of Machine Translation}},
  url          = {{http://www.lrec-conf.org/proceedings/lrec2022/index.html}},
  year         = {{2022}},
}
