Evaluating the impact of integrating similar translations into neural machine translation

Arda Tezcan (UGent) and Bram Bulté
(2022) INFORMATION. 13(1).
Abstract
Previous research has shown that simple methods of augmenting machine translation training data and input sentences with translations of similar sentences (or fuzzy matches), retrieved from a translation memory or bilingual corpus, lead to considerable improvements in translation quality, as assessed by a limited set of automatic evaluation metrics. In this study, we extend this evaluation by calculating a wider range of automated quality metrics that tap into different aspects of translation quality and by performing manual MT error analysis. Moreover, we investigate in more detail how fuzzy matches influence translations and where potential quality improvements could still be made by carrying out a series of quantitative analyses that focus on different characteristics of the retrieved fuzzy matches. The automated evaluation shows that the quality of NFR translations is higher than the NMT baseline in terms of all metrics. However, the manual error analysis did not reveal a difference between the two systems in terms of total number of translation errors; yet, different profiles emerged when considering the types of errors made. Finally, in our analysis of how fuzzy matches influence NFR translations, we identified a number of features that could be used to improve the selection of fuzzy matches for NFR data augmentation.
Keywords
neural machine translation, translation memory, evaluation, data augmentation, lt3

Downloads

  • information-13-00019.pdf
    • full text (Published version) | open access | PDF | 1.41 MB

Citation

Please use this URL to cite or link to this publication:

MLA
Tezcan, Arda, and Bram Bulté. “Evaluating the Impact of Integrating Similar Translations into Neural Machine Translation.” INFORMATION, vol. 13, no. 1, 2022, doi:10.3390/info13010019.
APA
Tezcan, A., & Bulté, B. (2022). Evaluating the impact of integrating similar translations into neural machine translation. INFORMATION, 13(1). https://doi.org/10.3390/info13010019
Chicago author-date
Tezcan, Arda, and Bram Bulté. 2022. “Evaluating the Impact of Integrating Similar Translations into Neural Machine Translation.” INFORMATION 13 (1). https://doi.org/10.3390/info13010019.
Chicago author-date (all authors)
Tezcan, Arda, and Bram Bulté. 2022. “Evaluating the Impact of Integrating Similar Translations into Neural Machine Translation.” INFORMATION 13 (1). doi:10.3390/info13010019.
Vancouver
1. Tezcan A, Bulté B. Evaluating the impact of integrating similar translations into neural machine translation. INFORMATION. 2022;13(1).
IEEE
[1] A. Tezcan and B. Bulté, “Evaluating the impact of integrating similar translations into neural machine translation,” INFORMATION, vol. 13, no. 1, 2022.
BibTeX
@article{8732790,
  abstract     = {{Previous research has shown that simple methods of augmenting machine translation training data and input sentences with translations of similar sentences (or fuzzy matches), retrieved from a translation memory or bilingual corpus, lead to considerable improvements in translation quality, as assessed by a limited set of automatic evaluation metrics. In this study, we extend this evaluation by calculating a wider range of automated quality metrics that tap into different aspects of translation quality and by performing manual MT error analysis. Moreover, we investigate in more detail how fuzzy matches influence translations and where potential quality improvements could still be made by carrying out a series of quantitative analyses that focus on different characteristics of the retrieved fuzzy matches. The automated evaluation shows that the quality of NFR translations is higher than the NMT baseline in terms of all metrics. However, the manual error analysis did not reveal a difference between the two systems in terms of total number of translation errors; yet, different profiles emerged when considering the types of errors made. Finally, in our analysis of how fuzzy matches influence NFR translations, we identified a number of features that could be used to improve the selection of fuzzy matches for NFR data augmentation.}},
  articleno    = {{19}},
  author       = {{Tezcan, Arda and Bulté, Bram}},
  issn         = {{2078-2489}},
  journal      = {{INFORMATION}},
  keywords     = {{neural machine translation,translation memory,evaluation,data augmentation,lt3}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{33}},
  title        = {{Evaluating the impact of integrating similar translations into neural machine translation}},
  url          = {{http://dx.doi.org/10.3390/info13010019}},
  volume       = {{13}},
  year         = {{2022}},
}
