Towards a better integration of fuzzy matches in neural machine translation through data augmentation

Arda Tezcan (UGent) , Bram Bulté and Bram Vanroy (UGent)
Abstract
We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.
Keywords
translation memories, data augmentation, fuzzy matching, NMT, sub-word units, lt3

Downloads

  • Tezcan et al - 2021 - Towards a better integration of fuzzy matches in neural machine translation through data augmentation.pdf
    • full text (Published version) | open access | PDF | 760.51 KB

Citation

MLA
Tezcan, Arda, et al. “Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation.” INFORMATICS-BASEL, vol. 8, no. 1, 2021, doi:10.3390/informatics8010007.
APA
Tezcan, A., Bulté, B., & Vanroy, B. (2021). Towards a better integration of fuzzy matches in neural machine translation through data augmentation. INFORMATICS-BASEL, 8(1). https://doi.org/10.3390/informatics8010007
Chicago author-date
Tezcan, Arda, Bram Bulté, and Bram Vanroy. 2021. “Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation.” INFORMATICS-BASEL 8 (1). https://doi.org/10.3390/informatics8010007.
Chicago author-date (all authors)
Tezcan, Arda, Bram Bulté, and Bram Vanroy. 2021. “Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation.” INFORMATICS-BASEL 8 (1). doi:10.3390/informatics8010007.
Vancouver
1. Tezcan A, Bulté B, Vanroy B. Towards a better integration of fuzzy matches in neural machine translation through data augmentation. INFORMATICS-BASEL. 2021;8(1).
IEEE
[1] A. Tezcan, B. Bulté, and B. Vanroy, “Towards a better integration of fuzzy matches in neural machine translation through data augmentation,” INFORMATICS-BASEL, vol. 8, no. 1, 2021.
BibTeX
@article{8690842,
  abstract     = {{We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.}},
  articleno    = {{7}},
  author       = {{Tezcan, Arda and Bulté, Bram and Vanroy, Bram}},
  issn         = {{2227-9709}},
  journal      = {{INFORMATICS-BASEL}},
  keywords     = {{translation memories,data augmentation,fuzzy matching,NMT,sub-word units,lt3}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{27}},
  title        = {{Towards a better integration of fuzzy matches in neural machine translation through data augmentation}},
  url          = {{http://doi.org/10.3390/informatics8010007}},
  volume       = {{8}},
  year         = {{2021}},
}
