2 files | 5.55 MB

Metrics of syntactic equivalence to assess translation difficulty

Bram Vanroy (UGent), Orphée De Clercq (UGent), Arda Tezcan (UGent), Joke Daems (UGent) and Lieve Macken (UGent)
Abstract
We propose three linguistically motivated metrics to quantify syntactic equivalence between a source sentence and its translation. Syntactically Aware Cross (SACr) measures the degree of word group reordering by creating syntactically motivated groups of words that are aligned. Secondly, an intuitive approach is to compare the linguistic labels of the word-aligned source and target tokens. Finally, on a deeper linguistic level, Aligned Syntactic Tree Edit Distance (ASTrED) compares the dependency structure of both sentences. To be able to compare source and target dependency labels we make use of Universal Dependencies (UD). We provide an analysis of our metrics by comparing them with translation process data in mixed models. Even though our examples and analysis focus on English as the source language and Dutch as the target language, the proposed metrics can be applied to any language for which UD models are attainable. An open-source implementation is made available.
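As a rough illustration of the first two ideas in the abstract, the sketch below counts crossing alignment links and changed dependency labels over word-aligned tokens. It is a simplified, word-level approximation written for this record, not the authors' open-source implementation: SACr as described above operates on syntactically motivated word groups, and ASTrED is not covered here. The function names, the example sentence pair, its alignments, and its hand-assigned UD labels are all assumptions made for illustration.

```python
from itertools import combinations


def count_crosses(alignments):
    """Count pairs of alignment links (src_idx, tgt_idx) that cross.

    Two links cross when their source order and target order disagree.
    This is a word-level simplification; it does not form word groups.
    """
    crosses = 0
    for (s1, t1), (s2, t2) in combinations(alignments, 2):
        if (s1 - s2) * (t1 - t2) < 0:
            crosses += 1
    return crosses


def count_label_changes(src_labels, tgt_labels, alignments):
    """Count aligned token pairs whose dependency labels differ."""
    return sum(1 for s, t in alignments if src_labels[s] != tgt_labels[t])


# Hypothetical English-Dutch example with hand-assigned UD labels.
src_tokens = ["I", "have", "seen", "him"]
tgt_tokens = ["Ik", "heb", "hem", "gezien"]
src_labels = ["nsubj", "aux", "root", "obj"]
tgt_labels = ["nsubj", "aux", "obj", "root"]
alignments = [(0, 0), (1, 1), (2, 3), (3, 2)]  # (source index, target index)

print(count_crosses(alignments))                                # 1
print(count_label_changes(src_labels, tgt_labels, alignments))  # 0
```

In this example the verb and object swap places in Dutch, which yields one crossing link, while every aligned pair keeps its dependency label.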
Keywords
lt3, translation studies, computational linguistics, tree edit distance, syntax

Downloads

  • (...).pdf
    • full text (Accepted manuscript) | UGent only (changes to open access on 2023-02-16) | PDF | 3.58 MB
  • (...).pdf
    • full text (Published version) | UGent only | PDF | 1.97 MB

Citation

Please use this url to cite or link to this publication:

MLA
Vanroy, Bram, et al. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” Explorations in Empirical Translation Process Research, edited by Michael Carl, vol. 3, Springer, 2021, pp. 259–94, doi:10.1007/978-3-030-69777-8_10.
APA
Vanroy, B., De Clercq, O., Tezcan, A., Daems, J., & Macken, L. (2021). Metrics of syntactic equivalence to assess translation difficulty. In M. Carl (Ed.), Explorations in empirical translation process research (Vol. 3, pp. 259–294). https://doi.org/10.1007/978-3-030-69777-8_10
Chicago author-date
Vanroy, Bram, Orphée De Clercq, Arda Tezcan, Joke Daems, and Lieve Macken. 2021. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” In Explorations in Empirical Translation Process Research, edited by Michael Carl, 3:259–94. Springer. https://doi.org/10.1007/978-3-030-69777-8_10.
Chicago author-date (all authors)
Vanroy, Bram, Orphée De Clercq, Arda Tezcan, Joke Daems, and Lieve Macken. 2021. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” In Explorations in Empirical Translation Process Research, ed. by Michael Carl, 3:259–294. Springer. doi:10.1007/978-3-030-69777-8_10.
Vancouver
1. Vanroy B, De Clercq O, Tezcan A, Daems J, Macken L. Metrics of syntactic equivalence to assess translation difficulty. In: Carl M, editor. Explorations in empirical translation process research. Springer; 2021. p. 259–94.
IEEE
[1] B. Vanroy, O. De Clercq, A. Tezcan, J. Daems, and L. Macken, “Metrics of syntactic equivalence to assess translation difficulty,” in Explorations in empirical translation process research, vol. 3, M. Carl, Ed. Springer, 2021, pp. 259–294.
BibTeX
@incollection{8700133,
  abstract     = {{We propose three linguistically motivated metrics to quantify syntactic equivalence between a source sentence and its translation. Syntactically Aware Cross (SACr) measures the degree of word group reordering by creating syntactically motivated groups of words that are aligned. Secondly, an intuitive approach is to compare the linguistic labels of the word-aligned source and target tokens. Finally, on a deeper linguistic level, Aligned Syntactic Tree Edit Distance (ASTrED) compares the dependency structure of both sentences. To be able to compare source and target dependency labels we make use of Universal Dependencies (UD). We provide an analysis of our metrics by comparing them with translation process data in mixed models. Even though our examples and analysis focus on English as the source language and Dutch as the target language, the proposed metrics can be applied to any language for which UD models are attainable. An open-source implementation is made available.}},
  author       = {{Vanroy, Bram and De Clercq, Orphée and Tezcan, Arda and Daems, Joke and Macken, Lieve}},
  booktitle    = {{Explorations in empirical translation process research}},
  editor       = {{Carl, Michael}},
  isbn         = {{9783030697761}},
  issn         = {{2522-8021}},
  keywords     = {{lt3,translation studies,computational linguistics,tree edit distance,syntax}},
  language     = {{eng}},
  pages        = {{259--294}},
  publisher    = {{Springer}},
  series       = {{Machine Translation: Technologies and Applications}},
  title        = {{Metrics of syntactic equivalence to assess translation difficulty}},
  url          = {{http://dx.doi.org/10.1007/978-3-030-69777-8_10}},
  volume       = {{3}},
  year         = {{2021}},
}
