
Metrics of syntactic equivalence to assess translation difficulty
- Author
- Bram Vanroy (UGent), Orphée De Clercq (UGent), Arda Tezcan (UGent), Joke Daems (UGent) and Lieve Macken (UGent)
- Abstract
- We propose three linguistically motivated metrics to quantify syntactic equivalence between a source sentence and its translation. Syntactically Aware Cross (SACr) measures the degree of word group reordering by creating syntactically motivated groups of words that are aligned. Secondly, an intuitive approach is to compare the linguistic labels of the word-aligned source and target tokens. Finally, on a deeper linguistic level, Aligned Syntactic Tree Edit Distance (ASTrED) compares the dependency structure of both sentences. To be able to compare source and target dependency labels we make use of Universal Dependencies (UD). We provide an analysis of our metrics by comparing them with translation process data in mixed models. Even though our examples and analysis focus on English as the source language and Dutch as the target language, the proposed metrics can be applied to any language for which UD models are attainable. An open-source implementation is made available.
- Keywords
- lt3, translation studies, computational linguistics, tree edit distance, syntax
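The reordering idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' SACr implementation: SACr operates on syntactically motivated word groups, whereas this simplified version counts crossing links between individual word alignments. The function name `count_crossing_links` and the toy alignments are hypothetical and used for illustration only.

```python
from itertools import combinations


def count_crossing_links(alignments):
    """Count pairs of alignment links that cross each other.

    `alignments` is a list of (source_index, target_index) pairs.
    This is a word-level simplification, not the SACr metric itself.
    """
    crosses = 0
    for (s1, t1), (s2, t2) in combinations(alignments, 2):
        # Two links cross when source order and target order disagree.
        if (s1 - s2) * (t1 - t2) < 0:
            crosses += 1
    return crosses


# Hypothetical toy alignments (assumed data, not taken from the chapter).
monotone = [(0, 0), (1, 1), (2, 2)]   # same word order on both sides
reordered = [(0, 0), (1, 2), (2, 1)]  # last two words swapped in the target

print(count_crossing_links(monotone))
print(count_crossing_links(reordered))
```

A translation that keeps the source word order yields no crossing links, while each pair of swapped words adds one, which gives a rough sense of how reordering could be quantified before grouping words syntactically.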
Downloads
- 8-2020 Metrics of syntactic equivalence to assess translation difficulty.pdf
- full text (Accepted manuscript) | open access | 3.58 MB
- (...).pdf
- full text (Published version) | UGent only | 1.97 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-8700133
- MLA
- Vanroy, Bram, et al. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” Explorations in Empirical Translation Process Research, edited by Michael Carl, vol. 3, Springer, 2021, pp. 259–94, doi:10.1007/978-3-030-69777-8_10.
- APA
- Vanroy, B., De Clercq, O., Tezcan, A., Daems, J., & Macken, L. (2021). Metrics of syntactic equivalence to assess translation difficulty. In M. Carl (Ed.), Explorations in empirical translation process research (Vol. 3, pp. 259–294). https://doi.org/10.1007/978-3-030-69777-8_10
- Chicago author-date
- Vanroy, Bram, Orphée De Clercq, Arda Tezcan, Joke Daems, and Lieve Macken. 2021. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” In Explorations in Empirical Translation Process Research, edited by Michael Carl, 3:259–94. Springer. https://doi.org/10.1007/978-3-030-69777-8_10.
- Chicago author-date (all authors)
- Vanroy, Bram, Orphée De Clercq, Arda Tezcan, Joke Daems, and Lieve Macken. 2021. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” In Explorations in Empirical Translation Process Research, edited by Michael Carl, 3:259–294. Springer. doi:10.1007/978-3-030-69777-8_10.
- Vancouver
- 1. Vanroy B, De Clercq O, Tezcan A, Daems J, Macken L. Metrics of syntactic equivalence to assess translation difficulty. In: Carl M, editor. Explorations in empirical translation process research. Springer; 2021. p. 259–94.
- IEEE
- [1] B. Vanroy, O. De Clercq, A. Tezcan, J. Daems, and L. Macken, “Metrics of syntactic equivalence to assess translation difficulty,” in Explorations in empirical translation process research, vol. 3, M. Carl, Ed. Springer, 2021, pp. 259–294.
@incollection{8700133,
  abstract     = {{We propose three linguistically motivated metrics to quantify syntactic equivalence between a source sentence and its translation. Syntactically Aware Cross (SACr) measures the degree of word group reordering by creating syntactically motivated groups of words that are aligned. Secondly, an intuitive approach is to compare the linguistic labels of the word-aligned source and target tokens. Finally, on a deeper linguistic level, Aligned Syntactic Tree Edit Distance (ASTrED) compares the dependency structure of both sentences. To be able to compare source and target dependency labels we make use of Universal Dependencies (UD). We provide an analysis of our metrics by comparing them with translation process data in mixed models. Even though our examples and analysis focus on English as the source language and Dutch as the target language, the proposed metrics can be applied to any language for which UD models are attainable. An open-source implementation is made available.}},
  author       = {{Vanroy, Bram and De Clercq, Orphée and Tezcan, Arda and Daems, Joke and Macken, Lieve}},
  booktitle    = {{Explorations in empirical translation process research}},
  editor       = {{Carl, Michael}},
  isbn         = {{9783030697761}},
  issn         = {{2522-8021}},
  keywords     = {{lt3,translation studies,computational linguistics,tree edit distance,syntax}},
  language     = {{eng}},
  pages        = {{259--294}},
  publisher    = {{Springer}},
  series       = {{Machine Translation: Technologies and Applications}},
  title        = {{Metrics of syntactic equivalence to assess translation difficulty}},
  url          = {{http://doi.org/10.1007/978-3-030-69777-8_10}},
  volume       = {{3}},
  year         = {{2021}},
}