Ghent University Academic Bibliography

Word sense disambiguation in text-to-pictograph translation

Gilles Jacobs, Leen Sevens, Vincent Vandeghinste, Ineke Schuurman and Frank Van Eynde (2015)
abstract
We describe the implementation and evaluation of a word sense disambiguation (WSD) tool in a translation system that converts English text messages into sequences of pictographic images. The Text-to-Picto tool for Dutch, English, and Spanish is used on the online communication platform “WAI-NOT” by people who have trouble reading and writing. The translation system relies on WordNets, in which synsets are populated with pictographs. In the original system, many ambiguous words are translated into an incorrect pictograph because the pictograph is linked to the wrong word sense. The WSD method required for our translation engine must work on general-domain text and use WordNet sense inventories. We opted for the gloss-overlap-based Extended Lesk algorithm as described by Banerjee and Pedersen (2002). During translation, each possible WordNet synset of every content word in the input sentence receives a disambiguation score. This score, alongside other parameters, is used in a path-finding algorithm to determine the optimal pictograph sequence. This implementation approach is easily generalised to other sense labelling algorithms, such as an SVM-based WSD tool for Dutch (Izquierdo 2015). In the evaluation of the translation output, WSD yielded no improvement over the baseline system without WSD. However, we found that WSD works well for ambiguous words for which sufficient pictographs are linked in our lexical-pictorial database.
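The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and not the full Extended Lesk algorithm of Banerjee and Pedersen (2002), which also compares the glosses of related synsets; it is a simplified Lesk variant that only counts words shared between the sentence context and each sense gloss. The sense inventory, the sense identifiers, the stopword list, and the function names are all hypothetical and stand in for a real WordNet.

```python
import re

# Hypothetical toy sense inventory (sense id -> gloss); not real WordNet data.
TOY_SENSES = {
    "bat": {
        "bat.animal": "nocturnal flying mammal that lives in caves",
        "bat.sports": "wooden club used to hit the ball in baseball",
    },
}

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "in", "to", "of", "that", "with", "is", "used"}


def tokens(text):
    """Return the set of lowercase non-stopword tokens in a string."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def score_senses(word, sentence, inventory=TOY_SENSES):
    """Give every candidate sense of `word` a gloss-overlap score."""
    # The target word itself is excluded from its own context.
    context = tokens(sentence) - {word}
    return {
        sense: len(context & tokens(gloss))
        for sense, gloss in inventory.get(word, {}).items()
    }


scores = score_senses("bat", "He hit the ball with a wooden bat")
best = max(scores, key=scores.get)
```

The sketch returns a score for every candidate sense instead of only the best one, because the abstract states that the score is one of several parameters passed on to the path-finding step.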
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-7197471
author
Gilles Jacobs, Leen Sevens, Vincent Vandeghinste, Ineke Schuurman and Frank Van Eynde
year
2015
type
conference
publication status
published
keyword
extended lesk, word sense disambiguation, pictograph translation, wordnet, image to text
conference name
CLIN26
conference location
Amsterdam, Nederland
conference start
2015-12-18
conference end
2015-12-18
project
LT3
language
Dutch
UGent publication?
yes
classification
C3
id
7197471
handle
http://hdl.handle.net/1854/LU-7197471
date created
2016-04-28 12:10:00
date last changed
2016-12-19 15:36:49
@inproceedings{7197471,
  abstract     = {We describe the implementation and evaluation of a word sense disambiguation (WSD) tool in a translation system that converts English text messages into sequences of pictographic images. The Text-to-Picto tool for Dutch, English, and Spanish is used on the online communication platform {\textquotedblleft}WAI-NOT{\textquotedblright} by people who have trouble reading and writing. The translation system relies on WordNets, in which synsets are populated with pictographs. In the original system, many ambiguous words are translated into an incorrect pictograph because the pictograph is linked to the wrong word sense. The WSD method required for our translation engine must work on general-domain text and use WordNet sense inventories. We opted for the gloss-overlap-based Extended Lesk algorithm as described by Banerjee and Pedersen (2002). During translation, each possible WordNet synset of every content word in the input sentence receives a disambiguation score. This score, alongside other parameters, is used in a path-finding algorithm to determine the optimal pictograph sequence. This implementation approach is easily generalised to other sense labelling algorithms, such as an SVM-based WSD tool for Dutch (Izquierdo 2015). In the evaluation of the translation output, WSD yielded no improvement over the baseline system without WSD. However, we found that WSD works well for ambiguous words for which sufficient pictographs are linked in our lexical-pictorial database.},
  author       = {Jacobs, Gilles and Sevens, Leen and Vandeghinste, Vincent and Schuurman, Ineke and Van Eynde, Frank},
  booktitle    = {CLIN26},
  keyword      = {extended lesk,word sense disambiguation,pictograph translation,wordnet,image to text},
  language     = {dut},
  location     = {Amsterdam, Nederland},
  title        = {Word sense disambiguation in text-to-pictograph translation},
  url          = {http://hdl.handle.net/1854/LU-7197471},
  year         = {2015},
}

Chicago
Jacobs, Gilles, Leen Sevens, Vincent Vandeghinste, Ineke Schuurman, and Frank Van Eynde. 2015. “Word Sense Disambiguation in Text-to-pictograph Translation.” Presented at CLIN26, Amsterdam.
APA
Jacobs, G., Sevens, L., Vandeghinste, V., Schuurman, I., & Van Eynde, F. (2015). Word sense disambiguation in text-to-pictograph translation. Presented at CLIN26, Amsterdam.
Vancouver
1. Jacobs G, Sevens L, Vandeghinste V, Schuurman I, Van Eynde F. Word sense disambiguation in text-to-pictograph translation. 2015.
MLA
Jacobs, Gilles, Leen Sevens, Vincent Vandeghinste, et al. “Word Sense Disambiguation in Text-to-pictograph Translation.” 2015. Print.