
The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary

Author
Gilles-Maurice de Schryver, Sascha Wolfer and Robert Lew
Abstract
In an earlier publication it was claimed that there is no useful relationship between Swahili-English dictionary look-up frequencies and the occurrence frequencies for the same wordforms in Swahili-English corpora, at least not beyond the top few thousand wordforms. This result was challenged using data for German by a different team of researchers using an improved methodology. In the present article the original Swahili-English data is revisited, using ten years’ worth of it rather than just two, and using the improved methodology. We conclude that there is indeed a positive relationship. In addition, we show that online dictionary look-up behaviour is remarkably similar across languages, even when, as in our case, one is dealing with languages from very dissimilar language families. Furthermore, online dictionaries turn out to have minimum look-up success rates, below which they simply cannot go. These minima are language-sensitive and vary depending on the regularity of the searched-for entries, but are otherwise constant no matter the size of randomly sampled dictionaries. Corpus-informed sampling always improves on any random method. Lastly, from the point of view of the graphical user interface, we argue that the average user of an online bilingual dictionary is better served with a single search box, rather than separate search boxes for each dictionary side.
Keywords
lexicography, online dictionaries, log files, corpus frequencies, Swahili, English, language universals

Downloads

  • de Schryver et al. 2019 A decade of user interaction with a Swahili-English dictionary .pdf
    • full text (Published version) | open access | PDF | 1.16 MB
  • Addendum 2.xlsx
    • supplementary material | open access | Excel | 18.91 KB
  • Addendum 3.xlsx
    • supplementary material | open access | Excel | 2.04 MB

Citation

MLA
de Schryver, Gilles-Maurice, et al. “The Relationship between Dictionary Look-up Frequency and Corpus Frequency Revisited : A Log-File Analysis of a Decade of User Interaction with a Swahili-English Dictionary.” GEMA ONLINE JOURNAL OF LANGUAGE STUDIES, vol. 19, no. 4, 2019, pp. 1–27.
APA
de Schryver, G.-M., Wolfer, S., & Lew, R. (2019). The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary. GEMA ONLINE JOURNAL OF LANGUAGE STUDIES, 19(4), 1–27.
Chicago author-date
Schryver, Gilles-Maurice de, Sascha Wolfer, and Robert Lew. 2019. “The Relationship between Dictionary Look-up Frequency and Corpus Frequency Revisited : A Log-File Analysis of a Decade of User Interaction with a Swahili-English Dictionary.” GEMA ONLINE JOURNAL OF LANGUAGE STUDIES 19 (4): 1–27.
Chicago author-date (all authors)
de Schryver, Gilles-Maurice, Sascha Wolfer, and Robert Lew. 2019. “The Relationship between Dictionary Look-up Frequency and Corpus Frequency Revisited : A Log-File Analysis of a Decade of User Interaction with a Swahili-English Dictionary.” GEMA ONLINE JOURNAL OF LANGUAGE STUDIES 19 (4): 1–27.
Vancouver
1. de Schryver G-M, Wolfer S, Lew R. The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary. GEMA ONLINE JOURNAL OF LANGUAGE STUDIES. 2019;19(4):1–27.
IEEE
[1] G.-M. de Schryver, S. Wolfer, and R. Lew, “The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary,” GEMA ONLINE JOURNAL OF LANGUAGE STUDIES, vol. 19, no. 4, pp. 1–27, 2019.
BibTeX
@article{8636647,
  abstract     = {{In an earlier publication it was claimed that there is no useful relationship between Swahili-English dictionary look-up frequencies and the occurrence frequencies for the same wordforms in Swahili-English corpora, at least not beyond the top few thousand wordforms. This result was challenged using data for German by a different team of researchers using an improved methodology. In the present article the original Swahili-English data is revisited, using ten years’ worth of it rather than just two, and using the improved methodology. We conclude that there is indeed a positive relationship. In addition, we show that online dictionary look-up behaviour is remarkably similar across languages, even when, as in our case, one is dealing with languages from very dissimilar language families. Furthermore, online dictionaries turn out to have minimum look-up success rates, below which they simply cannot go. These minima are language-sensitive and vary depending on the regularity of the searched-for entries, but are otherwise constant no matter the size of randomly sampled dictionaries. Corpus-informed sampling always improves on any random method. Lastly, from the point of view of the graphical user interface, we argue that the average user of an online bilingual dictionary is better served with a single search box, rather than separate search boxes for each dictionary side.}},
  author       = {{de Schryver, Gilles-Maurice and Wolfer, Sascha and Lew, Robert}},
  issn         = {{1675-8021}},
  journal      = {{GEMA ONLINE JOURNAL OF LANGUAGE STUDIES}},
  keywords     = {{lexicography,online dictionaries,log files,corpus frequencies,Swahili,English,language universals}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{1--27}},
  title        = {{The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary}},
  url          = {{http://dx.doi.org/10.17576/gema-2019-1904-01}},
  volume       = {{19}},
  year         = {{2019}},
}
