A database of orthography-semantics consistency (OSC) estimates for 15,017 English words

(2018) Behavior Research Methods. 50(4). p. 1482-1495
Abstract
Orthography–semantics consistency (OSC) is a measure that quantifies the degree of semantic relatedness between a word and its orthographic relatives. OSC is computed as the frequency-weighted average semantic similarity between the meaning of a given word and the meanings of all the words containing that same orthographic string, as captured by distributional semantic models. We present a resource including optimized estimates of OSC for 15,017 English words. In a series of analyses, we provide a progressive optimization of the OSC variable. We show that computing OSC from word-embedding models (in place of traditional count models), limiting preprocessing of the corpus used for inducing semantic vectors (in particular, avoiding part-of-speech tagging and lemmatization), and relying on a wider pool of orthographic relatives provide better performance for the measure in a lexical-processing task. We further show that OSC is an important and significant predictor of reaction times in visual word recognition and word naming, one that correlates only weakly with other psycholinguistic variables (e.g., family size, word frequency), indicating that it captures a novel source of variance in lexical access. Finally, we discuss some theoretical and methodological implications of adopting OSC as one of the predictors of reaction times in studies of visual word recognition.
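
The abstract's definition of OSC as a frequency-weighted average of semantic similarities can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper or its database: the toy two-dimensional vectors, the toy frequencies, the use of cosine as the similarity measure, the helper names `cosine` and `osc`, and the choice to exclude the target word itself from its pool of relatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def osc(target, vectors, freqs):
    """Frequency-weighted mean similarity between `target` and its relatives.

    Relatives are taken to be every other word in `vectors` whose spelling
    contains `target` as a substring (an assumption of this sketch).
    Returns None when the target has no relatives.
    """
    relatives = [w for w in vectors if target in w and w != target]
    total_freq = sum(freqs[w] for w in relatives)
    if total_freq == 0:
        return None
    weighted = sum(
        freqs[w] * cosine(vectors[target], vectors[w]) for w in relatives
    )
    return weighted / total_freq


# Hypothetical toy data: "cornfield" is semantically close to "corn",
# "corner" is unrelated but more frequent.
vectors = {
    "corn": [1.0, 0.0],
    "cornfield": [1.0, 0.0],
    "corner": [0.0, 1.0],
}
freqs = {"corn": 50, "cornfield": 10, "corner": 30}

corn_osc = osc("corn", vectors, freqs)
print(corn_osc)
```

In this toy case the frequent but unrelated relative "corner" pulls the estimate for "corn" down, which is the intuition behind the measure: a word whose spelling recurs mostly in semantically unrelated words receives a low OSC.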

Citation

Chicago
Marelli, Marco, and Simona Amenta. 2018. “A Database of Orthography-semantics Consistency (OSC) Estimates for 15,017 English Words.” Behavior Research Methods 50 (4): 1482–1495.
APA
Marelli, M., & Amenta, S. (2018). A database of orthography-semantics consistency (OSC) estimates for 15,017 English words. Behavior Research Methods, 50(4), 1482–1495.
Vancouver
1. Marelli M, Amenta S. A database of orthography-semantics consistency (OSC) estimates for 15,017 English words. Behavior Research Methods. Springer; 2018;50(4):1482–95.
MLA
Marelli, Marco, and Simona Amenta. “A Database of Orthography-semantics Consistency (OSC) Estimates for 15,017 English Words.” Behavior Research Methods 50.4 (2018): 1482–1495. Print.
@article{8552515,
  author       = {Marelli, Marco and Amenta, Simona},
  issn         = {1554-3528},
  journal      = {Behavior Research Methods},
  language     = {eng},
  number       = {4},
  pages        = {1482--1495},
  publisher    = {Springer},
  title        = {A database of orthography-semantics consistency (OSC) estimates for 15,017 English words},
  url          = {http://dx.doi.org/10.3758/s13428-018-1017-8},
  volume       = {50},
  year         = {2018},
}
