Ghent University Academic Bibliography

From character to word level: enabling the linguistic analyses of Inputlog process data

Mariëlle Leijten, Lieve Macken (UGent), Veronique Hoste (UGent), Eric Van Horenbeeck and Luuk Van Waes (2012). Proceedings of the EACL 2012 workshop on computational linguistics and writing.
abstract
Keystroke-logging tools are widely used in writing process research. These applications are designed to capture each character and mouse movement as isolated events as an indicator of cognitive processes. The current research project explores the possibilities of aggregating the logged process data from the letter level (keystroke) to the word level by merging them with existing lexica and using NLP tools. Linking writing process data to lexica and using NLP tools enables researchers to analyze the data on a higher, more complex level. In this project the output data of Inputlog are segmented on the sentence level and then tokenized. However, by definition writing process data do not always represent clean and grammatical text. Coping with this problem was one of the main challenges in the current project. Therefore, a parser has been developed that extracts three types of data from the S-notation: word-level revisions, deleted fragments, and the final writing product. The within-word typing errors are identified and excluded from further analyses. At this stage the Inputlog process data are enriched with the following linguistic information: part-of-speech tags, lemmas, chunks, syllable boundaries and word frequencies.
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-2128385
type
conference
publication status
published
keyword
linguistic annotation, keystroke logging, linguistic analysis, Inputlog
in
Proceedings of the EACL 2012 workshop on computational linguistics and writing
pages
8 pages
publisher
Association for Computational Linguistics (ACL)
conference name
EACL 2012 workshop on Computational Linguistics and Writing (CL&W 2012)
conference location
Avignon, France
conference start
2012-04-23
conference end
2012-04-23
ISBN
9781937284190
language
English
UGent publication?
yes
classification
C1
copyright statement
I have retained and own the full copyright for this publication
id
2128385
handle
http://hdl.handle.net/1854/LU-2128385
alternative location
http://aclweb.org/anthology/W/W12/W12-0301.pdf
date created
2012-06-01 10:43:05
date last changed
2014-02-05 11:43:47
@inproceedings{2128385,
  abstract     = {Keystroke-logging tools are widely used in writing process research. These applications are designed to capture each character and mouse movement as isolated events as an indicator of cognitive processes. The current research project explores the possibilities of aggregating the logged process data from the letter level (keystroke) to the word level by merging them with existing lexica and using NLP tools. Linking writing process data to lexica and using NLP tools enables researchers to analyze the data on a higher, more complex level.
In this project the output data of Inputlog are segmented on the sentence level and then tokenized. However, by definition writing process data do not always represent clean and grammatical text.
Coping with this problem was one of the main challenges in the current project. Therefore, a parser has been developed that extracts three types of data from the S-notation: word-level revisions, deleted fragments, and the final writing product. The within-word typing errors are identified and excluded from further analyses. At this stage the Inputlog process data are enriched with the following linguistic information: part-of-speech tags, lemmas, chunks, syllable boundaries and word frequencies.},
  author       = {Leijten, Mari{\"e}lle and Macken, Lieve and Hoste, Veronique and Van Horenbeeck, Eric and Van Waes, Luuk},
  booktitle    = {Proceedings of the EACL 2012 workshop on computational linguistics and writing},
  isbn         = {9781937284190},
  keywords     = {linguistic annotation, keystroke logging, linguistic analysis, Inputlog},
  language     = {eng},
  location     = {Avignon, France},
  pages        = {8},
  publisher    = {Association for Computational Linguistics (ACL)},
  title        = {From character to word level: enabling the linguistic analyses of Inputlog process data},
  url          = {http://aclweb.org/anthology/W/W12/W12-0301.pdf},
  year         = {2012},
}

Chicago
Leijten, Mariëlle, Lieve Macken, Veronique Hoste, Eric Van Horenbeeck, and Luuk Van Waes. 2012. “From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data.” In Proceedings of the EACL 2012 Workshop on Computational Linguistics and Writing. Association for Computational Linguistics (ACL).
APA
Leijten, M., Macken, L., Hoste, V., Van Horenbeeck, E., & Van Waes, L. (2012). From character to word level: enabling the linguistic analyses of Inputlog process data. Proceedings of the EACL 2012 workshop on computational linguistics and writing. Presented at the EACL 2012 workshop on Computational Linguistics and Writing (CL&W 2012), Association for Computational Linguistics (ACL).
Vancouver
1. Leijten M, Macken L, Hoste V, Van Horenbeeck E, Van Waes L. From character to word level: enabling the linguistic analyses of Inputlog process data. Proceedings of the EACL 2012 workshop on computational linguistics and writing. Association for Computational Linguistics (ACL); 2012.
MLA
Leijten, Mariëlle, Lieve Macken, Veronique Hoste, et al. “From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data.” Proceedings of the EACL 2012 Workshop on Computational Linguistics and Writing. Association for Computational Linguistics (ACL), 2012. Print.