Project: A Respeaking and Collaborative Game-Based Approach to Building a Parsed Corpus of European Spanish Dialects
2018-05-01 – 2022-04-30
- Abstract
The study of dialectal microvariation in the Spanish spoken in Spain has until recently focused mainly on lexical and phonetic features. The morphosyntax of these dialects, by contrast, remains largely unexplored, despite the recent surge of interest in dialect grammars. This is due to the lack of large annotated dialectal corpora. The proposed project aims to fill this lacuna by creating the first morphosyntactically annotated and parsed corpus of the European Spanish dialects. The corpus will be designed in a geographically balanced way, and its material will be drawn from the COSER corpus (Corpus Oral y Sonoro del Español Rural, 'Audible Corpus of Spoken Rural Spanish'), which is the largest collection of oral data in the Spanish-speaking world but remains largely untranscribed. As transcribing and annotating are expensive and labor-intensive, this project takes a respeaking and collaborative game-based approach to building the parsed corpus of European Spanish dialects. In other words, we intend to obtain automatic transcriptions using a speech recognizer. These will then be processed with Natural Language Processing tools and used to create a crowdsourced game in which members of the public contribute to the co-creation of the parsed corpus by providing annotations.
- Universal dependencies for spoken Spanish (2024)
- The influence of personality traits and game design elements on player enjoyment : an empirical study on gwaps for linguistics (2024) International Conference on Games and Learning Alliance. In Lecture Notes in Computer Science 14475. p. 204-213
- The influence of personality traits and game design elements on player enjoyment : a demo on GWAPs for part-of-speech tagging
- Games with a purpose for the annotation of the Oral Sound Corpus of rural Spanish
- La construcción del Corpus Oral y Sonoro del Español Rural - Anotado y Parseado (COSER-AP) : avances en el etiquetado de partes del discurso
- Building blocks for creating enjoyable games : a systematic literature review