A coreference corpus and resolution system for Dutch
- Author
- Iris Hendrickx, Gosse Bouma, Frederik Coppens, Walter Daelemans, Veronique Hoste (UGent), Geert Kloosterman, Anne-Marie Mineur, Joeri Van Der Vloet and Jean-Luc Verschelde
- Abstract
- We present the main outcomes of the COREA project: a corpus annotated with coreferential relations and a coreference resolution system for Dutch. In the project we developed annotation guidelines for coreference resolution for Dutch and annotated a corpus of 135K tokens. We discuss these guidelines, the annotation tool, and the inter-annotator agreement. We also show a visualization of the annotated relations. The standard approach to evaluate a coreference resolution system is to compare the predictions of the system to a hand-annotated gold standard test set (cross-validation). A more practically oriented evaluation is to test the usefulness of coreference relation information in an NLP application. We run experiments with an Information Extraction module for the medical domain, and measure the performance of this module with and without the coreference relation information. We present the results of both this application-oriented evaluation of our system and of a standard cross-validation evaluation. In a separate experiment we also evaluate the effect of coreference information produced by a simple rule-based coreference module in a Question Answering application.
Downloads
- (...).pdf | full text | UGent only | 503.34 KB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-598362
- MLA
- Hendrickx, Iris, et al. “A Coreference Corpus and Resolution System for Dutch.” LREC 2008 : Sixth International Conference on Language Resources and Evaluation, European Language Resources Association (ELRA), 2008, pp. 144–49.
- APA
- Hendrickx, I., Bouma, G., Coppens, F., Daelemans, W., Hoste, V., Kloosterman, G., … Verschelde, J.-L. (2008). A coreference corpus and resolution system for Dutch. LREC 2008 : Sixth International Conference on Language Resources and Evaluation, 144–149. Paris, France: European Language Resources Association (ELRA).
- Chicago author-date
- Hendrickx, Iris, Gosse Bouma, Frederik Coppens, Walter Daelemans, Veronique Hoste, Geert Kloosterman, Anne-Marie Mineur, Joeri Van Der Vloet, and Jean-Luc Verschelde. 2008. “A Coreference Corpus and Resolution System for Dutch.” In LREC 2008 : Sixth International Conference on Language Resources and Evaluation, 144–49. Paris, France: European Language Resources Association (ELRA).
- Chicago author-date (all authors)
- Hendrickx, Iris, Gosse Bouma, Frederik Coppens, Walter Daelemans, Veronique Hoste, Geert Kloosterman, Anne-Marie Mineur, Joeri Van Der Vloet, and Jean-Luc Verschelde. 2008. “A Coreference Corpus and Resolution System for Dutch.” In LREC 2008 : Sixth International Conference on Language Resources and Evaluation, 144–149. Paris, France: European Language Resources Association (ELRA).
- Vancouver
- 1. Hendrickx I, Bouma G, Coppens F, Daelemans W, Hoste V, Kloosterman G, et al. A coreference corpus and resolution system for Dutch. In: LREC 2008 : sixth international conference on language resources and evaluation. Paris, France: European Language Resources Association (ELRA); 2008. p. 144–9.
- IEEE
- [1] I. Hendrickx et al., “A coreference corpus and resolution system for Dutch,” in LREC 2008 : sixth international conference on language resources and evaluation, Marrakech, Morocco, 2008, pp. 144–149.
@inproceedings{598362,
  abstract = {{We present the main outcomes of the COREA project: a corpus annotated with coreferential relations and a coreference resolution system for Dutch. In the project we developed annotation guidelines for coreference resolution for Dutch and annotated a corpus of 135K tokens. We discuss these guidelines, the annotation tool, and the inter-annotator agreement. We also show a visualization of the annotated relations. The standard approach to evaluate a coreference resolution system is to compare the predictions of the system to a hand-annotated gold standard test set (cross-validation). A more practically oriented evaluation is to test the usefulness of coreference relation information in an NLP application. We run experiments with an Information Extraction module for the medical domain, and measure the performance of this module with and without the coreference relation information. We present the results of both this application-oriented evaluation of our system and of a standard cross-validation evaluation. In a separate experiment we also evaluate the effect of coreference information produced by a simple rule-based coreference module in a Question Answering application.}},
  author = {{Hendrickx, Iris and Bouma, Gosse and Coppens, Frederik and Daelemans, Walter and Hoste, Veronique and Kloosterman, Geert and Mineur, Anne-Marie and Van Der Vloet, Joeri and Verschelde, Jean-Luc}},
  booktitle = {{LREC 2008 : sixth international conference on language resources and evaluation}},
  isbn = {{9782951740846}},
  language = {{eng}},
  location = {{Marrakech, Morocco}},
  pages = {{144--149}},
  publisher = {{European Language Resources Association (ELRA)}},
  title = {{A coreference corpus and resolution system for Dutch}},
  year = {{2008}},
}