- Author
- Hans Paulussen and Lieve Macken (UGent)
- Abstract
- The Dutch Parallel Corpus (DPC) is a translation corpus containing Dutch, English and French text samples aligned at sentence level. Next to sentence alignment, the corpus has also been grammatically annotated, thus improving exploitation for different domains, including natural language processing, translation research or CALL (computer-assisted language learning). In this paper, we describe the compilation of DPC and the alignment procedures used. This is followed by a description of the annotation task for the three languages, which required different tools and different tag sets. Finally, the impact of different grammatical annotations on multilingual corpus exploitation is discussed.
- Keywords
- linguistic annotation, parallel corpus
Downloads
- (...).pdf | full text | UGent only | 75.83 KB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-1085253
- MLA
- Paulussen, Hans, and Lieve Macken. “Annotating the Dutch Parallel Corpus.” NEALT Proceedings Series, edited by Lars Ahrenberg et al., vol. 10, Northern European Association for Language Technology (NEALT), 2010, pp. 63–72.
- APA
- Paulussen, H., & Macken, L. (2010). Annotating the Dutch parallel corpus. In L. Ahrenberg, J. Tiedemann, & M. Volk (Eds.), NEALT Proceedings Series (Vol. 10, pp. 63–72). Tartu, Estonia: Northern European Association for Language Technology (NEALT).
- Chicago author-date
- Paulussen, Hans, and Lieve Macken. 2010. “Annotating the Dutch Parallel Corpus.” In NEALT Proceedings Series, edited by Lars Ahrenberg, Jörg Tiedemann, and Martin Volk, 10:63–72. Tartu, Estonia: Northern European Association for Language Technology (NEALT).
- Chicago author-date (all authors)
- Paulussen, Hans, and Lieve Macken. 2010. “Annotating the Dutch Parallel Corpus.” In NEALT Proceedings Series, edited by Lars Ahrenberg, Jörg Tiedemann, and Martin Volk, 10:63–72. Tartu, Estonia: Northern European Association for Language Technology (NEALT).
- Vancouver
- 1. Paulussen H, Macken L. Annotating the Dutch parallel corpus. In: Ahrenberg L, Tiedemann J, Volk M, editors. NEALT Proceedings Series. Tartu, Estonia: Northern European Association for Language Technology (NEALT); 2010. p. 63–72.
- IEEE
- [1] H. Paulussen and L. Macken, “Annotating the Dutch parallel corpus,” in NEALT Proceedings Series, Tartu, Estonia, 2010, vol. 10, pp. 63–72.
@inproceedings{1085253,
  abstract     = {{The Dutch Parallel Corpus (DPC) is a translation corpus containing Dutch, English and French text samples aligned at sentence level. Next to sentence alignment, the corpus has also been grammatically annotated, thus improving exploitation for different domains, including natural language processing, translation research or CALL (computer-assisted language learning). In this paper, we describe the compilation of DPC and the alignment procedures used. This is followed by a description of the annotation task for the three languages, which required different tools and different tag sets. Finally the impact of different grammatical annotations on multilingual corpus exploitation is discussed.}},
  author       = {{Paulussen, Hans and Macken, Lieve}},
  booktitle    = {{NEALT Proceedings Series}},
  editor       = {{Ahrenberg, Lars and Tiedemann, Jörg and Volk, Martin}},
  issn         = {{1736-6305}},
  keywords     = {{linguistic annotation,parallel corpus}},
  language     = {{eng}},
  location     = {{Tartu, Estonia}},
  pages        = {{63--72}},
  publisher    = {{Northern European Association for Language Technology (NEALT)}},
  title        = {{Annotating the Dutch parallel corpus}},
  volume       = {{10}},
  year         = {{2010}},
}