A named entity recognition system for Dutch

Abstract
We describe a Named Entity Recognition system for Dutch that combines gazetteers, hand-crafted rules, and machine learning on the basis of seed material. We used gazetteers and a corpus to construct training material for Ripper, a rule learner. Instead of using Ripper to train a complete system, we used many different runs of Ripper in order to derive rules which we then interpreted and implemented in our own, hand-crafted system. This speeded up the building of a hand-crafted system, and allowed us to use many different rule sets in order to improve performance. We discuss the advantages of using machine learning software as a tool in knowledge acquisition, and evaluate the resulting system for Dutch.

Downloads

  • 10.1.1.12.6825 1 .pdf (full text | open access | PDF | 61.42 KB)

Citation

Chicago
De Meulder, Fien, Walter Daelemans, and Veronique Hoste. 2002. “A Named Entity Recognition System for Dutch.” In Language and Computers : Studies in Practical Linguistics, ed. Mariet Theune, Anton Nijholt, and Hendri Hondrop, 45:77–88. Amsterdam, The Netherlands: Rodopi.
APA
De Meulder, F., Daelemans, W., & Hoste, V. (2002). A named entity recognition system for Dutch. In M. Theune, A. Nijholt, & H. Hondrop (Eds.), Language and Computers : Studies in Practical Linguistics (Vol. 45, pp. 77–88). Presented at the 12th Meeting on Computational Linguistics in the Netherlands (CLIN), Amsterdam, The Netherlands: Rodopi.
Vancouver
1. De Meulder F, Daelemans W, Hoste V. A named entity recognition system for Dutch. In: Theune M, Nijholt A, Hondrop H, editors. Language and Computers : Studies in Practical Linguistics. Amsterdam, The Netherlands: Rodopi; 2002. p. 77–88.
MLA
De Meulder, Fien, Walter Daelemans, and Veronique Hoste. “A Named Entity Recognition System for Dutch.” Language and Computers : Studies in Practical Linguistics. Ed. Mariet Theune, Anton Nijholt, and Hendri Hondrop. Vol. 45. Amsterdam, The Netherlands: Rodopi, 2002. 77–88. Print.
@inproceedings{598016,
  abstract     = {We describe a Named Entity Recognition system for Dutch that combines gazetteers, hand-crafted rules, and machine learning on the basis of seed material. We used gazetteers and a corpus to construct training material for Ripper, a rule learner. Instead of using Ripper to train a complete system, we used many different runs of Ripper in order to derive rules which we then interpreted and implemented in our own, hand-crafted system. This speeded up the building of a hand-crafted system, and allowed us to use many different rule sets in order to improve performance. We discuss the advantages of using machine learning software as a tool in knowledge acquisition, and evaluate the resulting system for Dutch.},
  author       = {De Meulder, Fien and Daelemans, Walter and Hoste, Veronique},
  booktitle    = {Language and Computers : Studies in Practical Linguistics},
  editor       = {Theune, Mariet and Nijholt, Anton and Hondrop, Hendri},
  isbn         = {9789042009431},
  issn         = {0921-5034},
  language     = {eng},
  location     = {Enschede, The Netherlands},
  pages        = {77--88},
  publisher    = {Rodopi},
  title        = {A named entity recognition system for Dutch},
  volume       = {45},
  year         = {2002},
}
