EventDNA : a dataset for Dutch news event extraction as a basis for news diversification
- Author
- Camiel Colruyt, Orphée De Clercq (UGent) , Thierry Desot (UGent) and Veronique Hoste (UGent)
- Abstract
- News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper, we introduce the EventDNA corpus, a dataset of 1773 Dutch-language news articles annotated with information on entities, news events and IPTC Media Topic codes, with the ultimate goal to outline a recommendation algorithm that uses news event diversity rather than previous reading behaviour as a key driver for personalized news recommendation. We describe the EventDNA annotation guidelines, which are inspired by the well-known ERE framework and conclude that it is not practical to apply a fixed event typology such as used in ERE to an unrestricted data context. The corpus and related source code is made available at https://github.com/NewsDNA-LT3/.github.
- Keywords
- News recommendation, Event annotation, Event extraction
Downloads
- EventDNA LRE resubmit authorversion.pdf: full text (Accepted manuscript), open access, 423.20 KB
- (...).pdf: full text (Published version), UGent only, 1.35 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-01GK18R97NGS8S3XM5CYZ3K3TJ
- MLA
- Colruyt, Camiel, et al. “EventDNA : A Dataset for Dutch News Event Extraction as a Basis for News Diversification.” LANGUAGE RESOURCES AND EVALUATION, vol. 57, no. 1, 2023, pp. 189–221, doi:10.1007/s10579-022-09623-2.
- APA
- Colruyt, C., De Clercq, O., Desot, T., & Hoste, V. (2023). EventDNA : a dataset for Dutch news event extraction as a basis for news diversification. LANGUAGE RESOURCES AND EVALUATION, 57(1), 189–221. https://doi.org/10.1007/s10579-022-09623-2
- Chicago author-date
- Colruyt, Camiel, Orphée De Clercq, Thierry Desot, and Veronique Hoste. 2023. “EventDNA : A Dataset for Dutch News Event Extraction as a Basis for News Diversification.” LANGUAGE RESOURCES AND EVALUATION 57 (1): 189–221. https://doi.org/10.1007/s10579-022-09623-2.
- Chicago author-date (all authors)
- Colruyt, Camiel, Orphée De Clercq, Thierry Desot, and Veronique Hoste. 2023. “EventDNA : A Dataset for Dutch News Event Extraction as a Basis for News Diversification.” LANGUAGE RESOURCES AND EVALUATION 57 (1): 189–221. doi:10.1007/s10579-022-09623-2.
- Vancouver
- 1. Colruyt C, De Clercq O, Desot T, Hoste V. EventDNA : a dataset for Dutch news event extraction as a basis for news diversification. LANGUAGE RESOURCES AND EVALUATION. 2023;57(1):189–221.
- IEEE
- [1] C. Colruyt, O. De Clercq, T. Desot, and V. Hoste, “EventDNA : a dataset for Dutch news event extraction as a basis for news diversification,” LANGUAGE RESOURCES AND EVALUATION, vol. 57, no. 1, pp. 189–221, 2023.
@article{01GK18R97NGS8S3XM5CYZ3K3TJ,
  abstract = {{News organizations increasingly tailor their news offering to the reader through personalized recommendation algorithms. However, automated recommendation algorithms reflect a commercial logic based on calculated relevance to the user, rather than aiming at a well-informed citizenry. In this paper, we introduce the EventDNA corpus, a dataset of 1773 Dutch-language news articles annotated with information on entities, news events and IPTC Media Topic codes, with the ultimate goal to outline a recommendation algorithm that uses news event diversity rather than previous reading behaviour as a key driver for personalized news recommendation. We describe the EventDNA annotation guidelines, which are inspired by the well-known ERE framework and conclude that it is not practical to apply a fixed event typology such as used in ERE to an unrestricted data context. The corpus and related source code is made available at https://github.com/NewsDNA-LT3/.github.}},
  author = {{Colruyt, Camiel and De Clercq, Orphée and Desot, Thierry and Hoste, Veronique}},
  issn = {{1574-020X}},
  journal = {{LANGUAGE RESOURCES AND EVALUATION}},
  keywords = {{News recommendation,Event annotation,Event extraction}},
  language = {{eng}},
  number = {{1}},
  pages = {{189--221}},
  title = {{EventDNA : a dataset for Dutch news event extraction as a basis for news diversification}},
  url = {{http://doi.org/10.1007/s10579-022-09623-2}},
  volume = {{57}},
  year = {{2023}},
}