Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products
- Author
- Joni Kruijsbergen (UGent) , Serafina Van Geertruyen (UGent) , Veronique Hoste (UGent) and Orphée De Clercq (UGent)
- Abstract
- This research examines the capabilities of Large Language Models for writing error detection, which can be seen as a first step towards automated writing support. Our work focuses on Dutch writing error detection, targeting two envisaged end-users: L1 and L2 adult speakers of Dutch. We relied on proprietary L1 and L2 datasets comprising writing products annotated with a variety of writing errors. Following the recent paradigms in NLP research, we experimented with both a fine-tuning approach combining different mono- (BERTje, RobBERT) and multilingual (mBERT, XLM-RoBERTa) models, as well as a zero-shot approach through prompting a generative autoregressive language model (GPT-3.5). The results reveal that the fine-tuning approach outperforms zero-shotting to a large extent, both for L1 and L2, even though there is much room left for improvement.
Downloads
- CLIN13 Exploring LLMs for GED Kruijsbergen etal.pdf
- full text (Published version), open access, 555.95 KB
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-01HTFDVQ3Y3J7F79KGHEXQ3RZ1
- MLA
- Kruijsbergen, Joni, et al. “Exploring LLMs’ Capabilities for Error Detection in Dutch L1 and L2 Writing Products.” COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL, vol. 13, 2024, pp. 173–91.
- APA
- Kruijsbergen, J., Van Geertruyen, S., Hoste, V., & De Clercq, O. (2024). Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products. COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL, 13, 173–191.
- Chicago author-date
- Kruijsbergen, Joni, Serafina Van Geertruyen, Veronique Hoste, and Orphée De Clercq. 2024. “Exploring LLMs’ Capabilities for Error Detection in Dutch L1 and L2 Writing Products.” COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL 13: 173–91.
- Chicago author-date (all authors)
- Kruijsbergen, Joni, Serafina Van Geertruyen, Veronique Hoste, and Orphée De Clercq. 2024. “Exploring LLMs’ Capabilities for Error Detection in Dutch L1 and L2 Writing Products.” COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL 13: 173–191.
- Vancouver
- 1. Kruijsbergen J, Van Geertruyen S, Hoste V, De Clercq O. Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products. COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL. 2024;13:173–91.
- IEEE
- [1] J. Kruijsbergen, S. Van Geertruyen, V. Hoste, and O. De Clercq, “Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products,” COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL, vol. 13, pp. 173–191, 2024.
@article{01HTFDVQ3Y3J7F79KGHEXQ3RZ1,
  abstract = {{This research examines the capabilities of Large Language Models for writing error detection, which can be seen as a first step towards automated writing support. Our work focuses on Dutch writing error detection, targeting two envisaged end-users: L1 and L2 adult speakers of Dutch. We relied on proprietary L1 and L2 datasets comprising writing products annotated with a variety of writing errors. Following the recent paradigms in NLP research, we experimented with both a fine-tuning approach combining different mono- (BERTje, RobBERT) and multilingual (mBERT, XLM-RoBERTa) models, as well as a zero-shot approach through prompting a generative autoregressive language model (GPT-3.5). The results reveal that the fine-tuning approach outperforms zero-shotting to a large extent, both for L1 and L2, even though there is much room left for improvement.}},
  author = {{Kruijsbergen, Joni and Van Geertruyen, Serafina and Hoste, Veronique and De Clercq, Orphée}},
  issn = {{2211-4009}},
  journal = {{COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL}},
  language = {{eng}},
  pages = {{173--191}},
  title = {{Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products}},
  volume = {{13}},
  year = {{2024}},
}