Challenges and opportunities of automated essay scoring for low-proficient L2 English writers
- Author
- Vanessa De Wilde (UGent) and Orphée De Clercq (UGent)
- Abstract
- Assessing students' writing can be a challenging activity. To make writing assessment more feasible, researchers have investigated the possibilities of automated essay scoring (AES). Most studies investigating AES have focused on L1 writing or intermediate to advanced L2 writing. In this study we explored the possibilities of using AES with low-proficiency L2 English writers. We used a dataset comprising writing samples from 3166 young L2 English learners who were at the very start of L2 English instruction. All tasks received a score assigned by humans. For automated scoring we experimented with two machine learning methods. The first was a feature-based approach, for which the dataset was linguistically preprocessed using natural language processing tools. The second employed deep learning by fine-tuning various large language models. Because we were particularly interested in the influence of spelling errors, we also created a corrected, spell-checked version of our dataset. Models trained on the uncorrected samples yielded the best results. The deep learning approach in particular led to satisfactory performance, with a quadratic weighted kappa above .70. The model fine-tuned on an underlying Dutch large language model was superior, which might be linked to the low L2 English proficiency of the young L1 Dutch writers in our sample.
- Keywords
- Automated essay scoring, L2 Writing, Large Language Models, Adolescent learners, AGREEMENT
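The abstract reports agreement between human and automated scores as a quadratic weighted kappa (QWK). As a minimal sketch of how that statistic is commonly computed, the Python function below implements the standard formula. It is not the authors' implementation; the function name and the integer score range passed as `min_score` and `max_score` are illustrative assumptions, since the record does not state the scoring scale.

```python
from collections import Counter


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Cohen's kappa with quadratic weights for two lists of integer scores.

    Hypothetical helper for illustration; the score range is an assumption.
    """
    if len(human) != len(machine) or not human:
        raise ValueError("need two non-empty score lists of equal length")
    n_categories = max_score - min_score + 1
    if n_categories < 2:
        raise ValueError("need at least two score categories")

    n = len(human)
    observed = Counter(zip(human, machine))  # joint counts of (human, machine)
    human_hist = Counter(human)              # marginal counts per rater
    machine_hist = Counter(machine)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty: grows with the squared score distance.
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Count expected by chance if the two raters were independent.
            weighted_expected += weight * human_hist[i] * machine_hist[j] / n

    if weighted_expected == 0:
        # Both raters used one identical score throughout; treated here as
        # perfect agreement (a convention chosen for this sketch).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Because the weights are quadratic, a machine score two points away from the human score is penalised four times as heavily as a score one point away.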
Downloads
- (...).pdf
- full text (Accepted manuscript)
- UGent only (changes to open access on 2026-09-20)
- 769.04 KB
- (...).pdf
- full text (Published version)
- UGent only
- 1.59 MB
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-01K6AJV5GQ5PCC03CAB80SG6MW
- MLA
- De Wilde, Vanessa, and Orphée De Clercq. “Challenges and Opportunities of Automated Essay Scoring for Low-Proficient L2 English Writers.” ASSESSING WRITING, vol. 66, 2025, doi:10.1016/j.asw.2025.100982.
- APA
- De Wilde, V., & De Clercq, O. (2025). Challenges and opportunities of automated essay scoring for low-proficient L2 English writers. ASSESSING WRITING, 66. https://doi.org/10.1016/j.asw.2025.100982
- Chicago author-date
- De Wilde, Vanessa, and Orphée De Clercq. 2025. “Challenges and Opportunities of Automated Essay Scoring for Low-Proficient L2 English Writers.” ASSESSING WRITING 66. https://doi.org/10.1016/j.asw.2025.100982.
- Chicago author-date (all authors)
- De Wilde, Vanessa, and Orphée De Clercq. 2025. “Challenges and Opportunities of Automated Essay Scoring for Low-Proficient L2 English Writers.” ASSESSING WRITING 66. doi:10.1016/j.asw.2025.100982.
- Vancouver
- 1. De Wilde V, De Clercq O. Challenges and opportunities of automated essay scoring for low-proficient L2 English writers. ASSESSING WRITING. 2025;66.
- IEEE
- [1] V. De Wilde and O. De Clercq, “Challenges and opportunities of automated essay scoring for low-proficient L2 English writers,” ASSESSING WRITING, vol. 66, 2025.
@article{01K6AJV5GQ5PCC03CAB80SG6MW,
abstract = {{Assessing students' writing can be a challenging activity. To make writing assessment more feasible, researchers have investigated the possibilities of automated essay scoring (AES). Most studies investigating AES have focused on L1 writing or intermediate to advanced L2 writing. In this study we explored the possibilities of using AES with low-proficiency L2 English writers. We used a dataset comprising writing samples from 3166 young L2 English learners who were at the very start of L2 English instruction. All tasks received a score assigned by humans. For automated scoring we experimented with two machine learning methods. The first was a feature-based approach, for which the dataset was linguistically preprocessed using natural language processing tools. The second employed deep learning by fine-tuning various large language models. Because we were particularly interested in the influence of spelling errors, we also created a corrected, spell-checked version of our dataset. Models trained on the uncorrected samples yielded the best results. The deep learning approach in particular led to satisfactory performance, with a quadratic weighted kappa above .70. The model fine-tuned on an underlying Dutch large language model was superior, which might be linked to the low L2 English proficiency of the young L1 Dutch writers in our sample.}},
articleno = {{100982}},
author = {{De Wilde, Vanessa and De Clercq, Orphée}},
issn = {{1075-2935}},
journal = {{ASSESSING WRITING}},
keywords = {{Automated essay scoring,L2 Writing,Large Language Models,Adolescent learners,AGREEMENT}},
language = {{eng}},
pages = {{10}},
title = {{Challenges and opportunities of automated essay scoring for low-proficient L2 English writers}},
url = {{http://doi.org/10.1016/j.asw.2025.100982}},
volume = {{66}},
year = {{2025}},
}