Guiding labeling efforts in question difficulty estimation using active learning
- Author
- Arthur Thuy (UGent) and Dries Benoit (UGent)
- Abstract
- Estimating the difficulty of exam questions is crucial for effectively evaluating students’ knowledge and facilitating personalized exercise recommendations. Obtaining labels for the training dataset typically involves time-consuming and expensive pretesting and manual calibration. Despite recent advances in fine-tuned Transformer-based models outperforming traditional machine learning approaches, the labeling expenses remain considerable. Our study addresses this labeling challenge by leveraging active learning to drive the annotation process, directing human expert attention toward the most informative data points. Given the lack of uncertainty in standard regression neural networks, we employ Monte Carlo Dropout to capture model uncertainty in predictions on the unlabeled set. Model uncertainty tends to be high on data points in underrepresented areas of the input space, precisely the observations we aim to label. Fine-tuning a DistilBERT model with Monte Carlo Dropout on a dataset comprising science and math multiple-choice questions yields promising results.
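The acquisition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DistilBERT regressor is replaced by a toy linear model with dropout applied to its inputs, and the names `mc_dropout_predict` and `select_most_uncertain`, along with all parameter values, are hypothetical. The sketch shows only the general idea of keeping dropout active at prediction time, repeating the forward pass, and sending the pool items with the largest predictive spread to the annotator.

```python
import random
import statistics


def mc_dropout_predict(features, weights, n_passes=50, drop_prob=0.1, rng=None):
    """Return (mean, std) of a toy regressor's output over stochastic passes.

    Assumption: a linear model with input dropout stands in for a fine-tuned
    Transformer whose dropout layers stay active at inference time.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    keep = 1.0 - drop_prob
    preds = []
    for _ in range(n_passes):
        total = 0.0
        for x, w in zip(features, weights):
            # Sample a fresh dropout mask on every pass and rescale the
            # kept units (inverted dropout).
            if rng.random() < keep:
                total += x * w / keep
        preds.append(total)
    return statistics.fmean(preds), statistics.pstdev(preds)


def select_most_uncertain(pool, weights, k, **kwargs):
    """Return indices of the k unlabeled items with the highest predictive std."""
    scored = [
        (mc_dropout_predict(x, weights, **kwargs)[1], i)
        for i, x in enumerate(pool)
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


# Hypothetical unlabeled pool of three feature vectors.
pool = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]]
weights = [1.0, 1.0, 1.0]
to_label = select_most_uncertain(pool, weights, k=2, drop_prob=0.5)
```

In this toy setting the spread simply grows with input magnitude. In a fine-tuned network it would instead reflect how much the dropout-perturbed sub-models disagree on a given question. In a full active learning loop, the selected items would be labeled, added to the training set, and the model retrained before the next selection round.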
Downloads
- (...).pdf
- full text (Published version) | UGent only | 6.80 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-01J3G6WY4A0S5BS5452R0B3Z1T
- MLA
- Thuy, Arthur, and Dries Benoit. “Guiding Labeling Efforts in Question Difficulty Estimation Using Active Learning.” EURO 2024 Conference Handbook & Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024, 2024, pp. 384–384.
- APA
- Thuy, A., & Benoit, D. (2024). Guiding labeling efforts in question difficulty estimation using active learning. EURO 2024 Conference Handbook & Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024, 384–384.
- Chicago author-date
- Thuy, Arthur, and Dries Benoit. 2024. “Guiding Labeling Efforts in Question Difficulty Estimation Using Active Learning.” In EURO 2024 Conference Handbook & Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024, 384–384.
- Vancouver
- 1. Thuy A, Benoit D. Guiding labeling efforts in question difficulty estimation using active learning. In: EURO 2024 Conference Handbook & Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024. 2024. p. 384–384.
- IEEE
- [1] A. Thuy and D. Benoit, “Guiding labeling efforts in question difficulty estimation using active learning,” in EURO 2024 Conference Handbook & Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024, 2024, pp. 384–384.
@inproceedings{01J3G6WY4A0S5BS5452R0B3Z1T,
abstract = {{Estimating the difficulty of exam questions is crucial for effectively evaluating students’ knowledge and facilitating personalized exercise recommendations. Obtaining labels for the training dataset typically involves time-consuming and expensive pretesting and manual calibration. Despite recent advances in fine-tuned Transformer-based models outperforming traditional machine learning approaches, the labeling expenses remain considerable. Our study addresses this labeling challenge by leveraging active learning to drive the annotation process, directing human expert attention toward the most informative data points. Given the lack of uncertainty in standard regression neural networks, we employ Monte Carlo Dropout to capture model uncertainty in predictions on the unlabeled set. Model uncertainty tends to be high on data points in underrepresented areas of the input space, precisely the observations we aim to label. Fine-tuning a DistilBERT model with Monte Carlo Dropout on a dataset comprising science and math multiple-choice questions yields promising results.}},
author = {{Thuy, Arthur and Benoit, Dries}},
booktitle = {{EURO 2024 Conference Handbook \& Abstracts : 33rd European Conference on Operational Research (EURO XXXIII), Copenhagen, Denmark, June 30 - July 3, 2024}},
isbn = {{9788793458260}},
language = {{eng}},
location = {{Copenhagen, Denmark}},
pages = {{384--384}},
title = {{Guiding labeling efforts in question difficulty estimation using active learning}},
url = {{https://euro2024cph.dk/}},
year = {{2024}},
}