
Automated Scoring of Constructed Response Items in Math Assessment Using Large Language Models
This study details a winning approach for automatically scoring constructed responses on math tests using Large Language Models (LLMs). By balancing the training data across score labels and customizing the model input for each question, the method achieved near-human agreement with human scorers on nine of ten test items.
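The two ingredients named above, label balancing and per-item input customization, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' actual pipeline: the oversampling strategy, the prompt template, and all function names and data are hypothetical.

```python
from collections import Counter
import random

def balance_by_score(examples, seed=0):
    """Oversample so every score label is equally represented.
    `examples` is a list of (response_text, score) pairs.
    (Hypothetical balancing strategy, for illustration only.)"""
    rng = random.Random(seed)
    by_score = {}
    for text, score in examples:
        by_score.setdefault(score, []).append((text, score))
    target = max(len(group) for group in by_score.values())
    balanced = []
    for group in by_score.values():
        balanced.extend(group)
        # Pad minority labels by sampling with replacement.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced

def build_prompt(item, rubric, response):
    """Compose a per-item input: question stem + rubric + student response.
    (Hypothetical template; the real prompts would be item-specific.)"""
    return (
        f"Question: {item}\n"
        f"Scoring rubric: {rubric}\n"
        f"Student response: {response}\n"
        "Assign a score:"
    )

# Tiny demo with made-up data: score 0 is underrepresented before balancing.
data = [("x = 4", 2), ("four", 2), ("x = 5", 0)]
balanced = balance_by_score(data)
print(Counter(score for _, score in balanced))

print(build_prompt("Solve 2x = 8.", "2 pts: correct value of x.", "x = 4"))
```

Balancing matters because constructed-response data is typically skewed toward a few common scores, which biases any model trained on it toward those labels.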



