Rui Jia, Ruiyi Lan, Fengrui Liu, Zhongxiang Dai, Bo Jiang, Jing Shao, Jingyuan Chen, Guandong Xu, Fei Wu, Min Zhang
CASTLE is a benchmark designed to evaluate the personalized safety of large language models in educational settings, revealing significant deficiencies in current models' ability to tailor responses to individual student needs and risks.
Large language models (LLMs) are widely used in education to provide personalized learning experiences. However, these models often give the same response to every student, ignoring individual needs and characteristics. This one-size-fits-all behavior can be unsafe, particularly for vulnerable students. The CASTLE benchmark was created to test how well these models adapt their responses to individual students' needs and safety concerns. The results show that current models struggle with this task, suggesting that more work is needed before they can safely support diverse learners.