PhD Thesis Defense: Alex Scarlatos, Towards Realistic Simulated Students: Aligning Language Models with Student Knowledge and Behavior
Speaker:
Alex Scarlatos
Abstract:
As large language models (LLMs) are increasingly used in education, there is a growing need to quickly verify and steer LLM-generated content for the benefit of students. In particular, it is important to ensure that educational AI systems improve learning outcomes and engagement while minimizing potential harms. While it is possible to test new systems and content on real students, this process can be slow, costly, and insecure. Instead, simulated students, i.e., LLM-based models that mimic student behavior, can be used to quickly and safely verify AI systems. However, LLMs do not typically behave like real students, often exhibiting unrealistic behavioral patterns and knowledge trends, limiting how useful they can be. In this thesis, I address this problem by introducing multiple approaches for aligning LLMs with student behavior and knowledge. I further demonstrate how simulated students can be leveraged to align educational content with student-centered goals.
First, I detail SMART, a method for fine-tuning LLMs with reinforcement learning (RL) that aligns the models with realistic response patterns based on student ability and question difficulty, deriving a reward function from item response theory. I demonstrate how LLMs trained with SMART can be used to reliably estimate question difficulty. Second, I introduce dialogue knowledge tracing (KT), where tutor-student dialogues are viewed as sequences of formative assessments, allowing the use of KT to model student knowledge in dialogues. I also detail LLMKT, an LLM-based method for estimating student knowledge that improves on prior KT approaches. Third, I detail a method for leveraging student models in LLM-based tutor training, where I use an RL pipeline that rewards tutor responses that result in better student learning outcomes, which are estimated using LLMKT. Fourth, I detail an evaluation framework that measures the realism of student simulations along multiple behavioral, cognitive, and linguistic dimensions, facilitating the development of more aligned simulated student methods in the future. Finally, I conclude by detailing future directions for improving and broadening simulated student methods, and for leveraging simulated students to tackle current challenges in education.
Advisor:
Emery Berger