The University of Massachusetts Amherst
Manning College of Information & Computer Sciences

PhD Dissertation Proposal: Alex Scarlatos, Creating Realistic Simulated Students: Fine-Tuning LLMs with Reinforcement Learning for Knowledge and Behavior Alignment


Wednesday, March 4, 2026, 2:30 PM - Wednesday, March 4, 2026, 4:00 PM

Hybrid
PhD Dissertation Proposal Defense
Presentation

Speaker:

Alex Scarlatos

Abstract:

As large language models (LLMs) are increasingly used in education, there is a growing need to quickly verify and steer LLM-generated content. In particular, it is important to ensure that educational AI systems improve learning outcomes for students. While it is possible to test new AI systems on real students, this process can be slow, costly, and potentially unsafe. Instead, simulated students, i.e., LLM-based models that mimic student behavior, can be used to test AI systems quickly and safely. However, LLMs do not typically behave like real students: they often fail to follow realistic behavioral patterns or knowledge trends, which limits their usefulness. This thesis presents multiple approaches for aligning LLMs with realistic student behavior and demonstrates how simulated students can promote better student outcomes in downstream generated content.

First, we study student simulation in open-ended testing settings, including essay writing and coding. We use simulated student responses to questions to obtain reliable estimates of question difficulty, which is necessary for calibrating standardized tests and question recommendation systems. We introduce SMART, a novel method for fine-tuning LLMs with reinforcement learning (RL) that aligns the models with realistic response patterns based on student ability and question difficulty, deriving a reward function from item response theory (IRT). We show that SMART outperforms state-of-the-art methods for difficulty prediction and is far better aligned with student ability than other simulated-student methods.
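The proposal does not spell out SMART's reward function, but one minimal way an IRT-derived reward could look is sketched below: under the two-parameter logistic (2PL) IRT model, a simulated response is rewarded by its log-likelihood given the student's ability and the question's difficulty. The function names, the 2PL choice, and the correct/incorrect reward interface are illustrative assumptions, not the actual SMART formulation.

```python
import math

def irt_p_correct(theta: float, difficulty: float, discrimination: float = 1.0) -> float:
    """2PL IRT model: probability that a student with ability `theta`
    answers an item with the given difficulty correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def irt_alignment_reward(simulated_correct: bool, theta: float, difficulty: float) -> float:
    """Reward a simulated response by its log-likelihood under the 2PL model,
    so the simulator is pushed toward responses consistent with the
    student's ability and the question's difficulty."""
    p = irt_p_correct(theta, difficulty)
    return math.log(p) if simulated_correct else math.log(1.0 - p)
```

Under this sketch, a correct response from a high-ability student on an easy item earns a reward near zero (likely event), while a correct response from a low-ability student on a hard item is heavily penalized (unlikely event), which is the alignment pressure an IRT-based RL reward would exert.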

Second, we study student simulation in math tutoring dialogues, which are increasingly used in online learning platforms to provide real-time feedback and guidance to students. We propose a framework for estimating student knowledge across concepts in dialogue turns by adapting knowledge tracing (KT) to the dialogue setting. We introduce an LLM-based student model for this task, LLMKT, which outperforms classic KT approaches. We then show how to use student models to improve student outcomes with LLM-based tutors. We achieve this by training an LLM tutor with RL, where the reward encourages tutor turns that are expected to increase student knowledge using estimates from LLMKT. We find that this RL-trained tutor outperforms much larger models in terms of pedagogical quality and likelihood of positive learning outcomes.
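The tutor's RL reward described above depends on knowledge estimates from a student model such as LLMKT. As a hedged sketch of the interface (the dict-of-concept-mastery representation and the averaging are assumptions for illustration, not the thesis's actual reward), the reward for a tutor turn could be the estimated change in the student's per-concept knowledge before and after that turn:

```python
def knowledge_gain_reward(knowledge_before: dict[str, float],
                          knowledge_after: dict[str, float]) -> float:
    """Reward a tutor turn by the average estimated change in student
    knowledge across concepts, as scored by a student model (e.g. LLMKT).
    Each dict maps a concept name to an estimated mastery probability in [0, 1]."""
    concepts = knowledge_before.keys() & knowledge_after.keys()
    if not concepts:
        return 0.0
    return sum(knowledge_after[c] - knowledge_before[c] for c in concepts) / len(concepts)
```

A tutor turn that the student model judges to raise mastery earns a positive reward, so RL training favors pedagogically effective turns rather than merely fluent ones.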

Finally, we propose an evaluation framework for verifying the realism of student simulations in dialogues. Specifically, we will develop a set of automated evaluation metrics that compare behavioral, linguistic, and knowledge-based aspects of simulated student turns to real student turns. We will then train LLM-based simulated students with RL, using our automated metrics as reward functions, to optimize simulations toward realism. Overall, this thesis details multiple approaches for student simulation that prioritize alignment with real student behavior and knowledge, filling a gap in the field and enabling the development of educational AI that benefits from having simulated students in the loop.

Advisor:

Andrew Lan

Hybrid event for faculty and current students.


Join via Zoom
