PhD Seminar: Mengxue Zhang, AI-Driven Analysis, Scoring, and Generation for Open-Ended Mathematical Reasoning
Abstract
Automated assessment of mathematical reasoning poses unique challenges due to the structured nature of mathematical language, the sensitivity of numerical operations, and the diversity of correct solution paths. Unlike responses in other domains, mathematical answers often intertwine natural language with formal notation, requiring systems to understand not only the final answer but also the underlying reasoning process. Current automated scoring approaches often struggle with generalization across tasks, lack optimization for mathematical syntax and semantics, and focus primarily on final-answer correctness, limiting their diagnostic value. This thesis addresses these limitations by developing a comprehensive framework for the automated assessment of open-ended mathematical responses. The core contributions include (1) enhancing scoring accuracy and generalizability through domain-specific models and in-context learning; (2) analyzing students' multi-step solution processes using math operation embeddings to diagnose errors and deliver targeted feedback; (3) generating interpretable, step-by-step model solutions to support student learning; and (4) establishing robust, interpretable evaluation methodologies to validate model effectiveness. Together, these contributions aim to advance scalable, feedback-driven assessment tools for mathematical learning.
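To make contribution (2) concrete, here is a toy sketch (not the thesis implementation) of the underlying idea: represent each step of a multi-step solution by the math operation it applies, then compare a student's operation sequence against a reference solution to locate the first diverging step for targeted feedback. All function and operation names below are illustrative assumptions.

```python
# Hypothetical sketch: locating the first step where a student's
# operation sequence diverges from a reference solution.

def diagnose(student_ops, reference_ops):
    """Return the index of the first step where the student's operation
    deviates from the reference, or None if the sequences match."""
    for i, (s, r) in enumerate(zip(student_ops, reference_ops)):
        if s != r:
            return i  # first mismatched operation
    if len(student_ops) != len(reference_ops):
        # one solution has extra or missing steps
        return min(len(student_ops), len(reference_ops))
    return None

# Solving 2x + 3 = 11: the reference first subtracts 3, then divides by 2.
reference = ["subtract_constant", "divide_coefficient"]
student = ["divide_coefficient", "subtract_constant"]  # order error

print(diagnose(student, reference))  # → 0
```

In the thesis setting, the discrete labels above would be replaced by learned math operation embeddings, so that "divergence" becomes a distance in embedding space rather than an exact-match test.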
Advisor
Andrew Lan