PhD Dissertation Proposal: Saaduddin Mahmud, Aligning Agentic Systems: Improving Specification and Scalability
Speaker:
Saaduddin Mahmud
Abstract:
Agentic systems consist of goal-driven entities, or agents, that perceive, reason, and act within an environment. With the rise of large language models (LLMs), such agents are now widely deployed in real-world applications, including automated research, conversational assistance, and automated programming. In addition, at the frontier, we are now seeing the emergence of multi-agent systems, in which collections of interacting agents jointly shape high-stakes decisions such as recruiting, insurance claim processing, and online shopping.
As agents are deployed in increasingly critical roles, ensuring that their behavior is consistent with the tasks they are designed to solve becomes both an operational necessity and a safety requirement. This challenge is commonly framed as the alignment problem. This thesis views alignment as comprising two closely related processes: inferring the task objective from direct or indirect signals, and optimizing agent behavior so that its actions are consistent with the inferred objective. When failures arise in either objective inference or behavior optimization, agents may behave ineffectively or unpredictably, leading to both efficacy losses and serious safety risks.
Over the past few decades, the communication of task objectives through signals such as demonstrations, interventions, and preferences has been studied extensively. However, these signals often underspecify the intended task, leading to ambiguity in objective inference in novel scenarios. Even when the objective is correctly inferred, scaling behavior optimization techniques for LLM-based agents remains challenging due to various computational constraints.
This thesis addresses both specification and scalability challenges in agentic system alignment. First, methods are introduced to improve the specification of commonly used signals, such as interventions and pairwise preferences, for objective inference. This is achieved by augmenting these signals with the reasoning processes that generate them, and by designing novel objective inference methods that leverage the augmented signals. Second, scalable approaches are developed for optimizing agent behavior under computational constraints, including efficient prompt optimization and reinforcement learning techniques for both single-agent and multi-agent settings.
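To make the notion of objective inference from pairwise preferences concrete, the sketch below fits a Bradley-Terry model, a standard formulation in preference-based reward learning (the abstract does not specify the thesis's actual method; all function names and hyperparameters here are illustrative). Each candidate outcome is assigned an unknown scalar reward, and a preference "a over b" is assumed to be observed with probability sigmoid(r[a] - r[b]); the rewards are fit by gradient ascent on the preference log-likelihood.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def infer_rewards(n_items, preferences, lr=0.5, steps=2000):
    """Fit per-item rewards by maximizing the Bradley-Terry
    log-likelihood of the observed pairwise preferences.

    preferences: list of (a, b) pairs meaning "a was preferred over b".
    (Illustrative sketch, not the method proposed in the thesis.)
    """
    r = [0.0] * n_items
    for _ in range(steps):
        grad = [0.0] * n_items
        for a, b in preferences:
            p = sigmoid(r[a] - r[b])  # model's probability of "a over b"
            grad[a] += 1.0 - p        # push preferred item's reward up
            grad[b] -= 1.0 - p        # push dispreferred item's reward down
        for i in range(n_items):
            r[i] += lr * grad[i]
    return r

# Preferences consistently rank item 2 above item 1 above item 0.
prefs = [(2, 1), (1, 0), (2, 0)]
rewards = infer_rewards(3, prefs)
assert rewards[2] > rewards[1] > rewards[0]
```

The underspecification problem the abstract raises is visible even here: many reward functions are consistent with a finite set of preferences (any order-preserving transformation fits equally well), which is one motivation for augmenting such signals with the reasoning processes that generate them.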
Advisor:
Shlomo Zilberstein