Speaker

Saeed Mahloujifar

Abstract

What does it really mean when a language model “memorizes” its training data? Memorization remains one of the most puzzling behaviors of foundation models. In this talk, I’ll unpack the conceptual and practical challenges in defining memorization, propose a new definition based on Kolmogorov complexity, and show how this perspective lets us measure memorization efficiently and meaningfully. I’ll also share experiments that reveal surprising patterns in how large models internalize their data, with implications for privacy, generalization, and interpretability.
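To make the Kolmogorov-complexity framing concrete: since K(x) is uncomputable, practical definitions approximate description lengths with compressed code lengths. The minimal sketch below is not the speaker’s method; it is an illustrative proxy that treats a sample as memorized to the extent that a model compresses it far better than a model-free reference compressor. The use of zlib as the reference and the placeholder per-token probabilities are both assumptions for illustration.

```python
# Hypothetical sketch of a compression-based memorization proxy.
# Assumption: we approximate K(x) with a generic compressor and
# K(x | model) with the model's arithmetic-coding length; neither
# is the talk's actual definition.
import math
import zlib


def reference_bits(text: str) -> float:
    """Approximate K(x): bits used by a generic compressor (zlib)."""
    return 8 * len(zlib.compress(text.encode("utf-8")))


def model_bits(token_probs: list[float]) -> float:
    """Approximate K(x | model): the arithmetic-coding length of the
    sample under the model, i.e. the sum of -log2 p(token) over its
    tokens. `token_probs` stands in for probabilities a real LM
    would assign to each token of the sample."""
    return sum(-math.log2(p) for p in token_probs)


def memorization_bits(text: str, token_probs: list[float]) -> float:
    """Bits of description length the model saves relative to the
    reference compressor; larger values suggest more memorization."""
    return reference_bits(text) - model_bits(token_probs)


# Toy usage: a sample the model predicts near-perfectly (p ≈ 0.99 per
# token) costs far fewer bits under the model than under zlib, which
# this proxy reads as a memorization signal.
sample = "The quick brown fox jumps over the lazy dog."
probs = [0.99] * 12  # placeholder per-token probabilities
print(f"{memorization_bits(sample, probs):.1f} bits saved by the model")
```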

Bio

Saeed Mahloujifar is a Research Scientist on the Fundamental AI Research (FAIR) team at Meta. His research focuses on the theoretical foundations of privacy and security for AI systems and their interplay with cryptography. Previously, he was a postdoctoral researcher at Princeton University, working with Prateek Mittal. He received his Ph.D. from the Department of Computer Science at the University of Virginia in the summer of 2020, under the supervision of Mohammad Mahmoody. Prior to UVA, he received his B.Sc. from the Department of Computer Engineering at Sharif University of Technology in the summer of 2015. He spent the summers of 2019 and 2020 as a research intern at Microsoft Research, Redmond.

Host

UMass AI Security