
CS Theory Seminar: Why to Learn Diffusion Models in the Latent Space

Tuesday, 04/16/2024 1:00pm to 2:00pm
Lederle Graduate Research Center, Room A104A
Theory Seminar

Abstract: Prior work demonstrates that diffusion- and flow-based generative models are effective at generating high-dimensional objects, such as high-resolution images. Furthermore, specific models, such as score-based generative models, have been shown to produce high-quality samples when trained in the latent space of the data distribution. Learning in the latent space has a few obvious advantages. First, we get to work with smaller matrices and neural networks, reducing the computational cost of the forward and backward passes and leading to faster training and faster sampling. Second, the target function we are learning has fewer components, further reducing training time. The downside is that, in general, the dimension reduction is lossy whenever the dimensions we remove are not just noise. However, it is observed that latent-space models produce strictly better samples, even when the score in the original data dimension is trained for significantly longer to account for the additional complexity.
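
As a concrete illustration of the setup described above, here is a minimal, hypothetical sketch (not code from the talk) of training a score model with denoising score matching on encoded data, so that every forward and backward pass operates on the smaller latent dimension. The encoder, network sizes, and noise level are placeholder assumptions.

import torch
import torch.nn as nn

latent_dim, data_dim = 16, 784               # hypothetical dimensions
encoder = nn.Linear(data_dim, latent_dim)    # stand-in for a learned encoder
score_net = nn.Sequential(                   # score model acts on the smaller latent space
    nn.Linear(latent_dim + 1, 128), nn.SiLU(), nn.Linear(128, latent_dim)
)
opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)

def dsm_loss(x, sigma=0.5):
    """Denoising score matching in the latent space: perturb z = E(x) with
    Gaussian noise and regress the score of the perturbation kernel."""
    z = encoder(x)                                   # work in the latent space
    noise = torch.randn_like(z)
    z_noisy = z + sigma * noise
    t = torch.full((z.shape[0], 1), sigma)           # condition on the noise level
    pred = score_net(torch.cat([z_noisy, t], dim=1))
    target = -noise / sigma                          # score of N(z, sigma^2 I) at z_noisy
    return ((pred - target) ** 2).sum(dim=1).mean()

x = torch.randn(64, data_dim)                        # placeholder data batch
loss = dsm_loss(x)
opt.zero_grad(); loss.backward(); opt.step()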

Our work utilizes connections between score-based generative models and mean-field games to derive a novel justification showing that the latent-space score still learns the true score of the data distribution. We use the PDEs from the mean-field game to mathematically justify why learning the score in the latent space greatly outperforms learning it in the original data dimension.
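
For reference, one standard way to write the coupled PDE system characterizing a mean-field game equilibrium is the generic second-order form below; the specific system analyzed in this work is not given in the abstract, so this is only a sketch of the kind of PDEs involved:

\begin{aligned}
-\partial_t u - \nu \Delta u + H(x, \nabla u) &= f(x, \rho), \\
\partial_t \rho - \nu \Delta \rho - \nabla \cdot \big(\rho \, \nabla_p H(x, \nabla u)\big) &= 0, \\
\rho(\cdot, 0) = \rho_0, \qquad u(\cdot, T) &= g(\cdot, \rho(\cdot, T)),
\end{aligned}

where a backward Hamilton-Jacobi-Bellman equation for the value function $u$ is coupled to a forward Fokker-Planck equation for the density $\rho$; in the correspondence with score-based generative models, the score of the evolving density is $\nabla \log \rho$.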

Bio: Ben Burns is an undergraduate math and CS student at UMass Amherst, advised by Markos Katsoulakis and Benjamin Zhang. His research interests include mathematical machine learning, applications of probability, and computational complexity.