Learning Image Representations: Transfer Learning and Latent Variable Models

Tuesday, 03/05/2013 11:00am to 12:00pm

Ariadna Quattoni
Technical University of Catalonia (UPC)

Computer Science Building, Room 151

Faculty Host: Erik Learned-Miller

One of the main goals of high-level computer vision is to generate semantic descriptions of visual content. At a technical level, this involves designing learning algorithms that can map from complex image spaces to semantic categories. To achieve this goal we must be able to induce good image representations from available data; i.e., representations that facilitate the semantic categorization of visual content. In this talk I will introduce two approaches to learning such representations. The first approach is based on transfer learning and exploits the fact that some visual categories might share an underlying "semantic" representation. To implement this idea I will present a learning framework based on joint sparse regularization. With this framework we can efficiently induce shared representations from large collections of labeled images.

In the second part of the talk I will show how we can exploit latent-variable models to learn better mappings between visual content and semantic categories. In particular, I will present structured latent-variable models that can capture underlying spatial and temporal dependencies in visual content. I will describe simple and efficient algorithms for learning these models that scale easily to large datasets. Finally, I will outline future research directions towards the goal of mining large collections of multi-modal corpora, and I will highlight the role that new efficient methods for large-scale learning of structured latent-variable models can play in achieving this goal.

A reception will be held at 3:40 in the atrium, outside the presentation room.