Faculty Recruiting Support CICS

Toward Perception Models Beyond Internet Applications

Friday, 03/15/2024 12:00pm to 1:00pm
Computer Science Building, Room 150/151 or virtual via Zoom
Machine Learning and Friends Lunch

Abstract: Over the past decade, we have observed tremendous success of perception models in various internet applications, ranging from models that allow us to search through large collections of internet images using natural language to models that can effectively segment common objects found in internet images. While impressive, these successes are not easily replicated in other problem domains (e.g., remote sensing, medical imaging) due to various issues, such as limited supervision.

In this talk, I will present my work on developing label-efficient perception models. Specifically, I will describe how we can leverage unlabeled data and models pre-trained on large-scale internet data to build label-efficient models. I will also discuss my recent work on building vision-language models (VLMs) for remote sensing without textual annotations. I will end with a brief discussion of my future work on enabling visual perception in various problem domains.

Bio: Cheng Phoo is a PhD candidate at Cornell University, advised by Bharath Hariharan. Prior to Cornell, he received his bachelor's degree in Computer Science and Pure Mathematics from the University of Michigan, Ann Arbor. His research focuses on building perception models that are broadly useful for different problem domains.