Data to Science With AI and Human-in-the-Loop

Tuesday, 04/04/2023 1:00pm to 3:00pm
CS 203
PhD Dissertation Proposal Defense
Speaker: Gustavo Perez

Recent advances in computer vision using deep neural networks have led to effective solutions for real-world problems in fields such as biology, earth science, healthcare, and astronomy. In particular, transfer learning from deep networks pre-trained on large-scale color image datasets has made it possible to train powerful visual representation extractors with limited labeled data in these disciplines. However, the gap between novel domains and natural color images can limit the impact of transfer learning. This thesis addresses the problem of representation learning for novel domains with limited data from three different perspectives.


First, we address the case where transfer learning is potentially useful but the structure of the data prevents direct use of networks pre-trained on color images. We design lightweight domain adapters that can be plugged in before the pre-trained network to make it compatible with the new domain, and demonstrate them on hyperspectral image classification and on detecting roosting birds in radar imagery.

Second, we address the case where transfer learning is ineffective because the target domain is too different, but a large amount of unlabeled data is available. We focus on the astronomy domain and develop self-supervised learning approaches to construct star cluster catalogs from high-resolution images of galaxies taken by space telescopes.

Third, we aim to develop a framework for efficiently vetting data labeled by an ML system, in order to reduce the cost of deriving scientific conclusions from data and to handle domain shift between training and test data. The framework will be based on importance sampling and Gaussian processes and will be used to analyze spatio-temporal patterns of roosting birds.


Advisor: Subhransu Maji