Learning From Limited Labeled Data for Visual Recognition

Tuesday, 01/26/2021 10:00am to 12:00pm

Zoom Meeting

PhD Dissertation Proposal Defense

Zoom Meeting: https://umass-amherst.zoom.us/j/91447097631
Meeting ID: 914 4709 7631

Recent advances in computer vision are in part due to the widespread use of deep neural networks. However, training deep networks requires enormous amounts of labeled data, which can be a bottleneck. In this thesis, we propose several approaches to mitigate this bottleneck in the context of modern deep networks and computer vision tasks.

While transfer learning is an effective strategy for natural image tasks, where large labeled datasets such as ImageNet are available, it is less effective for distant domains such as medical images and 3D shapes. The first chapter focuses on transfer learning from natural image representations to other modalities. In many cases, cross-modal data can be generated using computer graphics techniques. By enforcing agreement between predictions across modalities, we show that models become more robust to image degradations, such as low-resolution, grayscale, or line-drawing inputs in place of high-resolution color images. Similarly, we show that 3D shape classifiers learned from multi-view images can be transferred to models that operate on voxel or point cloud representations.
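As a rough illustration of what enforcing agreement between predictions across modalities could look like, the sketch below penalizes a student model (seeing a degraded input) for disagreeing with a teacher model (seeing the high-quality input). The abstract does not specify the loss; the choice of KL divergence, the temperature parameter, and the function names `softmax` and `agreement_loss` are assumptions made for this example only.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def agreement_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between the two predicted distributions.

    Assumption: the teacher sees the original modality (e.g. a
    high-resolution color image) and the student sees the degraded
    one (e.g. a low-resolution or grayscale version).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical predictions incur no penalty; conflicting ones are penalized.
same = agreement_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = agreement_loss([3.0, 0.0, 0.0], [0.0, 0.0, 3.0])
```

In a setup like this, the agreement term would typically be added to the usual classification loss on the labeled data, so the student is trained both to be correct and to match the teacher.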

Another line of work has focused on techniques for few-shot learning. In particular, meta-learning approaches explicitly aim to generalize representations by emphasizing transferability to novel tasks. In the second chapter, we analyze how to improve these techniques by exploiting unlabeled data from related tasks. We show that performance on novel tasks can be boosted by combining unsupervised objectives with meta-learning objectives. However, we find that small amounts of domain-specific data can be more beneficial than large amounts of generic data.

While transfer learning, unsupervised learning, and few-shot learning have been studied in isolation, in practice one often finds that transfer learning from large labeled datasets is more effective. This is in part due to a lack of evaluation on benchmarks that contain challenges such as class imbalance and domain mismatch. In ongoing work, we explore the role of expert models in the context of semi-supervised learning on a realistic benchmark. Unlike existing semi-supervised benchmarks, our dataset is designed to expose some of the challenges encountered in a realistic setting, such as fine-grained similarity between classes, significant class imbalance, and domain mismatch between the labeled and unlabeled data. We show that recently proposed semi-supervised methods provide significant benefits when deep networks are trained from scratch, yet their performance pales in comparison to a transfer learning baseline. Furthermore, in the transfer setting, while existing semi-supervised methods provide improvements, the presence of out-of-class data is often detrimental.

Committee Members:
Subhransu Maji (Chair)
Erik Learned-Miller
Rui Wang
Bharath Hariharan (Outside member)
