Higher-order Representations for Visual Recognition

13 Feb
Wednesday, 02/13/2019 4:00pm to 6:00pm
CS 140
Ph.D. Dissertation Proposal Defense
Speaker: Tsung-Yu Lin


Orderless feature aggregation with nonlinear encodings such as the Fisher Vector representation has been shown to be effective with hand-crafted local image features in various image recognition tasks. These encodings capture higher-order statistics of a set of feature activations; however, the feature descriptors are not learned to optimize the end task. In this thesis, we present simple and effective encoding models called Bilinear Convolutional Neural Networks (B-CNNs), which capture the correlations between the activations of feature descriptors derived from CNNs. The models belong to the class of orderless texture representations, but unlike prior work, they can be trained end-to-end. The models outperformed the previous state of the art on fine-grained and texture recognition.
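As a rough illustration of the orderless, second-order pooling described above, the following minimal sketch averages outer products of two sets of local descriptors over all image locations. It is not the speaker's implementation: the function name `bilinear_pool`, the array shapes, and the signed square-root plus L2 normalization steps are assumptions made for this example, and plain NumPy arrays stand in for CNN activations.

```python
import numpy as np

def bilinear_pool(features_a, features_b):
    """Pool two sets of local descriptors into one orderless vector.

    features_a: (N, Da) array, one descriptor per image location.
    features_b: (N, Db) array, descriptors at the same N locations.
    Returns a vector of length Da * Db.
    (Hypothetical helper; names and shapes are assumptions.)
    """
    n = features_a.shape[0]
    # Average of the outer products a_i b_i^T over all locations.
    # Summing over locations discards their order and position.
    pooled = features_a.T @ features_b / n        # shape (Da, Db)
    vec = pooled.reshape(-1)
    # Assumed post-processing: signed square root, then L2 normalization.
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```

Because the pooling is a sum over locations, shuffling the rows of both inputs with the same permutation leaves the output unchanged, which is what "orderless" means here.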

To understand these models, we visualize the convolutional filters and the classifiers of the fine-tuned networks. Visualizing the top-activating patches for the learned CNN filters demonstrates that the models capture highly localized attributes. At the classifier level, we visualize the invariances of these models by inverting the representations to produce preimages, which reveal the properties the models capture for a given category.

Finally, we study techniques for rescaling the importance of individual features during aggregation to enhance the discriminative power of the representations. Spectral normalization scales the spectrum of the covariance matrix obtained after bilinear pooling and offers a significant improvement. A second approach, democratic aggregation, achieves a comparable improvement; because the aggregation can be approximated in a low-dimensional embedding, it is well suited to aggregating higher-dimensional features. We demonstrate that the two approaches are closely related, and we discuss the trade-off between performance and efficiency.
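To make the idea of scaling the spectrum concrete, here is a minimal sketch that raises the eigenvalues of the pooled second-order matrix to a power. The abstract does not state which scaling is used; the exponent of 0.5 (a matrix square root), the function name `spectral_normalize`, and the use of an uncentered second-moment matrix in place of a covariance matrix are all assumptions for this example.

```python
import numpy as np

def spectral_normalize(features, power=0.5):
    """Rescale the spectrum of the pooled second-order matrix.

    features: (N, D) array of local descriptors.
    power: exponent applied to each eigenvalue (0.5 is an assumed choice).
    Returns a (D, D) symmetric matrix.
    (Hypothetical helper; not taken from the thesis.)
    """
    n = features.shape[0]
    # Bilinear pooling of the features with themselves gives a
    # symmetric positive semi-definite matrix.
    pooled = features.T @ features / n
    # eigh is the eigendecomposition routine for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(pooled)
    # Clip tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    # Rebuild the matrix with the rescaled spectrum: V diag(s^p) V^T.
    return (eigvecs * eigvals ** power) @ eigvecs.T
```

With an exponent below 1, large eigenvalues are compressed relative to small ones, so a few dominant feature directions contribute less to the final representation.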

Advisor: Subhransu Maji