
Deep Neural Networks for 3D Processing and High-Dimensional Filtering

Monday, 05/06/2019 2:00pm to 4:00pm
PhD Dissertation Proposal Defense
Speaker: Hang Su


Deep neural networks (DNNs) have seen tremendous success in the past few years, advancing the state of the art in many AI areas by significant margins. Part of this success can be attributed to the wide adoption of convolutional filters. These filters effectively capture invariances in the data, leading to easier training and more compact representations, and at the same time can leverage extremely efficient implementations on modern hardware. Since convolution operates on regularly structured grids, it is a particularly good fit for text and images, which have inherent 1D or 2D structure. However, extending DNNs to 3D or higher-dimensional spaces is non-trivial, because data in such spaces often lack regular structure and the curse of dimensionality can adversely impact performance in multiple ways.

In this thesis, we present several new types of neural network operations and architectures for 3D and higher-dimensional spaces and demonstrate how we can mitigate these issues while retaining the favorable properties of standard convolution. First, we investigate view-based representations for 3D shape recognition. We show that a collection of 2D views can be highly informative, and we can adapt standard 2D DNNs with a simple pooling strategy to recognize objects based on their appearances from multiple viewpoints with unprecedented accuracy. Next, we make a connection between 3D point cloud processing and sparse high-dimensional filtering. The resulting representation is highly efficient and flexible, and allows native 3D operations as well as joint 2D-3D reasoning. Finally, we show that high-dimensional filtering is also a powerful tool for content-adaptive image filtering and demonstrate different scenarios where DNNs can incorporate such operations for computer vision applications, including joint upsampling and semantic segmentation.
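
To illustrate the "simple pooling strategy" for multi-view recognition mentioned above, here is a minimal sketch. It assumes element-wise max pooling over per-view feature vectors; the abstract does not say which pooling operator is used, and the function name `pool_views` and the toy feature values are hypothetical.

```python
from typing import List, Sequence


def pool_views(view_features: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-view feature vectors into a single shape descriptor.

    Each inner sequence is assumed to be the feature vector that a 2D
    network produced for one rendered view of the same 3D shape.
    """
    if not view_features:
        raise ValueError("need at least one view")
    dim = len(view_features[0])
    if any(len(v) != dim for v in view_features):
        raise ValueError("all views must have the same feature dimension")
    # Element-wise max across views (assumed operator, not confirmed by
    # the abstract): keep the strongest response to each feature.
    return [max(column) for column in zip(*view_features)]


# Two hypothetical views with three features each.
descriptor = pool_views([[0.1, 0.9, 0.3],
                         [0.7, 0.2, 0.4]])
```

Because the maximum is a symmetric function, the pooled descriptor in this sketch does not depend on the order in which the views are supplied.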

Advisor: Erik Learned-Miller