Representation Discovery with Multiscale Diffusion Models

Wednesday, 11/12/2008 4:30am to 6:30am
Ph.D. Dissertation Proposal Defense

Chang Wang

Computer Science Building, Room 151

Representations play a major role in intelligent systems. My thesis studies the problem of representation discovery: how to construct a basis so that the new representation of the data is well suited to the given task and the geometry of the data space.

My thesis makes three contributions:

The first contribution is a novel multiscale dimensionality reduction approach: multiscale diffusion projections. This approach learns basis functions that span the original problem space at multiple scales and can automatically map the data instances to lower-dimensional spaces while preserving the relationships inherent in the data. It also offers the following advantages over state-of-the-art methods: it provides multiscale analysis; it computes basis functions that have local support; it can handle non-symmetric relationships; and it is fast to compute.

The second contribution of my thesis is a new manifold alignment approach that leverages basis functions to transfer knowledge between two seemingly quite different domains. Unlike existing work in this area, our manifold alignment is done at multiple scales. Applications of this approach include cross-lingual information retrieval and transfer learning in Markov decision processes.

The third contribution of my thesis is an application of the multiscale diffusion model to the text domain: a novel approach to extracting hierarchical topics from a given corpus of text documents. Compared to other approaches in the field, our method is parameter-free and automatically computes the topic hierarchy and the topics at each level. Applications to a number of other areas, including computer vision, graph data mining, and bioinformatics, will also be investigated.