Graph Representation Learning with Box Embeddings

Wednesday, 07/27/2022 1:00pm to 3:00pm
PhD Dissertation Proposal Defense
Speaker: Dongxu Zhang


Graphs are a ubiquitous data structure in machine learning, appearing in tasks such as link prediction over products and node classification of scientific papers. Because gradient descent drives the training of most modern machine learning architectures, using graph-structured data requires encoding it in a differentiable representation. Most approaches encode graph structure in Euclidean space, where modeling directed edges is non-trivial. The naive solution is to represent each node with separate "source" and "target" vectors; however, this can decouple the representation, making it harder for the model to capture information along longer paths in the graph.

In this dissertation, we propose to model graphs by representing each node as a box (a Cartesian product of intervals), where directed edges are captured by the relative containment of one box in another. We prove that the proposed box embeddings are expressive enough to represent any directed acyclic graph. Extensive experimental results suggest that box containment allows transitive relationships to be modeled easily. In addition, we extend box embeddings to represent any cyclic graph by factorizing the graph into multiple acyclic subgraphs and embedding them in different box spaces.
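The containment idea can be illustrated with a minimal sketch. It is not the dissertation's implementation: the `Box` class, the `containment_score` function, the example node names, and the convention that an edge u -> v holds when the box for v lies inside the box for u are all assumptions made here for illustration. The score is the fraction of one box's volume that falls inside another, so full containment gives 1.0.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box: a Cartesian product of intervals [lo[i], hi[i]]."""
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]

    def volume(self) -> float:
        # Product of side lengths; an empty interval contributes zero.
        v = 1.0
        for a, b in zip(self.lo, self.hi):
            v *= max(b - a, 0.0)
        return v


def intersection(a: Box, b: Box) -> Box:
    """Intersection of two boxes (may be empty, i.e. have zero volume)."""
    lo = tuple(max(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(min(x, y) for x, y in zip(a.hi, b.hi))
    return Box(lo, hi)


def containment_score(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s volume lying inside `outer`, in [0, 1]."""
    v = inner.volume()
    return intersection(inner, outer).volume() / v if v > 0 else 0.0


# Hypothetical three-node chain: animal -> dog -> puppy (assumed convention:
# an edge u -> v is represented by box(v) being contained in box(u)).
animal = Box((0.0, 0.0), (4.0, 4.0))
dog = Box((1.0, 1.0), (3.0, 3.0))
puppy = Box((1.5, 1.5), (2.5, 2.5))

print(containment_score(dog, animal))    # dog inside animal
print(containment_score(puppy, animal))  # follows transitively from nesting
print(containment_score(animal, dog))    # reverse direction scores lower
```

Because set containment is transitive, nesting `puppy` in `dog` and `dog` in `animal` places `puppy` in `animal` without any extra constraint, and the score is asymmetric, which is what lets it express edge direction. The hard volumes used here have zero gradient when boxes are disjoint, so a trainable model would need a smoothed volume, which this sketch does not attempt.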

The remaining work of this dissertation aims to generalize box representations to more complicated real-world scenarios. We will cover some of the following topics: modeling attributed graphs, where external node features are projected into box embedding spaces; and using box embeddings as a differentiable latent graph layer for downstream tasks. Furthermore, we are interested in exploring a fundamental research topic that leverages functional analysis for similarity learning, in which the box membership function can be regarded as a special case that enables efficient computation of the function integral.

Advisor: Andrew McCallum


Join via Zoom