
Extracting and Representing Entities, Types and Relations

Monday, 07/01/2019 1:00pm to 3:30pm
CS 150/151
PhD Thesis Defense
Speaker: Patrick Verga


Complex decisions in areas like science, government policy, finance, and clinical treatment all require integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another. Knowledge graphs (KGs) provide a natural approach for addressing this type of problem: they can serve as long-term stores of abstracted knowledge organized around concepts and their relationships, and can be populated from heterogeneous sources including databases and text. KGs can facilitate higher-level reasoning, influence the interpretation of new data, and serve as scaffolding for knowledge that enhances the acquisition of new information. A symbolic graph over a fixed, human-defined schema encoding facts about entities and their relations is the predominant method for representing knowledge, but this approach is brittle, lacks specificity, and is inevitably highly incomplete. At the other extreme, recent work on purely text-based knowledge models lacks the abstractions necessary for complex reasoning.

In this thesis, I will present work incorporating neural models, rich structured ontologies, and unstructured raw text for representing knowledge. I will first discuss my work enhancing universal schema, a method for learning a latent schema over both existing structured resources and unstructured free text, embedding them jointly within a shared semantic space. Next, I inject additional hierarchical structure into the embedding space of concepts, resulting in more efficient statistical sharing among related concepts and improved accuracy in both fine-grained entity typing and linking. I then present initial work representing knowledge in context, including a single model for extracting all entities and long-range relations simultaneously over full paragraphs while jointly linking these entities to a KG.

Advisor: Andrew McCallum