Faculty Recruiting Support CICS

Machine Learning Models for Efficient and Robust Natural Language Processing

01 Jul
Monday, 07/01/2019 9:00am to 12:00pm
CS 150/151
PhD Thesis Defense
Speaker: Emma Strubell


Natural language processing (NLP) has come of age. For example, semantic role labeling (SRL), which automatically annotates sentences with a labeled graph representing who did what to whom, has in the past ten years seen nearly 40% reduction in error, bringing it to useful accuracy. As a result, a myriad of practitioners now want to deploy NLP systems on billions of documents across many domains. However, state-of-the-art NLP systems are typically not optimized for cross-domain robustness nor computational efficiency. In this dissertation I develop machine learning methods to facilitate fast and robust inference across many common NLP tasks.

First, I describe paired learning and inference algorithms for dynamic feature selection which accelerate inference in linear classifiers, the heart of the fastest NLP models, by 5--10 times. I then present iterated dilated convolutional neural networks (ID-CNNs), a distinct combination of network structure, parameter sharing and training procedures that increase inference speed by 14--20 times with accuracy matching bidirectional LSTMs, the most accurate models for NLP sequence labeling. Finally, I describe linguistically-informed self-attention (LISA), a neural network model that combines multi-head self-attention with multi-task learning to facilitate improved generalization to new domains. We show that incorporating linguistic structure in this way leads to substantial improvements over the previous state-of-the-art (syntax-free) neural network models for SRL, especially when evaluating out-of-domain. I conclude with a brief discussion of potential future directions stemming from my thesis work.

Advisor: Andrew McCallum