Faculty Recruiting

Building Robust Natural Language Processing Systems

Thursday, 02/20/2020 4:00pm to 5:00pm
Computer Science Building, Room 150/151
Seminar
Speaker: Robin Jia

Abstract: While modern NLP systems have achieved outstanding performance on static benchmarks, they often fail catastrophically when presented with inputs from different sources or inputs that have been adversarially perturbed. This lack of robustness exposes troubling gaps in current models' understanding capabilities and poses challenges for the deployment of NLP systems in high-stakes situations. In this talk, I will demonstrate that building robust NLP systems requires reexamining all aspects of the current model-building paradigm. First, I will show that adversarially constructed test data reveals vulnerabilities that are left unexposed by standard evaluation methods. Second, I will demonstrate that active learning, in which data is adaptively collected based on a model's current predictions, can significantly improve the ability of models to generalize robustly, compared to the use of static training datasets. Finally, I will show how to train NLP models to produce certificates of robustness---guarantees that, for a given example and a combinatorially large class of possible perturbations, no perturbation can cause a misclassification.
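
For readers unfamiliar with the certification setting mentioned above, the sketch below is a minimal point of reference only, not material from the talk: the toy classifier, synonym table, and function names are invented for illustration. It shows the naive brute-force check that certified training methods are designed to avoid, namely enumerating every word-substitution perturbation explicitly, which grows exponentially with sentence length.

```python
from itertools import product

def enumerate_perturbations(tokens, synonyms):
    """Yield every sentence obtainable by swapping each token for one of its
    allowed substitutes (the original token is always kept as an option)."""
    options = [[tok] + synonyms.get(tok, []) for tok in tokens]
    for combo in product(*options):
        yield list(combo)

def brute_force_certificate(tokens, synonyms, predict):
    """Return True only if *every* perturbed sentence keeps the original label.
    The number of perturbations is the product of the per-word option counts,
    which is why certified methods bound model behavior instead of enumerating."""
    original_label = predict(tokens)
    return all(predict(p) == original_label
               for p in enumerate_perturbations(tokens, synonyms))

# Hypothetical toy classifier: labels a review "negative" if it contains
# any word from a small lexicon; purely for illustration.
NEGATIVE_WORDS = {"bad", "awful", "terrible", "dreadful"}
def toy_predict(tokens):
    return "negative" if NEGATIVE_WORDS & set(tokens) else "positive"

synonyms = {"bad": ["awful", "terrible"], "movie": ["film", "flick"]}
sentence = "a bad movie".split()
# True: every allowed substitution still contains a negative word,
# so the prediction cannot be flipped within this perturbation class.
print(brute_force_certificate(sentence, synonyms, toy_predict))
```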

Bio: Robin Jia is a sixth-year Ph.D. student at Stanford University advised by Percy Liang. His research interests lie broadly in building natural language processing systems that can generalize to unexpected test-time inputs. Robin's work has received an Outstanding Paper Award at EMNLP 2017 and a Best Short Paper Award at ACL 2018. He has been supported by an NSF Graduate Research Fellowship.

A reception for attendees will be held at 3:30 p.m. in CS 150.


Faculty Host: