
Towards Robust Long-form Text Generation Systems

Wednesday, 08/09/2023 1:00pm to 3:00pm
Hybrid - CS 203 & Zoom
PhD Thesis Defense
Speaker: Kalpesh Krishna

Long-form text generation is an emerging AI technology that has seen extensive recent interest due to the success of ChatGPT. However, several key challenges need to be addressed before long-form text generation systems can be practically deployed at scale. In this talk, I will discuss our solutions to three problems in large language models: 1) model outputs are often inconsistent with their inputs; 2) evaluating long-form generated text is difficult; 3) identifying long-form text generated by large language models is difficult.

To address the first issue, I will describe our model RankGen, a 1.2B-parameter encoder trained with large-scale self-supervised contrastive learning on documents. RankGen significantly outperforms competing long-form text generation methods in both automatic and human evaluations, generating text more faithful to the input (issue #1). Next, I will describe our efforts to improve human evaluation of long-form generation (issue #2) through the LongEval guidelines: a set of three simple, empirically motivated ideas that make human evaluation of long-form generation more consistent, less expensive, and cognitively easier for evaluators. Finally, I will talk about our recent work on AI-generated text detection (issue #3), showcasing the brittleness of current detectors against paraphrasing attacks we designed. I will then describe a simple new AI-generated text detection algorithm based on information retrieval, which is significantly more robust to paraphrasing attacks.
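The retrieval-based detection idea can be illustrated with a minimal sketch: the generation API records every output it produces, and a candidate text is flagged as AI-generated if it retrieves a highly similar stored generation. This is an assumption-laden toy, not the actual system from the talk; real deployments would use learned semantic embeddings and an approximate nearest-neighbor index, whereas here character n-gram vectors and an arbitrary similarity threshold stand in for both.

```python
# Illustrative sketch of retrieval-based AI-generated text detection.
# Assumptions (not from the talk): character trigram vectors as a stand-in
# for semantic embeddings, and a hand-picked cosine-similarity threshold.
import math
from collections import Counter


def ngram_vector(text: str, n: int = 3) -> Counter:
    """Character n-gram counts as a sparse vector (toy 'embedding')."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
        math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


class RetrievalDetector:
    """Flags text as AI-generated if it retrieves a similar stored generation."""

    def __init__(self, threshold: float = 0.7):
        self.store = []          # vectors of every generation the model produced
        self.threshold = threshold

    def record_generation(self, text: str) -> None:
        """Called by the generation API each time it returns an output."""
        self.store.append(ngram_vector(text))

    def is_ai_generated(self, candidate: str) -> bool:
        """Retrieve against the store; a close match implies AI origin."""
        vec = ngram_vector(candidate)
        return any(cosine(vec, v) >= self.threshold for v in self.store)
```

The key property motivating the approach is visible even in this toy: a light paraphrase of a stored generation still retrieves it with high similarity, so paraphrasing attacks that fool classifier-based detectors are much less effective.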

I will end the talk by discussing a few future directions in long-form generation that I am excited about, including plan-based long-form text generation and dissecting large language model training dynamics.

Advisor: Mohit Iyyer
