
Unlocking Natural Language Generalization through Adaptive Retrieval-Based Methods

Thursday, 02/15/2024 1:00pm to 3:00pm
Hybrid - CS 203 & Zoom
PhD Thesis Defense
Speaker: Andrew Drozdov

Progress in large language model (LLM) training and inference has contributed to the emergence of "generative retrieval" as a new sub-topic in the field of artificial intelligence. Generative retrieval encapsulates a family of methods that leverage the strengths of generative language models for information retrieval applications. These methods are particularly useful when embedded in natural language interfaces such as conversational chatbots, which span an extremely diverse set of fine-grained tasks and require models both to adapt quickly and to generate fluent, relevant responses. In this dissertation, I propose four general methods to further advance the capabilities of generative retrieval systems:

1) I introduce a method for effective adaptation of large language models for retrieval through in-context learning. This technique leverages task-specific demonstrations to quickly learn to rank candidate passages. The criterion for demonstration selection is "demonstration difficulty," inspired by gradient-based learning, where difficult and informative data points often lead to higher-magnitude gradients (a hedged sketch of this selection step appears after this list).

2) Generative retrieval enables a massive variety of tasks, including retrieval over structured data. Inspired by previous methods for learning compositional structure with recursive computation, I develop an extension of least-to-most prompting that dynamically selects demonstrations to cover the many aspects of the input query (sketched after this list). This approach leads to state-of-the-art results on a challenging compositional generalization benchmark that translates text into a SQL-like query language.

3) Retrieving relevant documents from an external datastore is an effective way for language models to ground their predictions externally rather than relying solely on their internal memory. I design an adaptive algorithm that discards distracting or irrelevant documents and more heavily weights the influence of relevant text (sketched after this list). This more precise use of the datastore leads to state-of-the-art performance on a language modeling benchmark for generating encyclopedic text.

4) During retrieval-augmented generation (RAG), many of the generated atomic facts pertain only to a subset of the retrieved passages, leading to inefficient use of the limited prompt context. I introduce a Retrieval-Driven Memory Manager (ReDMM) for RAG that adaptively selects which passages to include at each step of generation, bypassing context length limits (sketched after this list). ReDMM is particularly helpful for generating complex answers, as measured by a suite of benchmarks for long-form question answering.
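
The following is a minimal, hypothetical sketch of the difficulty-based demonstration selection described in (1): demonstrations whose gold label the model assigns low likelihood are treated as harder and therefore more informative, and the hardest ones are kept as in-context examples. The names Demonstration, select_demonstrations, and score_fn are illustrative assumptions, not the dissertation's actual code, and the toy scorer stands in for a real LLM call.

    # Hypothetical sketch: difficulty-based demonstration selection for
    # in-context passage ranking. Names are illustrative, not from the thesis.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Demonstration:
        query: str
        passage: str
        label: str  # e.g. "relevant" / "not relevant"

    def select_demonstrations(pool: List[Demonstration],
                              score_fn: Callable[[Demonstration], float],
                              k: int = 4) -> List[Demonstration]:
        """Keep the k most difficult demonstrations.

        score_fn is assumed to return the model's likelihood of the gold
        label; a low likelihood marks a hard, informative example (the
        analogue of a high-magnitude gradient in gradient-based learning).
        """
        return sorted(pool, key=score_fn)[:k]

    # Toy usage with a stand-in scorer; a real scorer would query the LLM.
    pool = [
        Demonstration("capital of France?", "Paris is the capital of France.", "relevant"),
        Demonstration("capital of France?", "The Eiffel Tower is tall.", "not relevant"),
    ]
    scores = {id(d): s for d, s in zip(pool, [0.9, 0.3])}
    hard_demos = select_demonstrations(pool, lambda d: scores[id(d)], k=1)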
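
For (2), a similarly hedged sketch of coverage-driven exemplar selection, under the assumption that both the input query and each candidate demonstration can be decomposed into a set of elements (for example, predicates or phrases); coverage_select and its arguments are made-up names, not the method as published. The greedy loop repeatedly adds the demonstration that covers the most still-uncovered elements of the query.

    # Hypothetical sketch: greedy, coverage-based demonstration selection in
    # the spirit of a dynamic least-to-most prompting step. Names are assumptions.
    from typing import Dict, List, Set

    def coverage_select(query_elements: Set[str],
                        demo_elements: Dict[str, Set[str]],
                        max_demos: int = 4) -> List[str]:
        """Greedy set cover: keep adding the demonstration that covers the
        largest number of still-uncovered query elements."""
        uncovered = set(query_elements)
        chosen: List[str] = []
        while uncovered and len(chosen) < max_demos:
            best = max(demo_elements, key=lambda d: len(demo_elements[d] & uncovered))
            if not demo_elements[best] & uncovered:
                break  # nothing left that helps
            chosen.append(best)
            uncovered -= demo_elements[best]
        return chosen

    # Toy usage: the query involves two predicates; demo_b alone covers both.
    picked = coverage_select({"filter", "count"},
                             {"demo_a": {"filter"}, "demo_b": {"filter", "count"}})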
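
For (3), a hedged sketch of one way the adaptive reweighting could look, assuming each retrieved passage already comes with a relevance score: passages below a threshold are discarded, and a low-temperature softmax concentrates weight on the most relevant remainder. adaptive_weights, threshold, and temperature are assumptions for illustration, not the thesis's actual formulation.

    # Hypothetical sketch: discard low-relevance passages and sharpen the
    # weights of the rest. Names and the softmax form are assumptions.
    import math
    from typing import Dict, List, Tuple

    def adaptive_weights(scored_passages: List[Tuple[str, float]],
                         threshold: float = 0.5,
                         temperature: float = 0.1) -> Dict[str, float]:
        """Drop passages scored below `threshold`, then softmax the survivors;
        a small temperature pushes most of the weight onto relevant text."""
        kept = [(p, s) for p, s in scored_passages if s >= threshold]
        if not kept:
            return {}  # fall back to the model's internal memory alone
        exps = [math.exp(s / temperature) for _, s in kept]
        total = sum(exps)
        return {p: e / total for (p, _), e in zip(kept, exps)}

    # Toy usage: the distracting passage is discarded entirely.
    weights = adaptive_weights([("relevant passage", 0.9),
                                ("distracting passage", 0.2)])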
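
Finally, a hedged sketch of the kind of per-step passage selection that a retrieval-driven memory manager such as ReDMM in (4) might perform; the MemoryManager class, its select method, the word-count budget, and the relevance callback are all illustrative assumptions rather than the dissertation's implementation.

    # Hypothetical sketch: before each generation step, re-rank the retrieved
    # passages against the text generated so far and keep only what fits a
    # fixed prompt budget. Names and the budget heuristic are assumptions.
    from typing import Callable, List

    class MemoryManager:
        def __init__(self, passages: List[str], budget_tokens: int = 512):
            self.passages = passages
            self.budget = budget_tokens

        def select(self, generated_so_far: str,
                   relevance: Callable[[str, str], float]) -> List[str]:
            """Return the highest-relevance passages that fit the budget."""
            ranked = sorted(self.passages,
                            key=lambda p: relevance(generated_so_far, p),
                            reverse=True)
            selected, used = [], 0
            for p in ranked:
                cost = len(p.split())  # crude stand-in for a tokenizer
                if used + cost <= self.budget:
                    selected.append(p)
                    used += cost
            return selected

    # Toy usage: passages overlapping the draft answer are preferred.
    manager = MemoryManager(["solar panels convert sunlight", "unrelated text"])
    step_passages = manager.select(
        "Solar panels work by converting sunlight",
        relevance=lambda draft, p: len(set(draft.lower().split()) & set(p.lower().split())),
    )

Selecting passages anew at each step, rather than once up front, is what would let the prompt stay within the context limit while still drawing on a larger pool of retrieved evidence.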

I demonstrate that these methods address limitations of previous generative retrieval systems and provide a path forward for more effective language model use.
 

Advisors: Mohit Iyyer & Andrew McCallum

Join via Zoom