
Effective and Efficient Transfer Learning in the Era of Large Language Models

Friday, 07/07/2023 2:00pm to 4:00pm
CS 203 & Zoom
PhD Thesis Defense
Speaker: Tu Vu

Substantial progress has been made in natural language processing (NLP) due to the advent of large language models (LLMs), i.e., deep neural networks with millions or billions of parameters pre-trained on large amounts of unlabeled data. However, these models have several weaknesses, including degraded performance in data-scarce settings, high computational resource requirements, and a tendency to hallucinate facts. These limitations must be addressed before LLMs can be practically applied in resource-constrained settings with limited data and/or compute, and in applications where accurate and up-to-date information is crucial (e.g., dialogue and Q&A systems).

In this talk, I first present two methods that dramatically reduce the need for labeled data in data-scarce settings. The first leverages beneficial relationships between NLP tasks for transfer learning, while the second combines data augmentation and self-training to boost few-shot learning performance (the ability to perform a task from only a few labeled examples). Next, I present a parameter-efficient transfer learning approach called SPoT, which reuses a single frozen model across all tasks while learning only minimal task-specific parameters (soft/continuous prompts) to represent tasks and transfer knowledge; a minimal sketch of the soft-prompt idea appears below. SPoT can match or outperform full fine-tuning of task-specific models (training the whole model on each task) and also confers benefits in cross-lingual transfer settings. I conclude by discussing our recent efforts to democratize LLMs for the broader research community, work in progress on grounding LLMs' responses in factual and up-to-date information to improve factuality, and future research directions that seek to advance NLP through large-scale multi-task learning from multilingual and multimodal data.
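To make the soft-prompt idea concrete, here is a minimal, hypothetical sketch in PyTorch (not the speaker's released code): the pre-trained backbone and its token-embedding layer stay frozen, and the only trainable parameters are a small matrix of prompt vectors prepended to each input. The class name SoftPromptModel, the prompt_length default, and the assumption that the backbone accepts embedding inputs directly are illustrative choices, not part of SPoT itself.

```python
# Minimal sketch of soft prompt tuning (assumptions noted above).
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Wraps a frozen backbone; trains only the prepended prompt vectors."""

    def __init__(self, backbone: nn.Module, embed: nn.Embedding,
                 prompt_length: int = 20):
        super().__init__()
        self.backbone = backbone  # frozen pre-trained LM (hypothetical stub)
        self.embed = embed        # the backbone's frozen token-embedding layer
        for p in self.backbone.parameters():
            p.requires_grad = False
        for p in self.embed.parameters():
            p.requires_grad = False
        # The only task-specific, trainable parameters: one vector per
        # prompt position, living in the backbone's embedding space.
        self.prompt = nn.Parameter(
            0.02 * torch.randn(prompt_length, embed.embedding_dim))

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # Embed the input tokens with the frozen embedding layer: (B, T, D).
        tok = self.embed(input_ids)
        # Prepend the learned prompt to every example in the batch.
        prompt = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        # The backbone is assumed to accept embedding inputs directly.
        return self.backbone(torch.cat([prompt, tok], dim=1))
```

In this framing, the transfer step in SPoT amounts to initializing a target task's prompt from a prompt learned on one or more source tasks, so knowledge moves between tasks through roughly prompt_length × embedding_dim parameters rather than through the full model.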

Advisor: Mohit Iyyer
