
Extracting Token-level Semantic Matching from Text-pair Classification

Friday, 05/10/2024 4:00pm to 6:00pm
CS 203
PhD Seminar
Speaker: Youngwoo Kim

Text-pair classification tasks, such as natural language inference (NLI) and information retrieval (IR), require predicting whether the information specified in one text (query or hypothesis) is supported by the other text (document or premise). Recent solutions using Transformer architectures effectively address these tasks without explicitly aligning tokens between the two texts. Our work aims to extract token-level matching information that can represent the rationales behind text-pair classification decisions.

We propose a novel strategy for applying Transformers to both NLI and IR tasks. Instead of encoding the entire text pair with a single Transformer pass, we partition the input texts, encode each partition with a Transformer, and predict an intermediate decision for each partition. This approach allows us to extract fine-grained token-level semantics and alignment rationales.
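
A minimal sketch of how such a partition-then-aggregate scorer could look, assuming a standard Hugging Face encoder, a simple fixed-window partitioner, and max-pooling over partition decisions (all illustrative choices rather than the actual model presented in the talk):

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-uncased"  # placeholder encoder, not the talk's model
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
    model.eval()

    def partition(text, window=20):
        # One simple partitioning choice: fixed-size word windows.
        words = text.split()
        return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]

    def score_pair(query, document):
        # Encode each (query, partition) pair independently; each yields an
        # intermediate decision, which is then aggregated into a final score.
        parts = partition(document)
        inputs = tokenizer([query] * len(parts), parts,
                           padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        part_scores = logits.softmax(dim=-1)[:, 1]  # per-partition "match" probability
        return part_scores, part_scores.max()       # max-pooling as one aggregation option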

In the first part of this seminar, we focus on deriving interpretability from neural NLI models. We propose a sequence labeling task called Conditional-NLI (Cond-NLI) to capture token-level semantic understanding in NLI. The goal is to identify tokens that indicate opposite outcomes and different conditions in apparently contradictory claim pairs from biomedical articles. We introduce a model that applies our partitioning strategy, demonstrating its effectiveness on this task.
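
To make the labeling format concrete, here is a toy illustration of Cond-NLI output for a contradictory claim pair, with tokens tagged as contributing to the opposite outcome or to a differing condition (the claims and labels below are invented for illustration, not taken from the dataset):

    # Invented claim pair and labels, only to illustrate the sequence-labeling format.
    claim_a = "Aspirin lowered blood pressure in patients with diabetes"
    claim_b_tokens = "Aspirin raised blood pressure in patients without diabetes".split()

    # One label per token of claim_b: O = neither, OUTCOME = opposite outcome,
    # CONDITION = differing condition.
    labels_b = ["O", "OUTCOME", "O", "O", "O", "O", "CONDITION", "CONDITION"]

    for token, label in zip(claim_b_tokens, labels_b):
        print(f"{token:12s} {label}")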


The second part targets the ad-hoc retrieval task, specifically explaining the mechanism behind query-document relevance scoring functions. We provide global explanations for neural ranking models by representing their semantic matching behavior as a "relevance thesaurus" containing semantically related query-term and document-term pairs. We employ the same partitioning strategy in our proposed model to extract these term pairs. The resulting thesaurus can reveal corpus-specific features and biases, supporting the utility of our explanation method.
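
A toy fragment showing what such a relevance thesaurus might contain and how it could be read as a soft extension of exact term matching (the term pairs and weights are made up for illustration):

    # Made-up query-term/document-term pairs with association scores.
    relevance_thesaurus = {
        ("car", "vehicle"): 0.91,
        ("cancer", "tumor"): 0.88,
        ("uk", "britain"): 0.84,
    }

    def soft_match_score(query_terms, doc_terms):
        # Toy scoring: exact matches count fully; thesaurus pairs add their weight.
        score = 0.0
        for q in query_terms:
            for d in doc_terms:
                if q == d:
                    score += 1.0
                else:
                    score += relevance_thesaurus.get((q, d), 0.0)
        return score

    print(soft_match_score(["car", "price"], ["vehicle", "price", "listing"]))
    # -> 1.91: exact match on "price" plus the car/vehicle thesaurus entry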

Advisor: James Allan