Neural Approaches for Language-Agnostic Search and Recommendation

Friday, 08/19/2022 2:00pm to 4:00pm
PhD Seminar
Speaker: Hamed Bonab


There are significant efforts to develop better neural approaches for information retrieval problems. However, the vast majority of these studies are conducted using English-only data. In fact, trends and statistics show exponential growth in non-English content and users on the Internet, so novel information retrieval systems need to be language-agnostic: they need to bridge the language barrier between users and content, leverage data from high-resource settings for lower-resourced settings, and extend easily to new languages and local markets. To this end, we focus on search and recommendation as two vital components of information systems. We explore some of the complex cross-lingual issues to help develop an understanding of the challenges that someone designing a neural Cross-Lingual Information Retrieval (CLIR) system will need to address.

We first introduce a contrastive analysis framework for simulating low-resource settings using higher-resourced ones. For this, we start with a true low-resource language and systematically down-sample a high-resource language's data so that it becomes an artificial low-resource language resembling the true one. We then turn to neural CLIR approaches for bridging the language gap. We show that neural CLIR models perform sub-optimally because typical Cross-Lingual Embeddings (CLE) "translate" query terms into related terms (i.e., terms that appear in a similar context) rather than into synonyms in the target language. Finally, we study the problem of recommending relevant products to users in relatively resource-scarce markets by leveraging data from similar, resource-richer auxiliary markets. The problems we study in this area are particularly important for the design of language-agnostic systems, since knowledge transfer across languages (and, more broadly, across markets) is central to CLIR systems. We explore different market-adaptation techniques for search and recommendation and propose novel methods that leverage data from high-resource markets to improve performance in emerging markets. Our proposed approaches demonstrate robust effectiveness, consistently improving performance over the competitive baselines selected for our analysis.

Advisor: James Allan

Join via Zoom