The Quest for Knowledge: Question Answering Beyond Knowledge Bases and Texts

Tuesday, 02/27/2018 2:00pm to 3:00pm
Computer Science Building, Room 151
Speaker: Huan Sun

Abstract: The quest for knowledge using question answering (QA) systems can be traced back to the early 1960s. Two major paradigms -- structured knowledge-based and unstructured text-based -- have been widely studied. There are two observations on the status quo: (1) The big data age is known for its flood of large-scale heterogeneous information sources. In addition to knowledge bases and texts, QA paradigms based on other information sources should also be explored. (2) What current automated frameworks can achieve is still very limited in many scenarios, which necessitates involving humans in the loop for question answering. In this talk, Huan will present work that broadens QA research beyond the two major paradigms.

She will discuss three perspectives and particularly focus on the second one: (1) Table-based QA paradigm. Owing to the prevalence of semi-structured tables, we explored them for factoid question answering and developed an effective framework, based on deep neural networks, that can precisely identify the table cells that answer a question. (2) Computer programming related QA. Beyond factoid QA, we investigate domain-specific QA that seeks procedural knowledge on how to do something. We selected computer programming as the domain of interest, given the growing importance of computer science education and programming skills. She will discuss her recent work along this direction, which contributed a systematic framework to mine large-scale, high-quality <natural language question, code solution> pairs from Stack Overflow. Such question-code pairs are of fundamental importance for model development when using machine learning to automate tasks like code retrieval and code generation. (3) Involving humans in the loop. We made the first efforts to quantitatively analyze expert routing behaviors, i.e., how an expert decides where to transfer a question when she cannot solve it. Our ongoing work further introduces a reinforcement learning framework for interactive QA, which allows a machine to actively and iteratively query humans based on its current understanding of a question, and optimizes the final QA accuracy without querying humans too often.

She will conclude the talk by discussing recently funded projects and longer-term plans, including building intelligent systems in domains such as healthcare, education, and business, and investigating collaboration between humans and machines in text understanding tasks.

Bio: Huan Sun has been an assistant professor in the Department of Computer Science and Engineering at the Ohio State University since Autumn 2016. She was a visiting scientist at the University of Washington from January to June 2016, and received a Ph.D. in Computer Science from the University of California, Santa Barbara (2015) and a B.S. in EEIS from the University of Science and Technology of China (2010). Her research interests lie in data mining and machine learning, with emphasis on question answering, text mining and understanding, deep learning, network analysis, and human behavior understanding. Her research has been funded recently by the Army Research Office, the Patient-Centered Outcomes Research Institute, and Fujitsu Laboratories of America. Huan received the SIGKDD Ph.D. Dissertation Runner-Up Award (2016), was named an MIT EECS Rising Star (2015), and received the Outstanding Dissertation Award from UCSB CS (2015) and the UC Regents' Special Fellowship (2010, 2014).

A reception for attendees will be held at 1:30 p.m. in CS 150.

Faculty Host