Human Language Technologies

Human Language Technologies refers to systems that understand, represent, analyze, and search archives and streams of written and spoken language. We can now talk to our phones to instantly search billions of webpages and databases across multiple languages, something that was science fiction only 20 years ago. Looking forward, even more exciting developments are possible as we address fundamental research issues. At UMass, we have built major research groups with international reputations in information retrieval (IR) and natural language processing (NLP).

Information Retrieval (IR) develops techniques for effective and efficient search of large archives of text. Search engines now provide near-instant access to billions of pieces of information in multiple languages to millions of simultaneous users, retrieving and connecting web pages, video, social media, news, products, answers, scholarly articles, and many other types of information. They also provide user interfaces that simplify the information-seeking "conversation" between user and machine. IR research makes this possible and continues to drive improvements.

Natural Language Processing (NLP) aims to enable computers to understand human language. NLP has seen enormous progress, but current systems still have only a shallow understanding of the rich subtleties of language. Our research integrates machine learning and computational linguistics to better extract knowledge and insights from text corpora, with applications including the analysis of social media, news, and scholarly articles, as well as computational social science. NLP is inherently multidisciplinary, and NLP-related research at UMass also involves our colleagues in linguistics and the social sciences. For more information, see Computation+Language at UMass and the Five Colleges.