Andrew K. McCallum

(413) 545-1323


Information extraction, knowledge discovery from text, statistical natural language processing, machine learning, graphical models.


The main goal of McCallum's research is to dramatically increase our ability to mine actionable knowledge from unstructured text. He is especially interested in information extraction from the Web, understanding the connections among people and between organizations, expert finding, social network analysis, and mining the scientific literature. Toward this end he and his group develop and employ various methods in statistical machine learning, natural language processing, information retrieval and data mining, tending toward probabilistic approaches and graphical models. He is one of the pioneers in the development of conditional random fields. As a demonstration of his research, his group has created the research paper search engine at


Ph.D., Computer Science, University of Rochester (1995); B.S., Dartmouth College (1989, summa cum laude). Professor McCallum joined the faculty of the College of Information and Computer Sciences as a Research Associate Professor in 2002 and became an Associate Professor in 2003 (awarded tenure in 2007). Previously, he was Vice President of Research and Development at WhizBang Labs, and Director of their 30-person research and development lab in Pittsburgh, PA. Prior to joining WhizBang, he was a Research Scientist and Research Coordinator at Just Research (Justsystem Pittsburgh Research Center), where he spearheaded the development of technology for statistical text processing. In 1996, he was a post-doctoral fellow at Carnegie Mellon University.

Activities & Awards

Professor McCallum has over 50 research publications spanning machine learning, natural language processing and reinforcement learning. He is the PI on a prestigious NSF Medium ITR grant, and Co-PI on another. In 2004 and 2007, he received the IBM Faculty Partnership award. In 2004, one of his papers won Honorable Mention at AAAI. Prof. McCallum was selected as a UMass Amherst Lilly Teaching Fellow for the 2005-2006 academic year, and received the College of Natural Sciences and Mathematics (NSM) Outstanding Faculty Research Award in 2007. He is an action editor on the board of the Journal of Machine Learning Research, and has served on the program committees for many technical conferences, including IJCAI, AAAI, ICML, NIPS, UAI, ACL, and HLT. He has given invited talks in academia and industry, including MIT, Stanford, CMU, UT Austin, Xerox PARC, IBM Research, Microsoft Research, AT&T Research and Google. In 2003, he gave a tutorial, "Information Extraction from the World Wide Web," at the Neural Information Processing Systems (NIPS) conference in Vancouver, Canada, and at the KDD Conference.