Andrew Barto retires - celebratory workshop held in July

August 05, 2012

Professor Sridhar Mahadevan and alum Satinder Singh (Ph.D. '94; Professor at the University of Michigan) organized a Celebratory Workshop for Andrew Barto's Research in Reinforcement Learning. The event was held in the department on July 6-7, 2012. A number of internationally renowned researchers from AI, ML, cognitive science, psychology, neuroscience and engineering attended the workshop.

Andrew Barto retires

Professor Andrew Barto retired in July 2012, after spending over three decades in the Department of Computer Science at the University of Massachusetts Amherst. He joined the department in 1977 as a Postdoctoral Research Associate, became an Associate Professor in 1982, and Professor in 1991. He served as Department Chair from 2007 to 2011. Although retired, Barto will continue to pursue his research as an Emeritus Professor.

Throughout his tenure in our department, Barto maintained a highly visible and productive research career, for which he has received many accolades. His primary research contributions are in machine learning, particularly reinforcement learning, a framework inspired by the study of learning in biology and psychology. "Few researchers are gifted enough, indeed lucky enough, to have inspired the study of a new interdisciplinary research area. Through his leadership, the field of reinforcement learning blossomed into a major area of research, not only within computer science, artificial intelligence, and machine learning, but also within numerous other disciplines, from dynamic programming and operations research, and neuroscience, to robotics," says Professor Sridhar Mahadevan, Co-Director, with Barto, of the Autonomous Learning Laboratory (ALL).

To celebrate his research career, a workshop was held recently in the department that attracted researchers from a wide spectrum of research fields, attesting to the interdisciplinary impact of Barto's research.

A fundamental contribution of Barto - developed in conjunction with his then graduate student Richard Sutton (Ph.D. '84; Professor and iCore Chair at the Department of Computing Science at the University of Alberta) - is an algorithmic framework for solving sequential prediction problems called temporal-difference learning, or TD-learning as it is widely known. "TD is remarkable not only due to its power at solving incredibly large stochastic sequential decision problems, but also due to emerging evidence linking the TD framework to reward learning in the mammalian brain," notes Mahadevan. In 1992, a researcher at IBM Watson Laboratories showed that TD could learn to play the difficult game of backgammon at a level comparable with the best human champions. This feat was notable since backgammon has around 10^20 states, precluding any brute force method for solving it. Backgammon is also a stochastic game, since at each turn, moves are made using dice rolls. Later work demonstrated the utility of TD in a wide variety of real-world applications, from scheduling missions on the space shuttle, and controlling a team of elevators in a high-rise building, to routing calls on a cellphone.

Even more remarkable, growing experimental evidence in neuroscience shows that TD plays a fundamental role in the brain, providing a mechanism for learning from rewards through the neurotransmitter dopamine. TD may perhaps be the first machine-learning algorithm for which we have evidence of its neural implementation in the brain.

"Andy is a highly popular teacher and advisor, and the success of his many Ph.D. graduates is a tribute to his mentoring skills," adds Mahadevan. Two of Barto's former Ph.D. students, Satinder Singh ('94; Professor at the University of Michigan) and Sutton, have received AAAI Fellowships, a premier international recognition of excellence in artificial intelligence research.

Barto continues to co-direct ALL, formerly known as the Adaptive Networks Laboratory. His current research centers on what psychologists call intrinsically motivated behavior, meaning behavior that is done for its own sake rather than as a step toward solving a specific problem.

Barto received the 2004 IEEE Neural Networks Society Pioneer Award for his contributions to the field of reinforcement learning. He is a Fellow of the American Association for the Advancement of Science, a Fellow and Senior Member of the IEEE, and a member of the American Association for Artificial Intelligence and the Society for Neuroscience. He received his B.S. with distinction in Mathematics from the University of Michigan in 1970, and his Ph.D. in Computer Science in 1975, also from the University of Michigan.

Barto has over one hundred publications. He is co-author with Richard Sutton of the book Reinforcement Learning: An Introduction, MIT Press 1998. "The book is a true classic that has been cited over 13,000 times - perhaps the single most cited research publication ever produced from our department," says Mahadevan. Barto was also a co-editor with Jennie Si, Warren Powell, and Don Wunsch II of the Handbook of Learning and Approximate Dynamic Programming, Wiley-IEEE Press, 2004.

"Andy is an exceptional researcher, teacher, and mentor. He provided outstanding leadership to the department during his tenure as chair," says Professor Lori Clarke, department chair. "I am delighted that he will continue his research endeavors as an Emeritus Professor."