Since the field’s earliest days, UMass Amherst has been on the leading edge of reinforcement learning, a branch of AI that is now poised to disrupt numerous sectors of society.

In 1977, Andrew Barto, then a young postdoctoral researcher, came to UMass Amherst to join a lab working on neural networks. He was interested in exploring a new theory: that the human brain was driven by billions of nerve cells behaving like hedonists, each trying to maximize pleasure and minimize pain. Barto was soon joined by doctoral student Richard Sutton ’80MS ’84PhD, and together they applied this concept to studying the development of artificial intelligence (AI) systems.

Image: Andrew Barto first came to UMass Amherst in the late 1970s as a postdoctoral researcher working in a lab studying neural networks.

"UMass Amherst gave us the opportunity to be free-ranging, exploring, and pioneering the field," said Barto.

The two researchers are credited with developing the conceptual and algorithmic foundations of what's now known as "reinforcement learning" (RL), a branch of machine learning and the basis of many of today’s most impactful and promising AI applications, from chatbots like ChatGPT to medical treatments to robots for home and commercial use.

Nearly fifty years later, Manning College of Information and Computer Sciences (CICS) Professor Emeritus Barto and alumnus Sutton, now a professor at the University of Alberta, are being recognized for their innovative efforts. In March 2025, they received the 2024 ACM A.M. Turing Award, often referred to as the "Nobel Prize of Computing."

Despite humble beginnings, RL has proven to be an essential part of intelligent systems. After overcoming initial academic skepticism, the field has taken off more quickly than expected over the past decade, according to Philip Thomas, associate professor and co-director of the Autonomous Learning Lab (ALL), which Barto founded.

What Is Reinforcement Learning?

According to Thomas, the field of machine learning has two major branches. The first, known as supervised learning, is used to train computers on problems with known solutions.

"A classic example is recognizing handwritten letters. As humans, we know what symbols are correctly labeled as an ‘A’ or ‘B’ and so on, so we can train a computer to be more likely to produce the correct output just by shifting its decisions towards the correct response," explained Thomas. 

The second branch, RL, typically tackles more complicated problems using data that doesn’t include the "right" answer.

"These are problems where we don’t know what we should do—we just know how good the outcome is," said Thomas. For example, his lab is working on research applying RL to Type 1 diabetes treatment, studying how much insulin to inject to keep a patient’s blood glucose near a target level.

Put another way, supervised learning allows AI systems to make predictions, while RL enables systems to make optimal decisions to achieve a desired outcome, explained Bruno Castro da Silva, assistant professor and co-director of the ALL.
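The contrast can be sketched in a few lines of Python. This is a toy illustration only, not code from the Autonomous Learning Lab: the function names, the six-action setup, and the reward function are all hypothetical. In the supervised step the correct answer is handed to the learner, whereas in the RL loop the learner only sees a score for whatever action it tried and must discover the best action by trial and error (here with a simple epsilon-greedy rule).

```python
import random


def supervised_update(weight, x, label, lr=0.1):
    """Supervised step: the correct label is known, so nudge the prediction toward it."""
    prediction = weight * x
    return weight + lr * (label - prediction) * x


def run_bandit(reward_fn, n_actions, steps=2000, epsilon=0.1, seed=0):
    """RL loop: only the reward for the chosen action is observed, never the 'right' action."""
    rng = random.Random(seed)
    values = [0.0] * n_actions  # running average reward estimated for each action
    counts = [0] * n_actions    # how many times each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            action = max(range(n_actions), key=lambda a: values[a])  # exploit best estimate
        reward = reward_fn(action, rng)
        counts[action] += 1
        # Incremental average: move the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values


def toy_reward(action, rng):
    # Hypothetical environment (an assumption for illustration):
    # action 3 gives the best outcome on average, and outcomes are noisy.
    return -abs(action - 3) + rng.gauss(0, 0.3)


values = run_bandit(toy_reward, n_actions=6)
best_action = max(range(6), key=lambda a: values[a])
print("Action with the highest estimated reward:", best_action)
```

Nothing in the loop ever tells the learner that action 3 is the right choice; it learns this only from how good each outcome turned out to be, which is the situation Thomas describes.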

"I’m surprised how long it took people to recognize that RL was something new, and that it was not just an academic curiosity without real-world applications," said da Silva.

Image: Andrew Barto in 1982

How Is Reinforcement Learning Used Today?

Though RL has been around since the early 1980s, its real-world applications have begun emerging only over roughly the past decade. "More and more, we’re seeing RL actually be applied in the real world," said Thomas.

Applications include:

  • Refining the responses of chatbots like ChatGPT
  • Treating conditions such as diabetes and sepsis
  • Driving autonomous vehicles
  • Controlling prosthetic limbs
  • Operating water treatment plants
  • Recommending content on platforms such as YouTube, Spotify, and Netflix

Da Silva predicts that in the near future, industry will increasingly rely on RL to optimize operations. "I think companies are going to realize that this technology is really profitable: that they can use it to make any type of decision now made by humans."

But Is It Safe?

Ensuring safety and fairness is crucial when deploying AI in the real world, especially in high-stakes contexts like health care, hiring, lending, or criminal justice. But historically, algorithms have often failed in this regard, said Thomas.

Thomas and his collaborators designed algorithms to address these concerns and named them "Seldonian" algorithms, after Hari Seldon, a character in Isaac Asimov’s Foundation novels. They demonstrated safe use of RL for diabetes treatment with these algorithms and also developed others to ensure fairness in applications ranging from online courses to loan approvals.

"We see the potential of RL to solve so many of the challenges our society is facing—improving medical treatments, increasing the efficiency of all our systems, and advancing fairness," said Thomas. "All this speaks to the Manning College of Information and Computer Sciences’ commitment to computing for the common good."

Preparing for an AI Future

As AI becomes increasingly prevalent, UMass is preparing its students through coursework on the fundamentals of machine learning, as well as its applications in diverse areas. Several undergraduates have completed honors theses in the ALL, while graduate students play a vital role in the lab’s research.

Moreover, CICS is unique in balancing the hands-on application of machine learning with robust education on the foundations of RL.

Image: From left, Andrew Barto and Richard Sutton

"It’s really important for students to understand the theory and to think deeply about how and why these things work, so they can identify gaps in the literature and challenge assumptions in the field," said da Silva. "We want students to carry forth the legacy of Andy [Barto] and Rich [Sutton] to not only make incremental improvements, but to find qualitatively novel approaches to AI."

Beyond Barto’s immense contributions to the scientific understanding of RL, he is credited with helping to cultivate the truly collaborative research community that exists in the field today.

"Andy is extremely humble, kind, and welcoming, always giving credit to others," said da Silva. "He has really set the tone for this field."

This story originally appeared in the Winter 2025–2026 issue of Significant Bits magazine. 

Scenes from a Historic Day

On March 5, 2025, as the Turing Award was announced, the CICS community filled a standing-room-only campus celebration for Professor Emeritus Andrew Barto, where former students and longtime colleagues reflected on his mentorship and the field-defining impact of his work.