PhD Dissertation Proposal Defense: Arjun Karuvally, Beyond the Hopfield Memory Theory: Dynamic Energy Landscapes and Traveling Waves in RNNs


Monday, December 2, 2024, 3:00 PM to 5:00 PM

Online
PhD Dissertation Proposal Defense

Speaker

Arjun Karuvally

Abstract

Recurrent Neural Networks (RNNs) are central to artificial intelligence, excelling in sequence processing tasks across domains, from natural language processing to protein folding in biology. However, fundamental questions about how RNNs store and process information over time by forming and updating their memories remain unanswered, limiting our ability to understand and improve these models. Current theories, such as the Hopfield memory theory, focus primarily on static memory storage and associative retrieval, and lack mechanisms to explain the dynamic, adaptive memory processes observed in real-world applications. In this thesis, I propose two theoretical frameworks to address this challenge: the Dynamic Energy Theory, which explains long-term memory processes in RNNs through synaptic interactions that evolve over time, and the Wave Theory, which describes the dynamic storage of inputs as transient working memories in neural activity. By building mathematical models from these theories and studying their properties, capacities, and limitations, I derive new RNNs with improved capabilities.
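For context, the classical Hopfield theory referenced here defines a static energy over binary neural states $\mathbf{s} \in \{-1, +1\}^N$, with stored patterns $\xi^{\mu}$ embedded as local minima via Hebbian weights:

$$E(\mathbf{s}) = -\tfrac{1}{2}\sum_{i \neq j} W_{ij}\, s_i s_j, \qquad W_{ij} = \frac{1}{N}\sum_{\mu=1}^{P} \xi_i^{\mu}\, \xi_j^{\mu}.$$

Retrieval is a descent to the nearest minimum; the landscape itself never changes, which is precisely the limitation the two proposed frameworks address.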

The Dynamic Energy Theory generalizes Hopfield memory by allowing the energy function to evolve over time, representing sequences as dynamic trajectories on the network's changing energy landscape. Using this theory, I develop a class of continuous-time RNNs with slow-fast timescale dynamics, in which some neurons update rapidly while others change slowly, and analyze their "escape times" (the durations spent in each memory state), revealing the conditions necessary for state transitions. Further, analysis of memory capacity shows that it scales with the strength of inter-memory interactions, enabling the derivation of networks whose long-sequence storage capacity exponentially outperforms existing sequence networks. Next, I show how local, biologically plausible learning rules can be derived from the energy function, adapting existing synaptic memories based on the input provided to the network. These networks could potentially transform tasks requiring adaptive long-term memory retention.
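A minimal sketch of the slow-fast idea (an illustration, not the dissertation's exact equations): partition the state into fast variables $v_f$ and slow variables $v_s$ descending a shared energy landscape at separated timescales,

$$\tau_f\, \dot{v}_f = -\nabla_{v_f} E(v_f, v_s), \qquad \tau_s\, \dot{v}_s = -\nabla_{v_s} E(v_f, v_s), \qquad \tau_f \ll \tau_s.$$

On fast timescales, $v_f$ relaxes into a minimum of $E(\cdot, v_s)$; as $v_s$ drifts, that minimum deforms and eventually destabilizes, and the duration until this happens is the escape time of the corresponding memory state.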

The Wave Theory conceptualizes the binding of input and task-relevant variables in RNNs as propagating waves of neural activity. Building on the Dynamic Energy Theory, it elucidates how local synaptic interactions support wave propagation in neural activity. I demonstrate that practical RNNs such as Elman RNNs and State Space Models (SSMs) can be transformed into this wave-based model, suggesting that it serves as a canonical framework for understanding existing RNNs. Using this model, I reveal hidden traveling waves in Elman RNNs that store memories and mitigate the vanishing gradient problem. Additionally, I show that the canonical wave model limits the computational power of existing SSMs to that of finite state machines, the simplest class in the Chomsky hierarchy. By incorporating waves with variable speed and direction, I illustrate how this computational power can be increased, enabling these models to process more complex sequences.
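As a toy illustration of the wave picture (hypothetical code, not the dissertation's construction), consider an Elman-style recurrence whose recurrent matrix is a pure shift: each input enters at one end of the hidden state and travels one slot per step, so the last K inputs persist as a moving wavefront of activity.

```python
import numpy as np

# Toy traveling-wave memory in an Elman-style RNN (illustrative assumption:
# the recurrent matrix W is fixed to a pure shift rather than learned).
K = 8                          # number of hidden slots = working-memory span
W = np.eye(K, k=-1)            # subdiagonal shift: slot i-1 -> slot i each step
U = np.zeros(K)
U[0] = 1.0                     # inputs are written into slot 0

h = np.zeros(K)
for t, x in enumerate([0.5, -1.0, 0.3, 0.9]):
    h = np.tanh(W @ h + U * x)     # Elman update: h_t = tanh(W h_{t-1} + U x_t)
    print(t, np.round(h, 2))       # past inputs appear as a rightward-moving wave
```

The shift itself neither amplifies nor attenuates signals as they propagate (only the tanh nonlinearity does), which gives one intuition for how hidden traveling waves can carry information, and gradients, across many timesteps without vanishing.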

Together, these theories aim to fill a critical gap in understanding how RNNs store and process information over time. By enhancing interpretability and performance, they could inform the design of more efficient neural networks, positively impacting applications that rely on sequential data across diverse scientific disciplines and paving the way for advancements in artificial intelligence.

Advisor

Hava Siegelmann



Join via Zoom
