ICLR 2023: Like Humans and Animals, AI Agents Find Their Way Through Memory
Memory may be just as important to artificial intelligence (AI) agents in creating ‘mental maps’ as it is to humans and animals.
A recent paper authored by Georgia Tech researchers makes a surprising discovery — blind AI agents use memory to create maps and navigate through their surrounding environment.
Erik Wijmans, the lead author of the paper, said the idea for his research began by asking if AI agents might mimic human and animal behavior in how they navigate and adjust to their environments.
“Humans and animals navigate with some type of spatial representation — what is commonly referred to as a cognitive map,” Wijmans said. “So, we were wondering how AI agents navigate and if it’s similar to that.
“The first question we asked was, ‘Is memory important to these agents?’ It is. They tend to remember at least the past thousand interactions with their environment.”
Wijmans completed his Ph.D. in computer science in 2022 and is currently a research scientist at Apple.
Wijmans created blind AI agents and trained them by dropping them into the floorplans of more than 500 houses, with the goal of navigating from one area of a house to another. The only sense each agent had to work with was egomotion, the ability to know how far it has moved.
The agent bumped its way around from room to room, backtracking as needed, before finding its destination. Wijmans then created a second probe agent that was injected with the memories of the first agent. The probe agent used the memory of the original agent to take shortcuts to quickly reach its objective.
“It’s surprising that they can do this without vision because they’re in an unknown environment that they’ve never seen before, so they have to figure out how to navigate in that environment and also figure out the structure of it,” Wijmans said.
“This is a result that shows that our hypothesis is true, or at the very least along the right direction. We took an agent and put it in a complex environment and trained it for a task that requires it to interact with that environment, and the result was mapping.”
Wijmans’ paper, Emergence of Maps in the Memories of Blind Navigation Agents, is one of four outstanding paper award winners for the 2023 International Conference on Learning Representations (ICLR), which is being held May 1-5 in Kigali, Rwanda. His research was also recognized by the Georgia Tech chapter of Sigma Xi (The Scientific Research Society) and received a 2023 GT Sigma Xi Best Ph.D. Thesis Award.
Wijmans was advised by School of Interactive Computing Distinguished Professor Irfan Essa and Associate Professor Dhruv Batra.
“Erik makes fundamental contributions to multiple sub-areas of AI, including reinforcement learning, robotics, and embodied perception,” Batra said. “His hypothesis is a bold one — that intelligence emerges via large-scale learning by an embodied agent accomplishing goals in a rich 3D environment.”
In his paper, Wijmans describes mapping as an emergent phenomenon. Neural network models for navigation have performed well despite not containing any explicit mapping modules.
Wijmans’ AI agents showed a 95% success rate when they used memory to navigate, whereas memoryless agents failed entirely. This suggests that agents create mental maps as a natural part of learning to navigate.
“The results were so initially surprising, that my first gut instinct was that we had done something wrong in our experimental design,” he said.
“This is a work with a very complex body of experiments that tie together a single narrative,” he said. “This is a challenging thing to do. When you’re trying to test whether something involves memory, you must come up with ideas of what to test for and how to test for that. You must make each experiment as precise as possible to not get false positives, and that involves considerable experimental design and effort.”
Wijmans said he made it as difficult as possible for the agent to reach its goal, removing vision, audio, olfactory, haptic, and magnetic sensing and giving it no bias toward mapping. It had no supervision or any kind of outside help.
“Surprisingly, even under these deliberately harsh conditions, we find the emergence of map-like spatial representations in the agent’s non-spatial unstructured memory. It not only successfully navigates to the goal but also exhibits intelligent behavior like taking shortcuts, following walls, and detecting collisions.”
The discovery also suggests that AI, humans, and animals all share a natural characteristic of problem solving and navigation.
“The one link that we can make is the idea of convergent evolution, which is where you see the same mechanism evolve multiple times in species that have no common ancestor that shares that mechanism,” Wijmans said. “Mammals build maps, insects build maps, and now AI agents build maps. So perhaps mapping is the natural solution to navigation.”