A single software developer began a journey to train an AI to play Pokemon Red Version. 50,000 hours later, it’s making some all-too-human strides.
Nearly a decade ago, Twitch Plays Pokemon captured the hearts, minds, and fingers of the internet. The legendary phenomenon has inspired recreations by Mizkif and revivals on TikTok.
Now the famed chaos of Twitch Plays Pokemon has inspired a new experiment revolving around the use of AI. We know what you’re thinking and it has nothing to do with Pokemon Scarlet & Violet’s bizarre ending.
Seattle-based software engineer Peter Whidden has undergone the painstaking process of training an AI to play Pokemon Red Version. He published an explanatory video on his YouTube channel that has garnered over 2.5 million views.
In the video, Whidden explains that the AI has now played over 50,000 hours of the game and is capable of catching Pokemon and defeating Gym Leaders. The AI is trained with a reinforcement learning model that gives “point-based incentives” to level up Pokemon, explore new areas, and win battles.
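For readers curious what “point-based incentives” might look like in practice, here is a minimal sketch in Python. It is not Whidden's actual code: the `GameState` fields, the weights, and the function names are all hypothetical, chosen only to illustrate the idea of scoring levels, exploration, and battle wins, then rewarding the AI for any increase in that score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of the game; not the project's real data model."""
    level_sum: int    # combined level of every Pokemon in the party
    areas_seen: int   # number of distinct areas visited so far
    battles_won: int  # number of battles won so far


# Assumed weights for illustration only; the real values are not given in the article.
LEVEL_WEIGHT = 1.0
EXPLORE_WEIGHT = 0.5
BATTLE_WEIGHT = 2.0


def score(state: GameState) -> float:
    """Total points for a state: a weighted sum of the three incentives."""
    return (
        LEVEL_WEIGHT * state.level_sum
        + EXPLORE_WEIGHT * state.areas_seen
        + BATTLE_WEIGHT * state.battles_won
    )


def step_reward(prev: GameState, curr: GameState) -> float:
    """Reward for one step: how much the score changed since the last step."""
    return score(curr) - score(prev)


before = GameState(level_sum=10, areas_seen=20, battles_won=1)
after = GameState(level_sum=11, areas_seen=22, battles_won=1)
reward = step_reward(before, after)  # one level gained, two new areas seen
```

Under these assumed weights, gaining a level and finding two new areas earns a small positive reward, which is the kind of nudge that pushes the AI toward training and exploring.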
Whidden has been stunned by what the AI has achieved but admits that “more fascinating than its successes are the ways that it fails”. The AI interprets the reward system in its own ways, which leads to some surprisingly human behaviour.
Aside from spending hours admiring scenery, the AI experiences something comparable to trauma in an incident at a Pokemon Center. Accidentally depositing a Pokemon in a PC halves its team’s overall level and triggers a negative reward that it associates with the Pokemon Center.
“It doesn’t have emotions like a human does, but a single event with an extreme reward value can still leave a lasting impact on its behavior,” Whidden explains. “In this case, losing its Pokemon only one time is enough to form a negative association with the whole Pokemon Center, and the AI will avoid it entirely in all future games.”
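The arithmetic behind that one bad experience can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's real reward code: it assumes the level incentive is simply the change in the party's combined level, and the party levels used here are made up.

```python
def level_reward(prev_levels: list[int], curr_levels: list[int]) -> int:
    """Hypothetical reward: change in the party's combined level between two steps."""
    return sum(curr_levels) - sum(prev_levels)


# Made-up example: a party of two level-12 Pokemon.
party = [12, 12]

# An ordinary good step: one Pokemon gains a single level.
ordinary_gain = level_reward(party, [13, 12])

# The accident: one Pokemon is deposited in the PC and leaves the party.
deposit_penalty = level_reward(party, [12])

# How many ordinary level-ups it would take to cancel out the one mistake.
steps_to_recover = abs(deposit_penalty) // ordinary_gain
```

With these numbers, a single deposit costs as much as a dozen level-ups earn, which gives a sense of how one event can outweigh a long run of normal play in the AI's experience.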
Whidden’s AI is still only in the early stages of its Pokemon adventure after being waylaid by the frustrating cave of Mt. Moon. He did tell his audience, however, that a recent change to his reward system has allowed the AI to exit the cave and finally reach Cerulean City.
The software engineer has also made the code for his project public and is “thrilled” by how many people are engaging with it. One savvy fan has even been able to apply his code to Pokemon Crystal Version, but we don’t know how it fared in Generation 2.