A team of scientists, including one of Indian origin, has used a branch of AI called reinforcement learning to play the 1980s arcade game Ms. Pac-Man perfectly. The team, from the Canadian startup Maluuba, has developed a new AI system that can achieve the maximum possible score of 999,990. Maluuba was acquired by Microsoft earlier this year.
AI researchers consider it a groundbreaking achievement. Researchers have long used video games to test their systems, and Ms. Pac-Man was among the most difficult to crack. Rahul Mehrotra, program manager at Maluuba, explains, “Figuring out how to win these types of video games is actually quite complex, because of the huge variety of situations you can encounter while playing the game.”
To obtain a high score, the team divided the large problem of mastering Ms. Pac-Man into small pieces and distributed those pieces among AI agents. The method resembles some theories about how the brain works.
The team at Maluuba named this method Hybrid Reward Architecture. It involved more than 150 agents, each working in parallel with the others to master Ms. Pac-Man. For instance, some agents were rewarded for successfully finding one specific pellet, while others were tasked with staying out of the way of ghosts.
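To make the idea of one narrow reward per agent concrete, here is a minimal Python sketch. It is not Maluuba's code: the class names, the grid coordinates and the reward values of 1.0 and -1.0 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Position = Tuple[int, int]  # hypothetical (row, column) grid coordinate


@dataclass(frozen=True)
class PelletAgent:
    """Hypothetical agent that is rewarded only for reaching its own pellet."""
    pellet: Position

    def reward(self, pacman: Position) -> float:
        # Illustrative value: +1 when Ms. Pac-Man stands on this pellet.
        return 1.0 if pacman == self.pellet else 0.0


@dataclass(frozen=True)
class GhostAgent:
    """Hypothetical agent that is penalised only for touching its own ghost."""
    ghost: Position

    def reward(self, pacman: Position) -> float:
        # Illustrative value: -1 when Ms. Pac-Man collides with this ghost.
        return -1.0 if pacman == self.ghost else 0.0


def build_agents(pellets: List[Position],
                 ghosts: List[Position]) -> List[Union[PelletAgent, GhostAgent]]:
    """Create one agent per pellet and one per ghost."""
    return [PelletAgent(p) for p in pellets] + [GhostAgent(g) for g in ghosts]
```

Each agent sees only its own small piece of the game, which is what lets the pieces be learned in parallel.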
The researchers also created a top agent that took suggestions from all the other agents to decide where to move Ms. Pac-Man. The top agent took into account factors such as how many agents advocated going in a certain direction and how intensely they wanted to make that move.
For instance, suppose 100 agents want to move right because that is the best path to their pellet, but three want to move left because there is a deadly ghost to the right. In this scenario, the AI program would give more weight to the agents that had noticed the ghost, and go left. “A lot of companies working on AI use games to build intelligent algorithms because there’s a lot of human-like intelligence capabilities that you need to beat the games,” concludes Mehrotra.
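The weighted vote in the 100-versus-three example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' implementation: the function name `choose_direction` and the intensity values (1.0 for a pellet seeker, 50.0 for a ghost avoider) are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def choose_direction(votes: Iterable[Tuple[str, float]]) -> str:
    """Pick the direction with the largest total intensity.

    Each vote is a (direction, intensity) pair from one agent, so both the
    number of agents and how strongly each one feels affect the outcome.
    """
    totals: Dict[str, float] = defaultdict(float)
    for direction, intensity in votes:
        totals[direction] += intensity
    return max(totals, key=totals.get)


# Assumed intensities: 100 pellet seekers mildly prefer right,
# while three ghost avoiders strongly prefer left.
votes = [("right", 1.0)] * 100 + [("left", 50.0)] * 3
print(choose_direction(votes))  # left
```

With these assumed numbers, right totals 100 and left totals 150, so the three strong votes outweigh the hundred mild ones. If every agent voted with equal intensity, the majority would win and the move would be right.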