In its latest breakthrough, Google has trained a neural network to book a flight ticket by analysing web pages and understanding the choices available to it while filling out booking forms.
In a recent study, researchers associated with Google AI trained a reinforcement learning agent, DQN, to navigate web content by following instructions such as ‘book a flight’ or ‘interact with a social media website’.
Reinforcement Learning, or RL, is an important Machine Learning technique in which an agent is taught to behave in a particular environment within a specific context. By training DQN to read the commands that are currently available on a web page, the researchers enabled the RL network to fill out the form and book the cheapest flight by entering the right origin, destination, airport codes and date. “For example, in the case of an instruction ‘Book a flight from WTK to LON on 21-Oct-2016’, the agent needs to fill out the origin and destination dropdowns with the correct airport codes, select a date, hit submit button, and select the cheapest flight among all the options,” the study says.
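To make the task concrete, here is a minimal Python sketch of what the quoted example asks for: turning the instruction into form fields and then choosing the cheapest option. It is hand-written for illustration only. The study's agent learns this behaviour rather than following fixed rules, and the function names, the instruction pattern and the flight records below are all assumptions, not part of the research.

```python
import re

# Assumed instruction format, modelled on the single example quoted above.
INSTRUCTION_PATTERN = re.compile(
    r"Book a flight from (?P<origin>[A-Z]{3}) to (?P<destination>[A-Z]{3}) "
    r"on (?P<date>\d{1,2}-[A-Za-z]{3}-\d{4})"
)


def parse_instruction(instruction):
    """Extract the values the booking form needs from one instruction."""
    match = INSTRUCTION_PATTERN.fullmatch(instruction.strip())
    if match is None:
        raise ValueError(f"Unrecognised instruction: {instruction!r}")
    return match.groupdict()


def cheapest_flight(flights):
    """Return the flight with the lowest price from a list of results."""
    return min(flights, key=lambda flight: flight["price"])


fields = parse_instruction("Book a flight from WTK to LON on 21-Oct-2016")

# Hypothetical search results; a real page would supply these.
results = [
    {"code": "A1", "price": 420},
    {"code": "B2", "price": 310},
    {"code": "C3", "price": 505},
]
choice = cheapest_flight(results)
```

Here `fields` holds the origin, destination and date as strings, and `choice` is the lowest-priced entry in `results`. The difficulty the researchers faced is that the agent has to discover this mapping from the page itself instead of being given a pattern.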
However, given the plethora of outcomes that a single command can generate, it is hard to arrive at a conclusive result through trial and error alone. According to the researchers, the easiest approach would have been to train the system on human demonstrations and word embeddings. But because of the complexity of an environment such as flight booking, which can generate more than 14 million instructions, with a vocabulary of more than 1,700 words and approximately 100 web elements, this method was regarded as futile. “In environments with large state and action spaces, gathering the human demonstrations does not scale, as the training needs a large number of human demonstrations for each environment,” the study says.
Hence, the challenge before the researchers was to train the system without human intervention. They achieved this by training the RL agent to break down a single instruction into multiple sub-instructions and then assigning the web navigation agent the easier task of solving only a subset of these sub-instructions.
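The idea of splitting one instruction into sub-instructions and handing the agent only some of them can be sketched in a few lines of Python. This is an illustration, not the researchers' implementation: the function names are hypothetical, and treating the unassigned sub-instructions as already completed, with the number assigned growing by one at each stage, is an assumption made here to show how the task could be made easier.

```python
import random


def split_instruction(fields):
    """Turn one instruction into one sub-instruction per form field."""
    return list(fields.items())


def make_subtask(sub_instructions, num_to_solve, rng):
    """Choose which sub-instructions the agent must solve.

    The remaining ones are returned separately and are assumed to be
    filled in correctly on the agent's behalf.
    """
    if not 0 <= num_to_solve <= len(sub_instructions):
        raise ValueError("num_to_solve is out of range")
    to_solve = rng.sample(sub_instructions, num_to_solve)
    prefilled = [s for s in sub_instructions if s not in to_solve]
    return to_solve, prefilled


def curriculum(sub_instructions, seed=0):
    """Yield progressively harder subtasks, ending with the full task."""
    rng = random.Random(seed)
    for num_to_solve in range(1, len(sub_instructions) + 1):
        yield make_subtask(sub_instructions, num_to_solve, rng)


fields = {"origin": "WTK", "destination": "LON", "date": "21-Oct-2016"}
stages = list(curriculum(split_instruction(fields)))
```

With three form fields this produces three stages: the agent is first responsible for one field, then two, and finally the whole form, at which point nothing is filled in for it.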
Beyond enabling the RL agent to book a ticket, the Google researchers wanted to use the method to find a way of solving problems with numerous possible outcomes without human guidance.