The 27th edition of the International Joint Conference on Artificial Intelligence (IJCAI) brimmed with discussions, talks and presentations from AI speakers and experts from across the globe. The event focused on the latest developments in popular areas of AI such as machine learning, artificial neural networks and computer vision, among others.
One of the most interesting talks at the conference was by prominent AI researcher Yann LeCun, director of Facebook AI Research. He delivered the keynote “Learning World Models: the Next Step towards AI”, which touched upon the evolution of AI and the impact of subfields such as deep learning on applications like speech recognition, handwriting recognition, autonomous cars and language translation.
In this article, we discuss LeCun’s views on how a full-blown AI could be realised with real-world models.
Supervised Learning For The New AI
LeCun began the talk by elaborating on the importance of supervised learning (SL) in the context of today’s widely deployed deep learning (DL) applications. He emphasises that DL builds on concepts such as the perceptron, which dates back to the early days of ML. With DL, the traditional ML process has been simplified further, leaning towards a modular approach rather than a straightforward, task-intensive style.
He also addresses another important advance in DL: Convolutional Neural Networks (ConvNets or CNNs). LeCun says that ConvNet implementations in computing systems have largely helped companies take on diverse tasks such as robotics, well beyond the classic handwriting-recognition problem. Striking instances of ConvNet use can be seen in Nvidia’s chipsets and Tesla’s Autopilot feature.
In addition, the talk commends the performance improvements CNNs have achieved over the years, driven by the parallel rise of computing hardware such as GPUs as well as newer techniques with lower error rates. Speaking on the ConvNet architecture, LeCun illustrates: “A few network layers back in 97/98, to 12-13 layers, to 20 layers in 2013, to all the way to 150 layers nowadays. Companies like Facebook, Google and others routinely use networks that are anywhere between 50 and 100 layers for image recognition.” Facebook has also released various open source projects in this space, such as Detectron, DensePose and FairSeq. LeCun presents many possibilities that deep learning and ConvNets could accomplish in the near future.
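To make those layer counts concrete, the operation repeated in each of those dozens of layers is a small filter slid across the image. The sketch below is a minimal NumPy illustration of a single convolutional filter, here a hypothetical hand-written vertical-edge detector rather than anything from the models LeCun describes; real networks learn thousands of such filters from data.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: the core operation of a ConvNet layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value is the filter's response at one image position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny image: dark on the left, bright on the right (a vertical edge).
image = np.array([
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

# A 3x3 vertical-edge filter: responds strongly where brightness changes left-to-right.
edge_kernel = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
], dtype=float)

response = conv2d(image, edge_kernel)
# response is 0 over the flat region and -3 where the window straddles the edge.
```

Stacking many such layers, with learned filters and nonlinearities between them, is what takes a network from a handful of layers in the late 1990s to the 50-150-layer models mentioned in the talk.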
Reinforcement Learning Requires Real-World Environment
Speaking on Reinforcement Learning (RL), LeCun says that it has mostly been successful in applications like gaming and virtual environments. Even in such scenarios, RL has not achieved performance close to that of humans. He says that RL is hard to use in the real world because it requires a large number of trials to learn anything, and running those trials outside a simulator could prove disastrous.
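The trial hunger LeCun describes can be seen even in the simplest RL setting. The toy sketch below (our own illustration, not from the talk) runs an epsilon-greedy agent on a two-armed bandit: with only a handful of pulls its value estimates are noisy, and it takes thousands of trials to reliably identify the better arm, which is cheap in a simulator but prohibitive when each trial is a real-world action.

```python
import random

def run_bandit(trials, seed=0):
    """Epsilon-greedy agent on a two-armed bandit; arm 1 pays off more often."""
    rng = random.Random(seed)
    payout = [0.4, 0.6]            # true win probabilities, unknown to the agent
    counts = [0, 0]
    values = [0.0, 0.0]            # the agent's running estimate of each arm
    for _ in range(trials):
        if rng.random() < 0.1:     # explore 10% of the time
            arm = rng.randrange(2)
        else:                      # otherwise exploit the current best estimate
            arm = 0 if values[0] >= values[1] else 1
        reward = 1.0 if rng.random() < payout[arm] else 0.0
        counts[arm] += 1
        # Incremental average: nudge the estimate towards the observed reward.
        values[arm] += (reward - values[arm]) / counts[arm]
    return values

few = run_bandit(20)       # noisy estimates after 20 trials
many = run_bandit(20000)   # estimates converge near the true 0.4 / 0.6 payouts
```

Even this two-action problem needs thousands of samples for stable estimates; a robot or a car facing a vastly larger state space cannot afford that many real-world mistakes.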
Where Supervised and Reinforcement Learning Require ‘World Models’
After briefing on the advances of SL and RL, LeCun turned to a shortcoming shared by both: the machines and systems that incorporate them lack real-world knowledge and common sense. “The absence of world models causes us to not be able to build things we want to build. We would like to have machines with common sense because that would be the basis for dialogue systems and virtual assistants that will really help in our daily lives,” he says.
Today’s AI machines have not yet attained what is known as Artificial General Intelligence (AGI), that is, intelligence on par with humans. Interest in AGI has slowly picked up in recent years.
Citing the example of how human babies learn from their surroundings, LeCun says that AGI is possible if machines accumulate lots of background knowledge about the world over time through observation. He says that machines must grasp different time frames, such as past, present and future, and learn to predict the unobserved parts of their input, such as what happens next, from the parts they have observed. This is termed ‘self-supervised learning’.
RL and SL are generally restricted in the amount of feedback they receive per example, and so extract far less information from data than self-supervised learning can. LeCun strongly emphasises that a model with extensive real-world knowledge is ideal for AI systems. But this poses the problem of uncertainty.
Nevertheless, LeCun posits that this can be mitigated by incorporating as many real-world factors as possible into the learning algorithm’s picture. He presents various examples of how world models could pave the way to AGI. LeCun concludes the presentation by asserting that a symbiotic relationship between science and technology will achieve human-level AI.