Machine learning is an umbrella term for many methodologies and tools, but one must be clear that it is not an umbrella term for every solution. No one can deny that machine learning has revolutionised the way data can be mined for discoveries.
What one should care about is that the progress of any technology also depends on relentless, introspective attention to its shortcomings. The rise in popularity lures many beginners into believing that they have reached their destination. With tools and frameworks being open-sourced, anyone can play with data, experiment with the MNIST dataset and get really good accuracy scores. But one should always ask whether these results translate to a larger platform. Do these accuracies replicate for complex human tasks like speech recognition and object detection?
If the ultimate aim of AI is to replicate human behaviour, then a few problems will lurk around for a while. AI might have managed to beat chess grandmasters, but can it stand a chance against the language-learning capabilities of a 5-year-old? Can machine learning algorithms correctly predict the next economic shutdown?
Most of these questions seem to sit on the ethical side of the spectrum. But the technical side, too, offers some difficult scenarios for AI on its way to becoming General AI.
The problem in the early ’50s was mostly a computational one. There were theories and mathematical proofs, but there weren’t many machines to test these algorithms on.
Later it was the lack of data to work on. Collecting data manually was tedious, not to mention the questionable authenticity of the sources that generated it.
Skip to the ’80s and there was considerable advancement in computation, but what appeared out of the blue was our own lack of understanding of human intelligence.
Now we have the best hardware for accelerated computation, frameworks that collect data, and the cloud to store and access that data in real time. But even 40 years after the predictions of pioneers like Minsky, we are still struggling to find solutions to the inherently mystical nature of human understanding and consciousness.
Problems outside a few niches (vision, speech, NLP, robotics) aren’t clearly amenable to this approach. For example, datasets generally include event videos in which no other objects appear nearby unless the object is being used (e.g. a chair, a stool or a bed), and as a consequence, occlusion scenarios are rarely represented. The lack of occlusions in most existing datasets offers an unrealistic picture of virtually all indoor (e.g. home) environments. As a result, current algorithms are generally untested on occluded actions.
Let’s list a few of the shortcomings in the fundamental concepts observed by machine learning scientist John Langford:
Bayesian learning
Explicitly specifying a reasonable prior is often hard. The approach is also human-intensive: partly because of that difficulty, and partly because “first specify a prior” is built into the framework, it is not very automatable.
Convex loss optimisation
The models are limited. Although switching to a convex loss means that some optimisations become convex, optimisation over representations that aren’t single-layer linear combinations is often difficult.
Gradient descent
There are issues with parameter initialisation, step size and representation. Accumulated experience with this sort of system helps a great deal, and there is little theoretical guidance.
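The step-size issue is easy to see on a toy problem. The sketch below assumes plain gradient descent on f(x) = x², which is a stand-in chosen for illustration rather than anything from Langford's list; the function name and the two step sizes are likewise hypothetical.

```python
def gradient_descent(step_size, start=1.0, steps=50):
    """Minimise f(x) = x**2 with plain gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2.0 * x              # derivative of x**2
        x = x - step_size * gradient    # standard update rule
    return x


# Illustrative step sizes (assumptions, not recommendations).
small_step = gradient_descent(step_size=0.1)  # each step multiplies x by 0.8
large_step = gradient_descent(step_size=1.1)  # each step multiplies x by -1.2
```

With the smaller step the iterate shrinks towards the minimum at 0, while the larger step overshoots by a growing amount on every iteration. On a real model there is no closed-form factor to inspect, which is where experience ends up standing in for theory.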
Kernel-based learning
Specifying the kernel is not easy for some applications (this is another example of prior elicitation). O(n²) computation is not efficient enough when there is a lot of data.
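To make the quadratic cost concrete, here is a minimal sketch that builds the full kernel (Gram) matrix for a handful of one-dimensional points. The RBF kernel, the `gamma` value and the helper names are assumptions picked for illustration; the point is only that every pair of examples gets an entry.

```python
import math


def rbf_gram_matrix(points, gamma=1.0):
    """Build the full n x n RBF kernel matrix for a list of 1-D points."""
    return [
        [math.exp(-gamma * (a - b) ** 2) for b in points]
        for a in points
    ]


def gram_entries(n):
    """Number of kernel evaluations needed for n training examples."""
    return n * n


gram = rbf_gram_matrix([0.0, 1.0, 2.0])   # 3 points -> 3 x 3 matrix
entries_for_100k = gram_entries(100_000)  # entries for 100,000 examples
```

Three points need nine entries, which is harmless. By the same arithmetic, 100,000 examples need ten billion, which is why the quadratic term starts to bite as soon as the dataset grows.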
Boosting
The boosting framework tells you nothing about how to build the initial weak learning algorithm. The weak learning assumption eventually becomes violated at some point in the iterative process.
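The weak learning assumption says that each round can find a learner that is at least slightly better than chance on the reweighted data. The sketch below is one hedged way to watch that margin: a bare-bones AdaBoost-style loop over one-dimensional decision stumps that records the weighted error of the best stump in each round. The choice of stumps as the weak learner, the toy data and all names are assumptions made for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Predict +polarity at or above the threshold, -polarity below it."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump."""
    best = (float("inf"), 0.0, 1)
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_errors(xs, ys, rounds):
    """Run AdaBoost-style reweighting and return the weighted error per round."""
    n = len(xs)
    weights = [1.0 / n] * n
    errors = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        errors.append(error)
        # Stop if the stump is perfect or no better than chance.
        if error <= 0.0 or error >= 0.5:
            break
        alpha = 0.5 * math.log((1.0 - error) / error)
        # Increase the weight of misclassified points, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return errors


# Toy data (made up): labels that no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
errors = adaboost_errors(xs, ys, rounds=5)
```

Printing `errors` shows how close each round's best stump sits to 0.5. Nothing in the loop says where the stumps should come from or guarantees that the gap to 0.5 stays open, which is the gap in the framework described above.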
Decision tree learning
There are learning problems which cannot be solved by decision trees, but which are solvable by other means. It’s common to find that other approaches give you a bit more performance. A theoretical grounding for many choices in these algorithms is lacking.
The current error-reducing, cost-cutting methodologies will flourish in finance, movie recommendations and other non-fatal avenues. In the case of diagnosis or self-driving cars, there is no excuse for merely passable accuracy scores. So, if AI is to shoulder the future of our species, it is only reasonable to expose its flaws while it is still in its nascent stage.
Machine learning, at its core, is a set of statistical methods meant to find predictable patterns in datasets. Is your problem the kind of problem where getting things right 80% of the time is enough? Can you live with an error rate? Bad candidates include predicting profits from the introduction of a completely new and revolutionary product line, or extrapolating next year’s sales from past data when an important new competitor has just entered the market.
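The sales example can be sketched in a few lines. The code below fits an ordinary least-squares trend line to five years of made-up sales figures and extends it to year six; the numbers, the variable names and the choice of a straight-line model are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past sales, one value per year (made-up numbers).
years = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(years, sales)
forecast_year_6 = slope * 6 + intercept  # simply continues the past trend
```

The forecast is just the old trend carried one step further. Nothing in the training data encodes the arrival of a new competitor, so the model has no way to account for it, however well it fits the past.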
Even in the fully supervised setting, a predictive model is only as good as the data on which it’s trained. Current datasets are rather limited and unrepresentative, both in the variability of physical characteristics and patterns of behaviour they cover and because of issues around scene setup, occlusions, data adaptation and privacy, amongst others.
To achieve General AI, one area that deserves more focus is the learning patterns found in nature. Such self-learning could outclass pre-constrained models and might pave the way to a more trustworthy AI.