Deep learning has been one of the most popular subfields of machine learning over the last decade or so, largely because of the high accuracy it achieves on many tasks. But what makes deep learning so integral to problem-solving?
The ‘deep’ in deep learning refers to the multiple layers in its architecture. Stacking layers lets a network learn progressively richer representations and classify data more accurately, which is why deep learning has found applications such as self-driving cars.
Why Is Deep Learning Better?
Ease Of Problem-Solving:
Other methods need the problem to be broken down into small steps whose outputs are then combined into an integrated result. Deep learning does not need the problem to be broken down this way; it solves problems end to end.
Deep learning methods pass data directly to the network and still perform well, unlike other methods where the best features must be carefully selected before being passed to the algorithm. Because of this, deep learning can be adapted to different domains easily. For example, once a speech recognition technique is understood, applying similar techniques to NLP isn’t a big hurdle. This yields higher performance in a short time.
Deep learning methods learn to extract features from the dataset themselves. The main way to increase accuracy is to supply more data, and the accuracy attained is often much higher than with other methods. For example, in image classification, when the image set contains inconsistencies caused by conditions like lens distortion or product reflection, deep learning can still handle the classification with greater accuracy.
The image below, from one of Andrew Ng’s slides, shows how more data improves the performance of deep learning methods compared with older methods.
Why Do They Perform Well?
Working With A Large Amount Of Data:
Deep learning methods are hungry for large datasets because they work best with them. Often the simplest way to improve their accuracy is to make the dataset larger, which also makes them well suited to complex problems. However, more data will not always help. Sometimes the data needs to be rescaled, augmented with invented examples, or transformed as well. For example, if the data consists of vectors or numbers, new randomized samples can be created, and if it consists of images, random variants of the images can be generated. But most issues are solved with more data alone.
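The rescaling and random-variant ideas above can be sketched in plain Python. This is a minimal illustration rather than a recommended pipeline: the function names (rescale, jitter, flip_horizontal, augment), the noise scale and the brightness range are all assumptions made for the example, and images are represented as nested lists of pixel values in [0, 1].

```python
import random

def rescale(values):
    """Min-max rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def jitter(vector, scale, rng):
    """Return a noisy copy of a numeric vector (Gaussian noise, std = scale)."""
    return [v + rng.gauss(0.0, scale) for v in vector]

def flip_horizontal(image):
    """Mirror an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]

def augment(images, copies, rng):
    """Return the original images plus `copies` random variants of each."""
    out = list(images)
    for image in images:
        for _ in range(copies):
            # Flip half of the variants; copy the rest unchanged.
            if rng.random() < 0.5:
                variant = flip_horizontal(image)
            else:
                variant = [row[:] for row in image]
            # Apply a small random brightness shift, clipped to [0, 1].
            delta = rng.uniform(-0.1, 0.1)
            out.append([[min(1.0, max(0.0, p + delta)) for p in row]
                        for row in variant])
    return out

rng = random.Random(0)  # fixed seed so the run is repeatable
images = [[[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]]  # one hypothetical 2x3 image
augmented = augment(images, copies=3, rng=rng)
print(len(augmented))  # 4: the original plus three variants
```

In practice, a deep learning framework's own augmentation utilities would normally replace hand-written helpers like these.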
Feature Extraction Is Included In The Training:
One of the major reasons deep learning methods are gaining attention is that they learn features from data incrementally, layer by layer. The feature extraction process is trained together with the classification process. For example, in image recognition, a set of image filters or primitives is learned in the first layers of the classification network. AlexNet, for instance, takes raw images as input, without any hand-computed feature vector, and extracts the features itself. This eliminates the need for hand-crafted feature extraction, since feature extraction is part of the training itself.
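To make the idea of a learned filter concrete, here is a toy sketch in plain Python. It is not AlexNet and not a classification network: it fits a single one-dimensional filter by gradient descent on a mean squared error, so that the filter weights come from the data instead of being designed by hand. The function names, the hidden "edge detector" filter, the learning rate and the step count are all assumptions chosen for the illustration.

```python
import random

def correlate(signal, kernel):
    """Slide the kernel over the signal (fully overlapping positions only)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def learn_filter(signal, target, kernel_size, lr=0.3, steps=300):
    """Fit filter weights by gradient descent on the mean squared error."""
    weights = [0.0] * kernel_size
    n = len(target)
    for _ in range(steps):
        # Errors are computed once per step, so all weights update together.
        errors = [p - t for p, t in zip(correlate(signal, weights), target)]
        for j in range(kernel_size):
            grad = 2.0 / n * sum(errors[i] * signal[i + j] for i in range(n))
            weights[j] -= lr * grad
    return weights

rng = random.Random(0)  # fixed seed so the run is repeatable
signal = [rng.uniform(-1.0, 1.0) for _ in range(200)]

# Hypothetical "ideal" feature: a simple edge detector the learner never sees.
edge_detector = [-1.0, 0.0, 1.0]
target = correlate(signal, edge_detector)

learned = learn_filter(signal, target, kernel_size=3)
print([round(w, 3) for w in learned])
```

Because the target here is noise-free, the learned weights should end up very close to the hidden filter. A real network learns many such filters at once, jointly with the classifier that consumes them.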
Other methods work only with a predetermined set of data. If the data changes, they fail and have to be trained again. Deep learning adapts to such changes more readily. In its hidden layers, it analyses problems that are otherwise computationally difficult to solve. It reduces the need to build new feature extractors for every dataset. The system can learn from its mistakes and improve itself, thereby removing much of the challenge of the feature engineering step.
DL methods are always data hungry. They can fail when the dataset is small, or when the test set differs greatly from the training set. Some problems cannot be framed as classification problems at all. On such occasions, Gary Marcus of New York University says in his paper that deep learning becomes a square peg slammed into a round hole, a crude approximation when there must be a solution elsewhere. Hence, deep learning is not well suited to small datasets. Apart from that, these methods are very computationally expensive because of the large networks they use.
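One rough way to see where the computational cost comes from is to count the trainable parameters of a fully connected network. The sketch below does only that; the function name and the layer sizes are hypothetical, and real architectures (convolutional ones in particular) are counted differently.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Each pair of consecutive layers contributes n_in * n_out weights
    plus n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical layer sizes: a 784-value input and 10 output classes.
small = count_parameters([784, 32, 10])
large = count_parameters([784, 512, 512, 10])
print(small)  # 25450
print(large)  # 669706
```

Every one of these parameters has to be updated on each training step, which is part of why larger networks need more data and more compute.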
But the main objective of deep learning is to model the complex, hierarchical features in a dataset, and so far it has proved its worth in AI areas such as NLP, speech recognition, image analysis and recommendation engines.