Google has recently activated its patent for the IP known as ‘Dropout’, a technique widely used to regularize deep neural networks. The method reduces overfitting and offers a computationally cheap yet effective form of regularization.
The method is popular among data scientists and machine learning engineers, and has long been treated as public domain knowledge. However, Google, which acquired the patent rights to the technology in 2016, has now activated the patent.
What does this mean for creators of neural networks? What will happen if Google enforces this patent?
An Introduction To Dropout
To understand what Dropout does, it is important to identify the problem it was made to solve. Overfitting occurs when a neural network learns the training data too well. The model picks up on small details and noise in that data, and so it is unable to generalise what it has learned.
As a result, the model gives inaccurate predictions on new data: it has ‘overfitted’ the training dataset and cannot extend what it learned to other applications.
Dropout is a solution to this problem proposed in 2012 by Nitish Srivastava, Geoffrey Hinton and a few of their colleagues at the University of Toronto. Hinton is now an employee at Google, which led to the giant picking up the patent for the technology.
Dropout reduces the possibility of overfitting by ‘dropping out’ randomly chosen nodes during the training process. This means that, on any given training pass, a layer with dropped nodes looks and behaves like a smaller layer with different connectivity.
Each training pass is therefore performed with a different view of the layer, as a dropped unit has all of its incoming and outgoing connections temporarily removed. This deliberately makes the training process noisier, so that the network cannot rely on any single node.
Because nodes are dropped at random, the remaining nodes have to take on more responsibility for the inputs. Dropout targets the tendency of nodes to co-adapt, that is, to learn to correct mistakes made by the layers that came before them, which makes a network fragile and prone to overfitting.
Since Dropout thins the network during training, it is recommended for use on a network with a large number of nodes.
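The mechanism described above can be illustrated with a minimal sketch in plain Python. The function name `dropout` and its parameters are hypothetical, and the sketch assumes the common ‘inverted’ variant, in which surviving nodes are scaled up during training so that nothing needs to change at inference time. The article itself does not specify this detail.

```python
import random


def dropout(activations, drop_rate, training=True, rng=random):
    """Apply (inverted) dropout to one layer's outputs, given as a list of floats.

    Assumption: the 'inverted' variant, which rescales surviving nodes
    during training instead of rescaling weights at inference time.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if not training or drop_rate == 0.0:
        # At inference time every node is kept and nothing is rescaled.
        return list(activations)

    keep_prob = 1.0 - drop_rate
    # Draw an independent keep/drop decision for every node on every pass.
    mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in activations]
    # Dropped nodes output 0; survivors are scaled up so that the expected
    # output of each node is unchanged.
    return [a * m / keep_prob for a, m in zip(activations, mask)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    layer_output = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1]
    for step in range(3):
        # Each pass sees a different random subset of the layer.
        print(step, dropout(layer_output, drop_rate=0.5, rng=rng))
```

Each call draws a fresh mask, which is why every training pass sees a different view of the layer, while calling the function with `training=False` returns the outputs untouched.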
Advantages of Using Dropout
Dropout is used by almost every machine learning and data science practitioner because it is an extremely generic approach. It can be applied to most networks, and it often brings performance gains and lower error rates.
Almost all neural networks are compatible with Dropout, as it simply determines probabilistically which nodes to drop. Some adjustments have to be made for each type of network, for example using different dropout rates for the input and recurrent connections of a long short-term memory (LSTM) RNN.
Moreover, Dropout allows engineers to build a bigger neural network while worrying less about overfitting. The number of nodes can be increased to whatever level the developer feels is appropriate, and the network is then thinned out by Dropout during training.
This lets the network learn its weights in a more randomized fashion, so that adding nodes does not simply add overfitting. Dropout also works best when each layer has extra nodes, since some of them are dropped on every pass.
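The arithmetic behind widening a layer can be sketched as follows. This is only a rough illustration: the helper names `expected_active_nodes` and `nodes_needed` are hypothetical, and the sizing rule is simple expected-value reasoning rather than a guarantee of accuracy.

```python
import math


def expected_active_nodes(num_nodes, drop_rate):
    """Average number of nodes that survive a single training pass."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    return num_nodes * (1.0 - drop_rate)


def nodes_needed(target_active, drop_rate):
    """Layer width required so that, on average, `target_active` nodes survive."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    return math.ceil(target_active / (1.0 - drop_rate))


if __name__ == "__main__":
    print(expected_active_nodes(512, 0.5))  # 256.0
    print(nodes_needed(256, 0.5))           # 512
```

With a drop rate of 0.5, a layer has to be about twice as wide to keep the same number of nodes active on an average pass.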
Dropout is effective because it capitalizes on the self-correcting behaviour of neural networks during training: the network keeps adjusting to fill the gaps left by nodes that are temporarily non-functional.
There are a few factors to keep in mind, the primary one being to set a weight constraint on the nodes. In addition, Dropout pays off mainly on problems where overfitting is a real risk, such as training a large network on a limited dataset, since that is when regularization is needed.
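One common form of weight constraint is a max-norm constraint, which caps the length of each node’s incoming weight vector. The sketch below assumes that form, and the function name `apply_max_norm` and the limit used are purely illustrative. The article does not say which constraint is meant.

```python
import math


def apply_max_norm(incoming_weights, max_norm):
    """Rescale one node's incoming weights so their L2 norm is at most max_norm.

    Assumption: a max-norm constraint, applied after each weight update.
    """
    norm = math.sqrt(sum(w * w for w in incoming_weights))
    if norm <= max_norm:
        # Already within the limit: leave the weights unchanged.
        return list(incoming_weights)
    # Shrink every weight by the same factor, preserving the direction.
    scale = max_norm / norm
    return [w * scale for w in incoming_weights]


if __name__ == "__main__":
    weights = [3.0, 4.0]  # L2 norm is 5.0
    print(apply_max_norm(weights, max_norm=2.5))  # [1.5, 2.0]
```

Capping the norm keeps individual weights from growing very large, which matters because training under Dropout is noisy.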
Dropout’s Effect & Google’s Power
Since its rise to prominence in the early 2010s, Dropout has been used in many prominent neural networks. The most significant of them is, possibly, Alex Krizhevsky’s revolutionary AlexNet, which used Hinton’s method to regularize its fully connected layers.
Google’s power over the patent can mean many things. While the giant is clearly not going to hunt down every data scientist who uses the method, the patent is troubling for companies. It means that Google holds the IP for Dropout, and although it is hard to prove that a given network was trained using Dropout, the situation is still worrisome.
Google can enforce the patent against any future competitor, which is what it seems to be doing with another patent as well. For those unaware, Google holds the patent to ‘Processing images using deep neural networks’, just as Microsoft holds the patent to ‘Training CNNs on graphics processing units’.
While this is not a cause for worry at this exact moment, patents on mathematical concepts carry a generally negative ethical connotation. Enforcement of this IP seems to be reserved for the future.