A new scientific theory matters little until it is put to practical use, and deep learning (DL) is no exception. With continued research and development, the large body of knowledge in this field has found practical applications across disciplines and in new technology businesses.
Research in DL has taken an innovative turn. Rather than being used only to improve AI and ML technologies, DL is now being explored in critical areas such as healthcare and banking. We have listed the top research papers on DL that are worth reading and offer an interesting take on the subject. These papers were presented at the recently concluded International Conference on Learning Representations (ICLR) in Vancouver, Canada, in May 2018.
1. Spherical CNNs
Researchers at the University of Amsterdam have developed a variation of convolutional neural networks (CNNs) known as Spherical CNNs. These networks work with spherical images, that is, signals defined on the surface of a sphere rather than on a flat plane. For example, images from drones and autonomous cars generally cover many directions at once. Regular CNNs are designed for flat two-dimensional images, and a DL model that applies them to a flattened projection of spherical data may fail because the projection distorts the image. This is the problem Spherical CNNs were designed to address. In the paper, the researchers define a correlation for spherical signals and compute it efficiently with a generalised Fourier transform, using a generalised Fast Fourier Transform (FFT) algorithm. They then test the networks on tasks such as 3D model recognition and check for accuracy and effectiveness.
The concept of Spherical CNNs is still at a nascent stage, but this study is likely to influence the way CNNs are perceived and used. You can read the paper here.
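The role of the Fourier transform can be seen in a much simpler setting. The sketch below is not the spherical algorithm from the paper; it is a one-dimensional analogue on a circle, assuming a signal sampled at evenly spaced angles. It computes a circular correlation both directly and through a small radix-2 FFT, and checks that rotating the input simply rotates the output, which is the property Spherical CNNs carry over to the sphere. All function names are illustrative.

```python
import cmath


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT computed with the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def correlate_direct(signal, kernel):
    """c[s] = sum over t of signal[(t + s) mod n] * kernel[t]."""
    n = len(signal)
    return [sum(signal[(t + s) % n] * kernel[t] for t in range(n)) for s in range(n)]


def correlate_fft(signal, kernel):
    """Same correlation, computed as a product in the Fourier domain."""
    f, g = fft(signal), fft(kernel)
    return [v.real for v in ifft([a * b.conjugate() for a, b in zip(f, g)])]


def rotate(signal, r):
    """Rotate a circular signal by r samples."""
    n = len(signal)
    return [signal[(t - r) % n] for t in range(n)]


signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 0.5, 4.0]
kernel = [1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]

direct = correlate_direct(signal, kernel)
fast = correlate_fft(signal, kernel)
# Correlating a rotated input should give a rotated copy of the output.
rotated_out = correlate_fft(rotate(signal, 3), kernel)
```

The direct version costs on the order of n squared operations and the FFT version on the order of n log n, which is why a fast transform matters once the signals become large.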
2. Can Recurrent Neural Networks Warp Time?
Not just ML and AI researchers but sci-fi enthusiasts too may be drawn in by the title, although the time warping studied here is a stretching or compression of the time axis in sequential data, not time travel. In a research paper by Corentin Tallec, researcher at University of Paris-Sud, and Yann Ollivier, researcher at Facebook AI, the authors examine how recurrent neural networks such as Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) networks cope with such warping. They analyse the gating mechanisms in these models and show that gates arise naturally when a model is required to be robust to time warping. The authors also propose a new technique called ‘Chrono Initialisation’, which sets the gate biases of LSTMs and GRUs according to the expected range of time dependencies in the data. This interesting paper can be read here.
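As a rough illustration of chrono initialisation for an LSTM, the sketch below draws each forget-gate bias as the logarithm of a uniform sample between 1 and T_max - 1 and sets the input-gate bias to its negative, following the scheme described in the paper. The function names, the `t_max` argument and the use of plain Python lists are assumptions made for the example; `t_max` stands for the longest time dependency one expects in the data.

```python
import math
import random


def chrono_init(hidden_size, t_max, seed=0):
    """Return (forget_bias, input_bias) lists for one LSTM layer.

    Each forget bias is log(u) with u drawn uniformly from [1, t_max - 1];
    each input bias is the negative of the matching forget bias.
    """
    if t_max <= 2:
        raise ValueError("t_max should be larger than 2")
    rng = random.Random(seed)
    forget_bias = [math.log(rng.uniform(1.0, t_max - 1.0)) for _ in range(hidden_size)]
    input_bias = [-b for b in forget_bias]
    return forget_bias, input_bias


def memory_timescale(forget_bias):
    """Approximate number of steps a unit with this bias retains information.

    With the forget gate resting at f = sigmoid(bias), memory decays roughly
    like f ** t, giving a characteristic time of 1 / (1 - f) = 1 + exp(bias).
    """
    return 1.0 + math.exp(forget_bias)


forget_bias, input_bias = chrono_init(hidden_size=8, t_max=100)
timescales = [memory_timescale(b) for b in forget_bias]
```

With this initialisation the units start with memory timescales spread between 2 and `t_max` steps, so some units forget quickly while others hold information for long stretches.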
3. Learning How To Explain Neural Networks: PatternNet And PatternAttribution
We are yet to fully understand why neural networks behave the way they do. Complex ML systems have intricate details that sometimes astonish researchers. Even though there are methods that try to explain neural networks, it is at times difficult to establish relationships in DL models. In this paper, scholars at the Technical University of Berlin, in association with researchers at Google Brain, show that several popular explanation methods do not give correct explanations even for simple linear models. They present two techniques, PatternNet and PatternAttribution, that are sound for linear models and improve explanations for deep networks. The paper discusses a host of previously established concepts such as signal estimators, gradients and saliency maps, among others. You can read the paper here.
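The difficulty with linear models can be reproduced in a few lines. The sketch below follows the spirit of the two-dimensional toy example used in the paper: the input is made of a signal direction that carries the target y and a distractor direction that carries only noise. The sample size, noise level and function name are assumptions chosen for illustration. The weight vector that recovers y perfectly does not point along the signal direction, whereas the covariance between input and output, divided by the output's variance, does.

```python
import random

rng = random.Random(0)
SIGNAL_DIR = (1.0, 0.0)      # direction that carries the target y
DISTRACTOR_DIR = (1.0, 1.0)  # direction that carries only noise
WEIGHTS = (1.0, -1.0)        # linear model that cancels the distractor

ys, xs = [], []
for _ in range(20000):
    y = rng.uniform(-1.0, 1.0)
    eps = rng.gauss(0.0, 0.5)
    x = tuple(s * y + d * eps for s, d in zip(SIGNAL_DIR, DISTRACTOR_DIR))
    ys.append(y)
    xs.append(x)

# The model output w . x equals y, up to floating-point error.
outputs = [sum(w * xi for w, xi in zip(WEIGHTS, x)) for x in xs]
max_error = max(abs(o - y) for o, y in zip(outputs, ys))


def estimate_pattern(inputs, targets):
    """Per-dimension cov(x, y) / var(y): a linear estimate of the signal direction."""
    n = len(targets)
    mean_y = sum(targets) / n
    var_y = sum((t - mean_y) ** 2 for t in targets) / n
    pattern = []
    for dim in range(len(inputs[0])):
        mean_x = sum(x[dim] for x in inputs) / n
        cov = sum((x[dim] - mean_x) * (t - mean_y) for x, t in zip(inputs, targets)) / n
        pattern.append(cov / var_y)
    return pattern


pattern = estimate_pattern(xs, outputs)
```

Here `WEIGHTS` is (1, -1) while `pattern` comes out close to (1, 0), the signal direction. Reading the weights (or the gradient, which equals the weights in a linear model) as an explanation would wrongly suggest that the second input dimension carries information about y.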
4. Lifelong Learning With Dynamically Expandable Networks
Lifelong learning is a concept put forward by Sebastian Thrun in the book Learning to Learn. He offered a different perspective on conventional ML: instead of ML algorithms learning one single task, he emphasised machines taking a lifelong approach in which they learn a variety of tasks over time. Based on this, researchers from KAIST and Ulsan National Institute of Science and Technology developed a novel deep network architecture called Dynamically Expandable Network (DEN), which can dynamically adjust its network capacity as it trains on a series of tasks while sharing the requisite knowledge between them. DEN was tested on public datasets such as MNIST, CIFAR-100 and AWA for accuracy and efficiency. The paper evaluates its main components, including selective retraining, network expansion, and network split/duplication with timestamping. The paper can be read here.
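To make the idea of expanding capacity concrete, here is a deliberately simplified sketch. It is not the DEN algorithm: it only shows how extra hidden units can be appended to a small two-layer network while leaving its existing behaviour intact, under the assumption (ours, not the paper's) that the new units start with zero outgoing weights. Selective retraining, removal of unneeded units and split/duplication are left out, and all names are hypothetical.

```python
import random


def forward(w_hidden, w_out, x):
    """Two-layer network with ReLU hidden units; weights are lists of rows."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def expand_hidden(w_hidden, w_out, extra_units, rng):
    """Return new weights with `extra_units` hidden units appended.

    Incoming weights of the new units are small random values; outgoing
    weights are zero, so outputs on earlier tasks are unchanged until
    training on a new task updates them.
    """
    n_inputs = len(w_hidden[0])
    new_rows = [[rng.gauss(0.0, 0.01) for _ in range(n_inputs)] for _ in range(extra_units)]
    grown_hidden = [list(row) for row in w_hidden] + new_rows
    grown_out = [list(row) + [0.0] * extra_units for row in w_out]
    return grown_hidden, grown_out


rng = random.Random(0)
w_hidden = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(4)]  # 4 hidden units
w_out = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]     # 2 outputs
x = [0.5, -1.0, 2.0]

before = forward(w_hidden, w_out, x)
grown_hidden, grown_out = expand_hidden(w_hidden, w_out, extra_units=2, rng=rng)
after = forward(grown_hidden, grown_out, x)
```

The network grows from four to six hidden units, yet `before` and `after` agree, which is the basic requirement for adding capacity for a new task without disturbing earlier ones.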
5. Wasserstein Auto-Encoders
Autoencoders are neural networks used for dimensionality reduction and are also popular as generative models. One type of autoencoder that has found wide application in the image and text space is the variational autoencoder (VAE). Now, scholars from Max Planck Institute for Intelligent Systems, Germany, in collaboration with scientists from Google Brain, have come up with the Wasserstein Auto-Encoder (WAE), which uses the Wasserstein distance to train a generative model. The aim is to minimise an optimal transport cost between the model distribution and the data distribution. In the authors' experiments, WAE kept the stable training of VAE while generating samples of better quality. This is a notable improvement over earlier autoencoders. Readers can go through the paper here.
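For intuition about the distance itself, the sketch below computes the Wasserstein-1 distance between two one-dimensional samples of equal size, a special case in which optimal transport reduces to matching sorted values. This only illustrates the metric, not the WAE training objective, and the function name is an assumption.

```python
def wasserstein_1d(sample_a, sample_b):
    """Wasserstein-1 distance between two equal-size 1D samples.

    In one dimension the cheapest way to move one sample onto the other
    pairs the i-th smallest value of each, so sorting is enough.
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(a - b) for a, b in pairs) / len(sample_a)


data = [0.0, 1.0, 2.0, 3.0]
shifted = [v + 2.0 for v in data]      # same shape, moved by 2
shuffled = [3.0, 0.0, 2.0, 1.0]        # same values, different order

print(wasserstein_1d(data, shifted))   # 2.0
print(wasserstein_1d(data, shuffled))  # 0.0
```

The distance grows with how far the mass has to travel, so it stays informative even when two distributions do not overlap at all.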
All of these papers present a unique perspective on advancements in deep learning. The novel methods also open up diverse avenues for DL research. Machine learning and artificial intelligence enthusiasts can learn a lot from them about the latest techniques developed in research.