As 2020 comes to an end, deep learning has turned out to be one of the most talked-about terms of the decade. Though its core foundations were laid many years ago, with papers on neural networks appearing since the late 1980s and the LSTM architecture following in 1997, the dots didn’t connect in practice until the early years of this decade. The revolution really began in 2012 with the AlexNet paper, which proposed a multi-layer neural network architecture that used operations like convolution and max-pooling to classify images in the ImageNet dataset.
Before we talk about AlexNet, let’s briefly discuss the ImageNet dataset. It grew out of an effort led by Fei-Fei Li (now at Stanford) and her colleagues. The aim was to create a large-scale, fully labelled image classification dataset with classes derived from WordNet. This dataset is what has predominantly enabled researchers to design deep learning architectures that set a new state of the art year after year, eventually surpassing reported human performance on the benchmark as well.
Before AlexNet in 2012, most neural network architectures were only good enough to classify smaller, simpler datasets like MNIST. So what made AlexNet exceptionally successful, given that the foundational concepts had already been available for years? Two factors answer this question: first, the availability of large-scale training datasets like ImageNet and CIFAR, and second, the availability of GPU-based compute resources. Training deep neural nets takes a lot of compute power, all the more so on a dataset as large as ImageNet. Training on traditional CPUs took so long that neural networks were an impractical choice for image classification models. AlexNet changed this by training on GPUs, which were just becoming powerful enough at the time, and by splitting the network across two GPUs to parallelize the work.
AlexNet’s architecture also had several other features that remain an inherent part of many deep learning architectures to this day, including the ReLU activation function and max-pooling operations.
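To make those two building blocks a little more concrete, here is a minimal NumPy sketch of ReLU and max-pooling applied to a tiny feature map. It is only an illustration, not AlexNet’s actual code: the function names `relu` and `max_pool_2x2` and the sample values are made up for this example, and the pooling here uses simple non-overlapping 2x2 windows, whereas the AlexNet paper used overlapping 3x3 windows with stride 2.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keep positive values, replace negatives with zero."""
    return np.maximum(x, 0)

def max_pool_2x2(x):
    """Non-overlapping 2x2 max-pooling on a 2D array with even height and width."""
    h, w = x.shape
    # Group the pixels into 2x2 blocks, then keep the largest value in each block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A made-up 4x4 feature map, standing in for the output of a convolution.
feature_map = np.array([
    [ 1.0, -2.0,  3.0,  0.5],
    [-1.0,  4.0, -3.0,  2.0],
    [ 0.0, -5.0,  6.0, -1.0],
    [ 2.0,  1.0, -4.0, -2.0],
])

activated = relu(feature_map)     # negatives become 0
pooled = max_pool_2x2(activated)  # 4x4 shrinks to 2x2
print(pooled)
```

ReLU is cheap to compute and does not flatten out for large positive inputs, while pooling shrinks the feature map so that later layers have less to process.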
Researchers like Geoffrey Hinton, Yann LeCun, Andrew Ng, Yoshua Bengio, Ian Goodfellow, David Silver and many more are today leading research on different fronts of deep learning. Deep learning plays a role in almost every domain of AI, from computer vision and natural language processing to multi-agent learning and recommendation systems. It powers everything from the face unlock on your phone to the show recommendations you get on Netflix. If you are interested in studying deep learning, the Deep Learning book by Ian Goodfellow, Yoshua Bengio and Aaron Courville is a great place to start, but if you just want to stay updated and listen to the people who are making waves in this field, I would highly recommend the Lex Fridman podcast.
I might continue this article at some point, and will try to make it comprehensive yet concise. Till then, adiós, hasta la próxima!