HTC Education Series: Deep Learning with PyTorch Basics - Lesson 5

After a short holiday break (and software release crunch time, oh joy), let's continue with lesson 5 of our HTC Deep Learning with PyTorch Basics course.  As usual for this course, we'll start with the fifth lecture in the Jovian.ai course 'Deep Learning with PyTorch: Zero to GANs'.  And what a lecture it is: the best so far in the series (so we look forward with anticipation to lesson 6 and GANs).

In this jam-packed lesson, you will learn how to code a state-of-the-art deep learning model from scratch in PyTorch (how far we have come in five short lessons).  We'll also pick up very useful techniques like data augmentation, regularization, and residual layers.
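As a quick taste of two of those techniques, here is a minimal sketch of how data normalization and augmentation are typically wired up as torchvision transforms. The channel statistics below are the commonly quoted CIFAR10 values, used purely for illustration; in practice you compute them from your own training set.

import torchvision.transforms as T

# Channel-wise mean and standard deviation of the training images
# (CIFAR10 values shown here for illustration only)
stats = ((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))

# Augmentation + normalization for training images
train_tfms = T.Compose([
    T.RandomCrop(32, padding=4, padding_mode='reflect'),  # random shifts
    T.RandomHorizontalFlip(),                              # random mirroring
    T.ToTensor(),
    T.Normalize(*stats),                                   # data normalization
])

# Validation/test images are only normalized, never augmented
valid_tfms = T.Compose([T.ToTensor(), T.Normalize(*stats)])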

Plus we'll be using the Adam optimizer rather than stochastic gradient descent. Adam uses techniques like momentum and adaptive learning rates for faster training.
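Swapping optimizers in PyTorch is a one-line change. A minimal sketch (the tiny linear model here is just a placeholder, not the lesson's network):

import torch

model = torch.nn.Linear(10, 2)  # placeholder model

# Plain stochastic gradient descent, as used in the earlier lessons
sgd = torch.optim.SGD(model.parameters(), lr=0.01)

# Adam combines momentum with per-parameter adaptive learning rates
adam = torch.optim.Adam(model.parameters(), lr=0.001)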

And we'll learn about weight decay, gradient clipping, and learning rate scheduling.
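Here is a rough sketch of how those three pieces slot into a training loop, using toy data and a tiny linear model as stand-ins for the lesson's dataset and network. The one-cycle scheduler and value-based gradient clipping shown here are one reasonable combination; other schedulers and clipping strategies plug in the same way.

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model so the sketch runs end to end
model = torch.nn.Linear(10, 2)
data = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
train_loader = DataLoader(data, batch_size=16, shuffle=True)
max_lr, epochs, grad_clip = 0.01, 2, 0.1

# Weight decay is just another optimizer argument
optimizer = torch.optim.Adam(model.parameters(), lr=max_lr, weight_decay=1e-4)

# One-cycle learning rate scheduling, stepped once per batch
sched = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr, epochs=epochs, steps_per_epoch=len(train_loader))

for epoch in range(epochs):
    for xb, yb in train_loader:
        loss = torch.nn.functional.cross_entropy(model(xb), yb)
        loss.backward()
        # Gradient clipping keeps any single update from blowing up
        torch.nn.utils.clip_grad_value_(model.parameters(), grad_clip)
        optimizer.step()
        optimizer.zero_grad()
        sched.step()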

Let's get started.

What was covered in the lecture

data normalization

data augmentation

residual connections

batch normalization

learning rate scheduling

weight decay

gradient clipping

adam optimizer

how to train your state of the art model

transfer learning


Don't forget to check out the associated Jupyter notebooks

ResNets, regularization and data augmentation - here

SimpleCNN Starter Notebook - here

Transfer Learning with CNNs - here

Image classification with CNNs - here


Additional HTC Material

1.  Now that you are familiar with some activation and loss functions used in deep learning, you're ready for Yann LeCun's deep-dive lecture on PyTorch Activation and Loss Functions. Collect them all for your PyTorch programming toolbox.


If you watch some of Yann's other lectures from this NYU Deep Learning course, you'll notice he presents a fascinating unified theory of deep learning architectures, all built on energy-based models (EBMs).  The second half of this lecture covers margin-based loss functions for EBMs.

2.  There are a number of blog posts associated with this lesson that you should definitely check out.  They provide additional information on the new techniques covered in this lecture.

Why and how residual blocks work - here

Batch normalization and dropout explained - here

ResNet9 architecture in all its detail - here

Learning rate scheduling (via Sylvain of fastai) - here

Weight decay regularization technique - here

Gradient clipping - here

Hint: if you run into the Medium and Towards Data Science article limits, you can get around them by restarting your browser and using private/incognito browsing mode.


Observations

1.  The Jovian lectures started off a little too basic for my taste (though perhaps perfect for yours if you are just getting started with all of this).  But they really pulled things together for this lesson 5 lecture. It presents a wide variety of advanced topics in a way that is clear and easy to comprehend (not always the case in a course like this).

2.  Your PyTorch toolbox of reusable deep learning components has grown over the lessons.  Think about the pieces you used in this lesson to build your (almost) state-of-the-art ResNet9 architecture for image classification; a rough sketch of how they fit together follows this list.

convolutional layer

max pooling layer

fully connected layer

residual blocks

batch normalization
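Here is a minimal sketch of how those pieces combine into the convolutional and residual blocks of a ResNet9-style model. The channel sizes and layer ordering below are one common arrangement; consult the notebook linked above for the exact architecture used in the lesson.

import torch
import torch.nn as nn

def conv_block(in_channels, out_channels, pool=False):
    # Convolution + batch normalization + ReLU, optionally followed by max pooling
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
              nn.BatchNorm2d(out_channels),
              nn.ReLU(inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class ResidualBlock(nn.Module):
    # Two conv blocks whose output is added back to the input (the skip connection)
    def __init__(self, channels):
        super().__init__()
        self.conv1 = conv_block(channels, channels)
        self.conv2 = conv_block(channels, channels)

    def forward(self, x):
        return self.conv2(self.conv1(x)) + x

# Quick shape check on a fake CIFAR10-sized feature map
x = torch.randn(4, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([4, 64, 32, 32])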


Need to review something from the previous lessons in the course?
No problem.


You can access the first lesson here.

You can access the second lesson here.

You can access the third lesson here.

You can access the fourth lesson here.

You just completed lesson 5 here.

Lesson 6 posts next Monday (barring a software release crisis in my other life).

Keep in mind that HTC courses are works in progress until the final lesson posts, so previously posted lessons (including this one) may be tweaked as the course progresses.
