HTC Education Series: Deep Learning with PyTorch Basics - Lesson 1

This is the second course in the HTC Education Series. It focuses on learning PyTorch and beginning deep learning. PyTorch is the HTC-recommended framework for programming deep learning neural nets (as opposed to using Keras and TensorFlow), so we thought it would be good to put together a very basic beginner's course on getting started with PyTorch programming.

It's based on the Jovian.ai course called 'Deep Learning with PyTorch: Zero to GANs'. Each lesson has an associated video lecture and an executable Jupyter notebook, hosted on Jovian. There are several different options for running your Jupyter notebooks, but we recommend using Colab; Jovian makes this very easy to do.

We will be working through the Jovian course lessons in this HTC Education Series course as they come out. A new lesson will be presented here each Monday until the course is finished. One nice thing about this particular course is that Jovian provides a really nice environment for working with your Jupyter notebooks, learning the course lessons, and writing your own code (it allows you to spin off the computation onto Google's free Colab service).

The course is free.


The HTC philosophy is that looking at educational material from several different viewpoints is a great way to better learn a subject area. There is already a different HTC course on 'Getting Started with Deep Learning', based on the fastai API, available here.

Fastai is built on top of PyTorch, and the HTC course based on it covers a lot more material (some of it more advanced as well). You are exposed to PyTorch code when you watch the fastai lectures, but it's kind of assumed you will just pick it up on your own. This Jovian-based course is more basic than the other course, and really focuses on the 'learn PyTorch programming' part of the deep learning equation.

There are other frameworks one could use to build deep learning systems, Keras (which is built on top of TensorFlow) being the obvious other candidate. I have to be honest: PyTorch is the one to go with in my opinion. It's quickly becoming the major framework for research and development, and has some great deployment options.

This course assumes you know Python. If you have a programming background but don't know Python, fear not; you'll pick it up very quickly just by looking at the code. If you don't have a programming background, then you have an initial hump to get over as you get comfortable with the concept of writing code.

If you are just interested in learning more about deep learning to educate yourself, and aren't really interested in becoming a coding deep learning practitioner, you should really focus on our 'Getting Started with Deep Learning' course here.  It gives a great background of what deep learning is all about, and also includes topics like ethical considerations of deep learning and AI systems.


What was covered

How to work with Jupyter Notebooks

Linear regression

    simplest neural net

Gradient descent

Building a model

Loss function

Training a model (using the loss function)

    uses gradient descent to adjust model parameters (weights)

    training loop (see the code sketch after this list)

        1- generate predictions

        2- calculate the loss

        3- compute gradients with respect to the parameters (weights and biases) in model

        4- adjust weights by subtracting a quantity proportional to the gradient

        5- reset gradient to zero

Dataset and DataLoader

Optimizer

    Stochastic Gradient Descent (SGD)
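Here is a minimal sketch of how these pieces fit together in PyTorch. The toy data below is made up for illustration, and the shapes and hyperparameters are arbitrary; the loop body mirrors the five training-loop steps listed above.

    import torch
    import torch.nn as nn
    from torch.utils.data import TensorDataset, DataLoader

    # Made-up toy data: 100 samples, 3 input features, 1 target value
    inputs = torch.randn(100, 3)
    targets = torch.randn(100, 1)

    # Dataset and DataLoader handle batching and shuffling for us
    train_ds = TensorDataset(inputs, targets)
    train_dl = DataLoader(train_ds, batch_size=10, shuffle=True)

    model = nn.Linear(3, 1)   # linear regression model
    loss_fn = nn.MSELoss()    # loss function
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # SGD optimizer

    for epoch in range(100):
        for xb, yb in train_dl:
            preds = model(xb)          # 1- generate predictions
            loss = loss_fn(preds, yb)  # 2- calculate the loss
            loss.backward()            # 3- compute gradients w.r.t. the parameters
            opt.step()                 # 4- adjust weights using the gradients
            opt.zero_grad()            # 5- reset gradients to zero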


Additional HTC Material

1:  I think it's important to have some understanding of why you are even taking this course in the first place. What is deep learning, what is its history, what can you do with it, and why do you want to learn this stuff? There are a number of different 'what is deep learning' video lectures on the HTC site (our Getting Started with Deep Learning course, for example, has two of them from different sources (fastai and MIT) in the first lesson).

But for the purposes of this 'focus on PyTorch for deep learning' HTC course, we will go to the source: Yann LeCun at NYU and the first lecture in the 2020 NYU Deep Learning course, titled 'History, Motivation, and Evolution of Deep Learning'.


Ignore and/or skip the very beginning, where they talk about specific things for their course. Then settle in and learn about the history of deep learning from a major contributor in the field.


2: PyTorch's tensors are an alternative representation of NumPy arrays. NumPy is an open source Python library for numerical array computing. Tensors in PyTorch provide several important features that are not in NumPy, listed below.

    Autograd - the ability to automatically compute gradients for tensor operations.

    GPU Support - the ability to easily move tensor data to a GPU for fast computation.

GPUs (Graphics Processing Units) can be thought of as specialized multi-core processors. They were originally developed to compute shader graphics on personal computers, but they can also be used to speed up neural net computations. The current revolution in deep learning is largely powered by the availability of GPU processing resources.

Note that it's extremely easy to go back and forth between NumPy arrays and PyTorch tensors:

        torch.from_numpy() and Tensor.numpy()
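Here's a minimal sketch of the round trip, plus the two tensor features listed above (the values are arbitrary):

    import numpy as np
    import torch

    # NumPy array -> PyTorch tensor (shares the same underlying memory)
    arr = np.array([1.0, 2.0, 3.0])
    t = torch.from_numpy(arr)

    # PyTorch tensor -> NumPy array
    arr2 = t.numpy()

    # Autograd: track operations on a tensor and compute gradients
    x = torch.tensor(2.0, requires_grad=True)
    y = x ** 2 + 3 * x
    y.backward()            # dy/dx = 2x + 3
    print(x.grad)           # tensor(7.)

    # GPU support: move a tensor to the GPU if one is available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    t_gpu = torch.randn(3).to(device)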

Here's a very interesting review paper in Nature on 'Array Programming with NumPy'.

3:  A great resource for learning PyTorch is the PyTorch website.  There are a number of very good tutorials there.  The PyTorch documentation is there as well.

4:  Deep learning neural net systems can be thought of as Software 2.0, with Software 1.0 being conventional programming by hand. Software 2.0 is software whose behavior is adaptively learned from data.

Andrej Karpathy (head of Tesla's self-driving car effort) has written a good blog post on Software 2.0. Note that current implementations of Software 2.0 still require someone to write the learning system itself using Software 1.0 (which is what you are learning in this course).


Observations

1:  Gradient descent is the key component of adaptive systems that learn. You can think of it as a method for moving downhill on a landscape to get to its lowest point.
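To make the 'moving downhill' picture concrete, here is a tiny sketch that descends the made-up one-variable loss (w - 5)^2, whose lowest point is at w = 5 (the function, starting point, and learning rate are all arbitrary choices for illustration):

    import torch

    w = torch.tensor(0.0, requires_grad=True)  # start somewhere on the landscape
    lr = 0.1                                   # step size (learning rate)

    for step in range(50):
        loss = (w - 5) ** 2     # height of the landscape at w
        loss.backward()         # slope (gradient) at w
        with torch.no_grad():
            w -= lr * w.grad    # step downhill, proportional to the gradient
            w.grad.zero_()      # reset the gradient for the next step

    print(w.item())  # approaches 5.0, the bottom of the landscape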

2:  I think it's important to start to think about individual processing blocks (that correspond to layers inside of the neural net) you can work with to construct deep learning neural nets.  

You just learned one in this lesson: the Linear layer, or nn.Linear in PyTorch.

This is the simplest neural net layer one could construct.  It has no non-linearities in it.

Additional non-linear layers were alluded to at the end of the lecture.
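As a small illustration (the layer sizes here are arbitrary), nn.Linear computes y = x @ W.T + b, with the weights and biases as learnable parameters:

    import torch
    import torch.nn as nn

    layer = nn.Linear(in_features=3, out_features=2)  # 3 inputs -> 2 outputs

    print(layer.weight.shape)  # torch.Size([2, 3]) -- learnable weights
    print(layer.bias.shape)    # torch.Size([2])    -- learnable biases

    x = torch.randn(5, 3)      # batch of 5 samples, 3 features each
    y = layer(x)               # y = x @ W.T + b
    print(y.shape)             # torch.Size([5, 2])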

3:  You might be wondering what Jovian is all about. Obviously they are into providing free education courses and a community learning platform for machine learning.

Their ultimate goal is to become the de-facto tool for the data science community to share and collaborate on data science projects online, similar to what platforms like GitHub have done for open source software.

They are a small company based in San Francisco with a secondary presence in Bengaluru, India. They are funded by Arka Venture Labs (and some other angel investors). Here's an article that discusses their plans.


You can access the next lesson here.

Keep in mind that HTC courses are works in progress until the final lesson post, so previously posted lessons may be tweaked as the course progresses.
