HTC Education Series: Getting Started with Deep Learning - Lesson 7

Another exciting lesson in HTC's 'Getting Started with Deep Learning' course begins.  Following our usual lesson layout, we will start with the fastai Part 1 2020 Lesson 7 lecture.  This lecture continues last week's extremely fascinating discussion of how to generate a latent space associated with a collaborative filtering model.

We then move on to the topic of tabular data.

We will take a diversion into random forests, where we will learn more about them than one might have expected.  This serves as a lead-up to the notion of using neural networks for tabular data.

You can also watch this fastai video lecture on the fastai course site here. The advantage is that the site gives you access to a searchable transcript, interactive notebooks, setup guides, questionnaires, etc.



What is covered in this lecture?

regularization
weight decay - L2 regularization
we need to wrap a tensor with nn.Parameter() in PyTorch to make it a learnable parameter
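A rough sketch of both of those points, weight decay and nn.Parameter() (the toy dot-product model and numbers here are made up for illustration; assumes PyTorch):

```python
import torch
import torch.nn as nn

class DotProduct(nn.Module):
    # Wrapping a plain tensor in nn.Parameter() registers it with the
    # module, so the optimizer will see it and update it during training.
    def __init__(self, n_users, n_movies, n_factors):
        super().__init__()
        self.user_factors = nn.Parameter(torch.randn(n_users, n_factors) * 0.01)
        self.movie_factors = nn.Parameter(torch.randn(n_movies, n_factors) * 0.01)

    def forward(self, user_idx, movie_idx):
        return (self.user_factors[user_idx] * self.movie_factors[movie_idx]).sum(dim=1)

model = DotProduct(n_users=10, n_movies=20, n_factors=5)
preds = model(torch.tensor([0, 1]), torch.tensor([3, 4]))

# Weight decay = L2 regularization: add wd * sum(w**2) to the loss...
wd = 0.1
loss = preds.pow(2).mean() + wd * sum(p.pow(2).sum() for p in model.parameters())

# ...or, equivalently, let the optimizer apply it via its weight_decay argument
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=wd)
```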

PCA - principal component analysis
50 dimensional vector space mapped into 3D vector space
    distance in the reduced dimensionality latent space is useful, has meaning
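A minimal sketch of that reduction with scikit-learn's PCA (the 'embeddings' here are random stand-ins for the learned movie factors):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 50))  # stand-in: 100 movies, 50 latent factors

pca = PCA(n_components=3)
reduced = pca.fit_transform(embeddings)  # shape (100, 3)

# Distances in the reduced-dimensionality space can still be compared meaningfully
d = np.linalg.norm(reduced[0] - reduced[1])
```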

cosine similarity (the dot product of normalized vectors) - used to find movies that are similar
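A quick sketch of that similarity lookup (hypothetical random embeddings standing in for learned movie factors; assumes PyTorch):

```python
import torch
import torch.nn.functional as F

movie_factors = torch.randn(100, 50)  # hypothetical learned movie embeddings
target = movie_factors[42]

# Cosine similarity is the dot product of L2-normalized vectors
sims = F.cosine_similarity(movie_factors, target[None], dim=1)
most_similar = sims.argsort(descending=True)[1:6]  # top 5, skipping the movie itself
```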

neural net version of collaborative filtering is implemented using a fastai tabular neural model

categorical vs continuous data

can think of 'one hot encodings' as embeddings
    the models can learn something about what Germany looks like just by looking at the purchasing behavior of people who live there.
    model can learn information about the world the data was generated in (via the embedded features)
    collaborative filtering is just 2 vectors?

deep learning neural net approach to tabular data allows N vectors (not restricted to just 2)
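Here is a rough plain-PyTorch sketch of that idea (fastai's actual tabular model is more elaborate; the columns here are invented): one embedding per categorical column, all of them concatenated with the continuous columns and fed to an MLP.

```python
import torch
import torch.nn as nn

class TabularNet(nn.Module):
    """One embedding vector per categorical column, concatenated with
    the continuous features and passed through a small MLP."""
    def __init__(self, cardinalities, n_cont, emb_dim=4, hidden=32):
        super().__init__()
        self.embeds = nn.ModuleList(nn.Embedding(c, emb_dim) for c in cardinalities)
        n_in = emb_dim * len(cardinalities) + n_cont
        self.mlp = nn.Sequential(nn.Linear(n_in, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x_cat, x_cont):
        embs = [emb(x_cat[:, i]) for i, emb in enumerate(self.embeds)]
        return self.mlp(torch.cat(embs + [x_cont], dim=1))

# e.g. day-of-week (7 values) and month (12 values) plus 3 continuous columns
model = TabularNet(cardinalities=[7, 12], n_cont=3)
x_cat = torch.stack([torch.randint(0, 7, (8,)), torch.randint(0, 12, (8,))], dim=1)
out = model(x_cat, torch.randn(8, 3))  # one prediction per row
```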

alternate older approaches to dealing with tabular data
    ensembles of decision trees
    random forests
    gradient boosting machines

Scikit-learn and Pandas libraries
    popular for tabular data; can be used for random forest and decision tree work (which doesn't really use PyTorch's math acceleration features)

what automatic procedure generates a decision tree that does better than random choices?

shows how to create a random forest algorithm from scratch
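The core of that from-scratch construction can be sketched like this (using scikit-learn's decision trees as the building block; the data is synthetic):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)

# A random forest is just many trees, each fit on a bootstrap sample
# of the rows, with each split considering a random subset of features
trees = []
for _ in range(20):
    idx = rng.integers(0, len(X), len(X))  # sample rows with replacement
    tree = DecisionTreeRegressor(max_features="sqrt", random_state=0)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# The forest's prediction is the average of the trees' predictions
preds = np.mean([t.predict(X) for t in trees], axis=0)
```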

out-of-bag error (OOB)
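Because each tree only sees a bootstrap sample, the rows it missed give a free validation set. scikit-learn exposes this directly (synthetic data for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

# oob_score=True scores each row using only the trees that did NOT
# see that row during training (its "out-of-bag" trees)
rf = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
rf.fit(X, y)
print(rf.oob_score_)  # R^2 on the out-of-bag predictions
```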

model interpretation - 5 step analysis procedure

feature importance calculator
    allows removal of non relevant columns of data in model
    allows removal of redundant features
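A minimal sketch of that with scikit-learn (the columns here are invented; 'noise' carries no signal, so it should rank low):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
df = pd.DataFrame({"useful": rng.normal(size=500), "noise": rng.normal(size=500)})
y = 3 * df["useful"] + rng.normal(scale=0.1, size=500)

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(df, y)

# feature_importances_ sums to 1; columns with near-zero importance
# are candidates for removal from the model
fi = pd.Series(rf.feature_importances_, index=df.columns).sort_values(ascending=False)
```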

Random forests can't predict future data (they can't extrapolate to predict future trends)
    deep learning models can do extrapolation
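This extrapolation failure is easy to demonstrate on toy data: a tree's prediction is an average of training targets in a leaf, so it can never go beyond the training range.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.arange(0, 50, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel()  # a simple upward trend: y = 2x

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

in_range = rf.predict([[25.0]])[0]   # fine: close to the true value 50
# Beyond the training range the trees can only return leaf averages
# they saw during training, so the prediction flatlines near max(y)
beyond = rf.predict([[100.0]])[0]    # far below the true value 200
```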

Random forests are extremely popular, and often give fairly good performance
    so you should test against them to make sure your model isn't doing worse

Boosting

Entity embeddings can improve existing methods if combined together



Additional HTC Course Material

1. We've been heavily emphasizing feature visualization of deep learning neural networks as a very important concept, both for understanding how deep learning systems really work and for driving future developments in deep learning image and computer vision processing.

There is a great lecture by Andrej Karpathy on 'Visualization, Deep Dream, Neural Style, and Adversarial Examples'.  It's from the 2016 Stanford CS231 course.  

You have been exposed to all of this before, starting in an early fastai lecture that briefly talked about feature visualization, and then in Xander's various 'get pumped' presentations we have included in previous HTC lessons.

After watching this great lecture, I feel you are going to start to get a much better understanding of how all of these topics relate, and will continue to develop a better intuitive sense of what is going on under the hood (in a conceptual sense, not a 'how do I code it' sense).



Observations

1. If you are like me, at some point in this fastai lecture you started to get a little bit confused, because we thought we were in a deep learning course but seemed to have traveled by quantum fluctuation into a parallel-universe AI course covering random forest data modeling, then continued on into an in-depth discussion of decision tree methods in general, and then into a discussion of data bagging.
 
Which is certainly all fascinating, but I'm not sure what a lot of it specifically has to do with deep learning, other than these being the popular precursors for processing tabular data prior to the advent of deep learning.

Bagging is kind of extraordinary, since it provides a way to improve the accuracy of nearly any kind of machine learning algorithm by training it multiple times, on different random subsets of data, and then averaging the predictions.
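scikit-learn packages this idea generically as BaggingRegressor / BaggingClassifier, which can wrap essentially any base estimator (the default base model is a decision tree; the data below is synthetic):

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X[:, 0] - X[:, 2] + rng.normal(scale=0.3, size=300)

# Each of the 30 copies is trained on a different bootstrap sample of
# the rows; their predictions are averaged. Pass a base estimator to
# the constructor to bag a different kind of model.
bag = BaggingRegressor(n_estimators=30, random_state=0).fit(X, y)
preds = bag.predict(X)
```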

Random forests cannot predict time series data into the future; deep learning neural nets can.  So this is an important distinction to understand.

2: This fastai lecture does closely follow chapter 9 of the course book.  And the v2 fastai course does try to include some additional machine-learning-specific material from a previous fastai course on machine learning techniques.  So there's your answer about why this material is presented in the fastai v2 Part 1 series of course lectures.

In a recent interview, Jeremy mentions that he started working with neural networks in the 90's, but then moved on to using random forests instead for the kinds of things he had been looking into using neural nets for, because they ran so much faster than neural nets on 90's computer hardware.  Of course it's a whole new world these days.

3:  I cannot stress enough how important and exciting understanding deep learning neural net feature visualization is.  Developments in this area are going to hugely influence future research directions, certainly in the imaging space of deep learning applications.

There is a fascinating distill.pub paper on Feature Visualization: How neural networks build up their understanding of images. Distill publications are great because they can be interactive.  After watching the various Xander presentations in previous lessons, and Andrej's lecture in our HTC additional material for this lesson, you will be able to gain an appreciation for what they are talking about in this paper (as well as see amazing deep learning visualization imagery).

A followup distill paper on 'The Building Blocks of Interpretability' extends this feature visualization work to try and get a better understanding of what is going on inside of deep learning neural nets.

Both of these papers are extremely fun to check out.  They are extremely visual, and quite fascinating.

4:  We have an upcoming HTC post on OpenAI's Microscope for peering inside of several deep learning system architectures.  We'll link to that here when that comes online.  
Again, this is all about feature visualization, and this new system lets you do deep dives inside of many deep learning architectures in your web browser (no coding required).


Don't forget to read the course book

Chapter 9

Need to review something from the previous lessons in the course?
No problem.

You can access Lesson 1 here.

You can access Lesson 2 here.

You can access Lesson 3 here.

You can access Lesson 4 here.

You can access Lesson 5 here.

You can access Lesson 6 here.

You can move on to the next Lesson 8 in the course (when it posts on 11/16/20).
