GAN Deep Dive - Part 2

Let's continue our deep dive into Generative Adversarial Networks (GANs).  This post builds on the material in Part 1, so check that out first if you haven't.

We're going to start out by working through Lesson 7 from the 2019 fastai course.

This is a super information packed lecture, filled with great stuff.

Jeremy starts off by showing us how to build the ResNet architecture from scratch.  He does this to show off a very important technique called the 'skip connection', where the input to a block is added back onto that block's output.
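To make the skip connection idea concrete, here is a minimal framework-free sketch in plain Python.  It is not the code from the lecture (which uses PyTorch layers).  The names residual_block, linear and relu are made up for this illustration, and a single linear layer plus ReLU stands in for the convolution layers a real ResNet block would use.

```python
from typing import List

Vector = List[float]


def relu(v: Vector) -> Vector:
    # Element-wise ReLU: negative values become zero.
    return [max(0.0, x) for x in v]


def linear(weights: List[Vector], v: Vector) -> Vector:
    # Plain matrix-vector product, standing in for a conv layer.
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x: Vector, weights: List[Vector]) -> Vector:
    # Skip connection: compute F(x), then add the untouched input back on.
    fx = relu(linear(weights, x))
    return [a + b for a, b in zip(x, fx)]


x = [1.0, -2.0, 3.0]
zero_weights = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) is zero, so the block passes x straight through.
print(residual_block(x, zero_weights))
```

The thing to notice is that a block with uninformative weights still behaves like the identity function, which is one common explanation for why deep stacks of these blocks are easier to train.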

He then covers the fascinating U-Net architecture, which also uses skip connections, and applies it to super resolution.

He then covers two new loss functions: feature loss and gram loss.
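As a rough illustration of the gram loss idea, here is a small plain Python sketch.  It assumes each feature map has already been flattened into a list of activations per channel, divides by channels times positions as the normalization, and compares gram matrices with a mean absolute difference.  Those choices, and the names gram_matrix and gram_loss, are assumptions made for this sketch rather than the exact code from the lecture.

```python
from typing import List

Features = List[List[float]]  # one flat list of activations per channel


def gram_matrix(features: Features) -> List[List[float]]:
    # Entry [i][j] is the dot product of channel i with channel j,
    # normalized by (channels * positions).
    c = len(features)
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (c * n) for fj in features]
        for fi in features
    ]


def gram_loss(pred: Features, target: Features) -> float:
    # Mean absolute difference between the two gram matrices.
    gp, gt = gram_matrix(pred), gram_matrix(target)
    diffs = [abs(p - t) for rp, rt in zip(gp, gt) for p, t in zip(rp, rt)]
    return sum(diffs) / len(diffs)


feats = [[1.0, 2.0], [3.0, 4.0]]
shuffled = [[2.0, 1.0], [4.0, 3.0]]  # same activations, positions swapped

print(gram_matrix(feats))
print(gram_loss(feats, shuffled))
```

Swapping the spatial positions leaves the gram matrix unchanged, because it only records which channels fire together and not where.  That is why this kind of loss is usually described as capturing style or texture rather than layout.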

Building on all of the above, he then moves into Generative Adversarial Networks (GAN).

You may recall that in Part 1 Jeremy mentioned that transfer learning might be useful for training GANs, but he wasn't familiar with anyone using it yet.  In this lecture, a year later, he shows you how to do it.  He also shows off some thinking unique to fastai on how to build GAN models using architectural and loss function innovations.

Some Observations

1: I pointed out before that the fastai course is always changing to some extent.  This is usually a good thing, because they keep developing and refining the material to show how to achieve state of the art results using the fastai API.

But sometimes interesting and useful things get lost or buried in this process.  When we looked at the 2018 lecture's coverage of this material in Part 1 of this series, Jeremy spent a lot of time going over how Wasserstein loss works in GAN systems, and then wrote some code to do it from scratch.  In this lecture he briefly mentions wloss and says nothing else on the topic.

Now the good news is that they advanced the fastai API so that wloss is just an option in the new GAN Learner available in fastai.  So here we see an example of evolution at work.  The fastai API got better at handling this kind of problem and hiding the details under the hood, but you kind of need to look at the previous year's lecture to understand how that part of the system works internally.

The bad news is that you need to look at a previous year's lecture material to develop an understanding of this topic.  It's not really all that bad, but it requires some extra work on your part.  We did the work for this particular topic area by pointing it out and telling you where to go to learn all of the information.  But you most likely have personalized interests, and may have to go through a similar search and find process for your particular topic area of interest.
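If you just want a feel for what the Wasserstein loss is computing before digging into the older lecture, here is a minimal plain Python sketch.  It is not the fastai implementation.  The function names are invented for this illustration, the scores are assumed to be raw (unbounded) critic outputs, and sign conventions vary between write-ups.  The 0.01 clipping limit is the default used in the original WGAN paper.

```python
from typing import List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def critic_loss(real_scores: List[float], fake_scores: List[float]) -> float:
    # The critic wants high scores on real images and low scores on fakes,
    # so minimizing this pushes the two averages apart.
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores: List[float]) -> float:
    # The generator wants the critic to score its fakes highly.
    return -mean(fake_scores)


def clip_weights(weights: List[float], limit: float = 0.01) -> List[float]:
    # Weight clipping, used in the original WGAN to keep the critic constrained.
    return [max(-limit, min(limit, w)) for w in weights]


real = [2.0, 4.0]   # hypothetical critic scores for a batch of real images
fake = [1.0, 1.0]   # hypothetical critic scores for a batch of generated images

print(critic_loss(real, fake))
print(generator_loss(fake))
print(clip_weights([0.5, -0.2, 0.005]))
```

Notice there is no log or sigmoid anywhere.  The critic outputs an unbounded score rather than a probability, which is the main difference from the loss used in a standard GAN.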

2: I normally consider Jeremy a very visionary individual. But he said something in the middle of this lecture that was such a non-visionary statement that I have to take exception to it here. He said that there is nothing interesting about GAN systems that generate random output, by which I assume he was referring to GAN systems that synthesize fake images of things like a face or an object class.

This statement is so incorrect that I really felt the need to point it out here in my observations on the material covered in this lecture.  Do not let it influence your thinking about GANs.

So I guess we can immediately deduce that Jeremy is probably not an artist (his real passion seems to be data modeling), because artists find this particular capability of GANs highly interesting.  Maybe he was just having a bad day, or maybe the potential for abuse of generated fake images bugs him about this part of GANs.

Not only are the potential applications of GAN systems groundbreaking, fascinating, and almost magical from an artistic perspective, they are also telling us something very important about how the human visual system works (or doesn't work).  I've been thinking about this a lot recently (remember, I have an extensive background in human visual modeling and its application to engineering problems).  I'll get into this topic in much more detail in a later post.

3: The whole U-Net architecture is fascinating.  Other than its initial development in a fairly specialized branch of medical image segmentation, the architecture was not really well known until fastai started pushing it.  And if you look at the most recent papers on neural nets that do image generation, it seems to be everywhere now.

I'm sure there are other little-known but really powerful techniques like U-Net and skip connections just waiting for us to find them and use them to solve problems with existing deep learning systems.

