Deep Generative Models

Today's post takes a look at a great lecture by Ava Soleimany from the MIT Intro to Deep Learning course. We'll look at latent variable models and discuss the differences between using variational autoencoders (VAEs) and generative adversarial networks (GANs) to build generative models.

You can think of variational autoencoders (VAEs) as compression-decompression systems. An image is compressed to a reduced representation, the latent space, in the first half of the VAE architecture (the encoder). The second half (the decoder) then reconstructs the input image from that latent representation.
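To make that concrete, here's a minimal VAE sketch in PyTorch. This is my own illustrative example rather than code from the lecture, and the layer sizes are arbitrary choices:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        # Encoder: compress the input down toward the latent space,
        # predicting a mean and log-variance for each latent dimension.
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: reconstruct the input from a latent sample.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Sample z = mu + sigma * eps so gradients can flow through the
        # sampling step (more on this trick later in the post).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar
```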


GANs are somewhat different. They are composed of a generator that takes an input vector (usually noise) and generates an image from it. A second, adversarial part, the discriminator (or critic), then observes the generator's output and makes a binary classification: is that output real or fake? This result is fed back through the whole system, setting up a competition in which the generator tries to create more realistic images while the critic tries to get better at telling real images from the generator's fakes.
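Here's a sketch of what one adversarial training step might look like in PyTorch. Again, this is illustrative only: the `generator` and `discriminator` models and their optimizers are assumed to be defined elsewhere, and the discriminator is assumed to end in a sigmoid so its output is a probability:

```python
import torch
import torch.nn.functional as F

def gan_training_step(generator, discriminator, g_opt, d_opt,
                      real_images, noise_dim=100):
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Critic step: push real images toward 1 and generated images toward 0.
    noise = torch.randn(batch, noise_dim)
    fake_images = generator(noise)
    d_loss = (F.binary_cross_entropy(discriminator(real_images), real_labels)
              + F.binary_cross_entropy(discriminator(fake_images.detach()),
                                       fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the critic classify the fakes as real.
    g_loss = F.binary_cross_entropy(discriminator(fake_images), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```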


Here's a fun question to think about.  What is the latent space of a GAN?  We'll get into that more in future posts.


I've got another excellent video here for you to watch that does a very good job of breaking down how a variational autoencoder works. Watching it will also give you a much better understanding of what the latent space is and how you can work with it, as sketched below. Disentangling the latent space is a big issue in making GAN systems more controllable, so this information will be useful background when we take our deep dive into GAN systems later this week.
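As a taste of what "working with" the latent space means, here's a hypothetical latent traversal built on the VAE sketch from earlier: sweep one latent coordinate while holding the rest fixed, and decode each point. In a well-disentangled space, only one visual factor of variation should change across the decoded frames:

```python
import torch

def traverse_latent(vae, x, dim=0, values=(-3.0, -1.0, 0.0, 1.0, 3.0)):
    """Decode a sweep along one latent coordinate, holding the rest fixed."""
    with torch.no_grad():
        z = vae.fc_mu(vae.encoder(x))  # use the mean encoding as a base point
        frames = []
        for v in values:
            z_mod = z.clone()
            z_mod[:, dim] = v          # vary a single latent dimension
            frames.append(vae.decoder(z_mod))
    return frames
```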


Note that both of these videos spend time explaining how to restructure VAE systems so that the probabilistic part can be optimized with gradients (the reparameterization trick). I caught that this was important on my first viewing of Ava's presentation but still didn't fully understand what they did, so having Xander explain it again really helps drive it home.
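For reference, the key idea is that instead of sampling z directly from N(mu, sigma^2), which has no gradient with respect to mu and sigma, you sample noise from a fixed N(0, 1) and shift and scale it deterministically. A minimal sketch:

```python
import torch

def reparameterize(mu, logvar):
    # z ~ N(mu, sigma^2), rewritten as a deterministic function of mu and
    # sigma plus an external noise source, so gradients reach both.
    sigma = torch.exp(0.5 * logvar)  # the encoder predicts log-variance
    eps = torch.randn_like(sigma)    # all randomness is isolated in eps
    return mu + sigma * eps
```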


We'll be continuing our 'deep' exploration of 'deep generative models' in Thursday's 'deep dive into GANs' post.
