Rewriting a GAN's Rules to Interactively Adjust it's Behavior

 Let's continue our exploration of how GANs can be enhanced to allow for a user to interactively adjust their behavior.

David Bau is a Ph student in Antonio Torralba's Artificial Intelligence Laboratory at MIT.  He is developing techniques to help understand how deep learning neural networks work.  By focusing on the representations (and their latent structure) learned by deep learning neural networks.

Here's a quick overview of this research.

Here's a longer presentation on this work he just gave at AIM 2020 on August 28, 2020.

You can watch a video of the longer presentation here right now.

We'll try to get a you tube version of the video here soon.

David and other in the Torralba lab are doing some seriously excellent and fascinating work focused on better understanding how neural networks generate internal representations of the data sets they are modeling.

You can check out David's work on 'GAN Dissection: Visualizing and Understanding Generative Adversarial Networks' here.

You can check out David's work on 'Network Dissection: Quantifying Interpretability of Deep Visual Representation's here.

10/21/20 update

Andrew Ng at had a really nice overview description of this research that i'm including in this post below.  It's from his most recent 'The Batch' emails that he sends out weekly.  They are really great, feel free to check them out.

What One Neuron Knows

How does a convolutional neural network recognize a photo of a ski resort? New research shows that it bases its classification on specific neurons that recognize snow, mountains, trees, and houses. Zero those units, and the model will become blind to such high-altitude playgrounds. Shift their values strategically, and it will think it’s looking at a bedroom.
What’s new: Network dissection is a technique that reveals units in convolutional neural networks (CNNs) and generative adversarial networks (GANs) that encode not only features of objects, but the objects themselves. David Bau led researchers at Massachusetts Institute of Technology, Universitat Oberta de Catalunya, Chinese University of Hong Kong, and Adobe Research.
Key insight: Previous work discovered individual units that activated in the presence of specific objects and other image attributes, as well as image regions on which individual units focused. But these efforts didn’t determine whether particular image attributes caused such activations or spuriously correlated with them. The authors explored that question by analyzing relationships between unit activations and network inputs and outputs.
How it works: The authors mapped training images to activation values and then measured how those values affected CNN classifications or GAN images. This required calculations to represent every input-and-hidden-unit pair and every hidden-unit-and-output pair.

  • The authors used an image segmentation network to label objects, materials, colors, and other attributes in training images. They chose datasets that show scenes containing various objects, which enabled them to investigate whether neurons trained to label a tableau encoded the objects within it.
  • Studying CNNs, the authors identified images that drove a given unit to its highest 1 percent of activation values, and then related those activations to specific attributes identified by the segmentation network. 
  • To investigate GANs, they segmented images generated by the network and used the same technique to find relationships between activations and objects in those images.

Results: The authors trained a VGG-16 CNN on the places365 dataset of photos that depict a variety of scenes. When they removed the units most strongly associated with input classes and segmentation labels — sometimes one unit, sometimes several — the network’s classification accuracy fell an average of 53 percent. They trained a Progressive-GAN on the LSUN dataset’s subset of kitchen images. Removing units strongly associated with particular segmentation labels decreased their prevalence in the generated output. For example, removing a single unit associated with trees decreased the number of trees in generated images by 53.3 percent. They also came up with a practical, if nefarious, application: By processing an image imperceptibly, they were able to alter the responses of a few key neurons in the CNN, causing it to misclassify images in predictable ways.
Why it matters: We often think of neural networks as learning distributed representations in which the totality of many neurons’ activations represent the presence or absence of an object. This work suggests that this isn’t always the case. It also shows that neural networks can learn to encode human-understandable concepts in a single neuron, and they can do it without supervision.
Yes, but: These findings suggest that neural networks are more interpretable than we realized — but only up to a point. Not every unit analyzed by the authors encoded a familiar concept. If we can’t understand a unit that’s important to a particular output, we’ll need to find another way to understand that output.
We’re thinking: In 2005, neuroscientists at CalTech and UCLA discovered a single neuron in a patient’s brain that appeared to respond only to the actress Halle Berry: photos, caricatures, even the spelling of her name. (In fact, this finding was an inspiration for Andrew’s early work in unsupervised learning, which found a neuron that encoded cats.) Now we’re dying to know: Do today’s gargantuan models, trained on a worldwide web’s worth of text, also have a Halle Berry neuron?


Popular posts from this blog

Simulating the Universe with Machine Learning

CycleGAN: a GAN architecture for learning unpaired image to image transformations

Pix2Pix: a GAN architecture for image to image transformation