Semantic Image Synthesis with Spatially Adaptive Normalization

A lot of work has gone into deep learning nets that take an image as input and output a series of tags for the objects present in the image.  Less studied, but perhaps more interesting, is the notion of a deep learning net that takes a textual description of an image or scene and outputs an image that looks like that description.

So let's take a deep dive into a recent paper that takes a textual description of some imaginary image and generates an artificial image that looks like the description. This particular approach adds an additional constraint: a segmentation map, to be precise.  And being precise is kind of the whole point of this.
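To make the 'spatially adaptive normalization' idea a bit more concrete, here is a minimal NumPy sketch of what the segmentation map does inside such a layer: the activations are normalized, and then every pixel gets its own scale and bias depending on which segment it belongs to. This is only an illustration under simplifying assumptions, not the paper's implementation. The function name spade_normalize and the weight matrices w_gamma and w_beta are hypothetical, a single linear map per class stands in for the small learned convolutional network, and the statistics are computed over one image rather than a batch.

```python
import numpy as np

def spade_normalize(x, seg_onehot, w_gamma, w_beta, eps=1e-5):
    """Sketch of spatially adaptive normalization for a single image.

    x          : (C, H, W) activations
    seg_onehot : (K, H, W) one-hot segmentation map with K classes
    w_gamma    : (C, K) hypothetical weights mapping class -> scale
    w_beta     : (C, K) hypothetical weights mapping class -> bias
    """
    # Parameter-free normalization of each channel over spatial positions.
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Per-pixel scale and bias computed from the segmentation map.
    # (Assumption: a 1x1 linear map replaces the learned conv layers.)
    gamma = np.einsum("ck,khw->chw", w_gamma, seg_onehot)
    beta = np.einsum("ck,khw->chw", w_beta, seg_onehot)

    return gamma * x_hat + beta

# Toy usage: a 4x4 image whose left half is class 0 and right half is class 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 4))
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
seg = np.eye(2)[labels].transpose(2, 0, 1)   # (K, H, W) one-hot map
w_gamma = np.tile([1.0, 2.0], (3, 1))        # class 1 gets a larger scale
w_beta = np.tile([0.0, 5.0], (3, 1))         # class 1 gets a shifted bias
out = spade_normalize(x, seg, w_gamma, w_beta)
```

The point of the toy example is that the two halves of the image are modulated differently, so the layout in the segmentation map survives the normalization step instead of being washed out by it.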

The paper describing this, 'Semantic Image Synthesis with Spatially-Adaptive Normalization', can be found here.

A project page called 'Semantic Image Synthesis with SPADE' can be found here.

Here's a short video presentation of 'GauGAN: Semantic Image Synthesis with Spatially Adaptive Normalization'.

Divyansh Jha has put together a really great blog post on Implementing SPADE using Fastai.  Perfect for HTC's fastai API oriented deep learning folks.  Check it out.

And of course the world moves on, and people are trying to improve SPADE performance.  A recent attempt at improving SPADE is the Semantic Region-adaptive Normalization (SEAN) algorithm.

SEAN is conditioned on segmentation masks that describe the semantic regions in the desired output image. Using SEAN normalization, a network architecture can be made to control the style of each semantic region individually.

The SEAN generator network is built on top of SPADE and contains three convolutional layers whose biases and scales are modulated separately by individual SEAN blocks. There are two inputs per SEAN block: the set of style codes for the specific regions, and a semantic mask that defines the regions to which each style code is applied.
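Here is a minimal NumPy sketch of the per-region idea described above: each region's style code is broadcast to the pixels that the semantic mask assigns to that region, and the resulting per-pixel style map is turned into a scale and a bias for the normalized activations. Treat it as a sketch under assumptions rather than the paper's block. The function name region_adaptive_modulation and the projection matrices w_gamma and w_beta are hypothetical, plain linear maps stand in for the learned layers, and other parts of the published SEAN block are left out.

```python
import numpy as np

def region_adaptive_modulation(x, mask_onehot, style_codes, w_gamma, w_beta, eps=1e-5):
    """Sketch of per-region style modulation for a single image.

    x           : (C, H, W) activations
    mask_onehot : (K, H, W) one-hot semantic mask with K regions
    style_codes : (K, D) one style code per region
    w_gamma     : (C, D) hypothetical projection style -> scale
    w_beta      : (C, D) hypothetical projection style -> bias
    """
    # Normalize each channel over spatial positions.
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Broadcast each region's style code to every pixel in that region.
    style_map = np.einsum("kd,khw->dhw", style_codes, mask_onehot)

    # Per-pixel scale and bias derived from the style map.
    gamma = np.einsum("cd,dhw->chw", w_gamma, style_map)
    beta = np.einsum("cd,dhw->chw", w_beta, style_map)

    return gamma * x_hat + beta

# Toy usage: two regions (left and right halves), one style code each.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 4))
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
mask = np.eye(2)[labels].transpose(2, 0, 1)  # (K, H, W) one-hot mask
styles = rng.normal(size=(2, 5))
w_gamma = rng.normal(size=(3, 5))
w_beta = rng.normal(size=(3, 5))
out = region_adaptive_modulation(x, mask, styles, w_gamma, w_beta)

# Editing only region 1's style code leaves region 0's pixels untouched.
styles_edited = styles.copy()
styles_edited[1] += 1.0
out_edited = region_adaptive_modulation(x, mask, styles_edited, w_gamma, w_beta)
```

In this toy setup, changing the style code for one region changes only that region's pixels in this layer, which is what lets the style of each semantic region be controlled individually.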

The paper 'SEAN: Image Synthesis with Semantic Region-Adaptive Normalization' is available here.

And here's a short 5-minute paper presentation of 'SEAN: Image Synthesis with Semantic Region-Adaptive Normalization'.

I find it fascinating that you can pay serious money for well put together courses on this stuff, but then they don't get around to actually covering the most recent developments, which is kind of the whole point of taking a course on something like GAN Image Synthesis in 2020. 

Fortunately, we are covering all of the recent developments here at HTC.  Like what we just covered above, and Pix2Pix last week.  And our course is free. And if we miss anything you think we should be shouting about, let us know and we'll get something posted on it as well.

