Deep Learning and Computer Graphics

 Our previous mini review of selected papers from this year's SIGGRAPH conference really brought home the message that deep learning research is driving new developments in computer graphics rendering.


We're going to continue that theme in this post by checking out two Nvidia podcasts on this topic: augmenting the generation of computer graphics using deep learning networks.


The first is a talk with David Luebke, vice president of graphics research at Nvidia.  David covers a lot of fascinating state-of-the-art topics, including how deep learning is speeding up the real-time rendering pipeline for generating computer graphics, the use of GANs (generative adversarial networks) to directly synthesize graphical imagery, and augmented reality displays, including using deep learning to help build holographic displays for AR.

You can access the podcast here.


The second is a talk with Nvidia's Aaron Lefohn.  Aaron continues the discussion of how deep learning is revolutionizing computer graphics.  There is more on GANs here, as well as on other aspects of the computer graphics pipeline that can be enhanced or rethought using deep learning.

You can access the podcast here.


After listening to the two talks, I hope you can feel some of the excitement about new developments in this field.  It's kind of astonishing how much and how quickly things are changing.


One big take-home message is the work being done on using GANs (generative adversarial networks) to directly synthesize computer graphics imagery.  This is a rapidly changing research area.


David specifically mentions this, and discusses it in some detail for a few minutes of his talk. The notion that you can explore the latent semantic space of a GAN to help understand what the GAN is actually representing is fascinating.

You may remember that we discussed latent semantic maps in an earlier post. 

If you step back from a deep learning neural network and think about what it is actually doing, it's building a multi-dimensional transformational mapping.  The latent semantic space is a way of representing that non-linear multi-dimensional mapping manifold, and of exploring what the network is representing within different parts of that mapping.
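To make "exploring the latent space" concrete, here is a minimal sketch of the basic idea: pick two latent vectors, walk along the line between them, and look at how the generator's output changes. Note that the `generate` function below is just a stand-in (a fixed random nonlinear map), not a real trained GAN, and all the names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim = 64, 256  # illustrative sizes, not from any real model

# Stand-in "generator": a fixed random linear map plus a nonlinearity.
# A real GAN generator would be a trained deep network.
W = rng.normal(size=(image_dim, latent_dim))

def generate(z):
    """Toy 'generator': maps a latent vector to a flattened 'image'."""
    return np.tanh(W @ z)

z_a = rng.normal(size=latent_dim)  # latent code for one 'image'
z_b = rng.normal(size=latent_dim)  # latent code for another

# Walk along the line between z_a and z_b; because the mapping is
# continuous, nearby latent points yield smoothly varying outputs.
steps = [generate((1 - t) * z_a + t * z_b) for t in np.linspace(0, 1, 8)]
```

In a real GAN, each point along this interpolation decodes to a plausible image, and watching how the image morphs is one simple way of probing what the latent space has organized.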


Nvidia's GAN researchers (and others) have found that, in addition to the appearance of objects within a photo, a GAN may also be building representations of the lighting and rotation properties of those objects.
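The way such a property is typically exercised is by finding a direction in latent space associated with it and sliding a latent code along that direction. The sketch below only shows the arithmetic of that edit; the "lighting" direction here is a made-up random vector, whereas in practice it would have to be discovered from a trained model.

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim = 64  # illustrative size

z = rng.normal(size=latent_dim)  # latent code for one image

# Assumed attribute direction (hypothetical); in a real GAN this
# would be found by analyzing the trained latent space.
lighting_dir = rng.normal(size=latent_dim)
lighting_dir /= np.linalg.norm(lighting_dir)  # unit length

# Varying alpha slides the code along the 'lighting' axis while the
# rest of the latent code (and hence object identity) stays fixed.
edits = [z + alpha * lighting_dir for alpha in (-2.0, 0.0, 2.0)]
```

If the GAN really has factored lighting into its latent representation, decoding these three codes would show the same object under darker, original, and brighter illumination.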







