Posts

Showing posts from December, 2020

Generative Models Deep Dive - Part 1

We're going to be taking a deep dive into deep learning neural net generative models. It will be broken up into several parts. Parts 1 and 2 will be based on lectures presented by Justin Johnson. Parts 3 and 4 will be based on lectures by Yann LeCun from the 2020 NYU Deep Learning course. Yann has a fascinating unified theory of all things neural net generative model (and there are so many flavors without it), which is pretty mind expanding once you get your head around the concepts. Parts 5 and 6 will be focused on GANs, and hopefully tie back to Yann's unified generative model theory. Part 1 begins with a lecture from Justin Johnson, who is now a professor at the University of Michigan. It was taken from the online lectures associated with Justin's class called 'Deep Learning for Computer Vision'. We're using the fall 2019 lecture since the 2020 lectures unfortunately won't post online until next year (university policy?). I say unfortunately because r

HTC Seminar #26 - Abstraction and Reasoning in AI systems: Modern Perspectives

Today's seminar was alluded to in yesterday's HTC Deep Learning Update. "Abstraction and Reasoning in AI systems: Modern Perspectives", by Francois Chollet, Melanie Mitchell, Christian Szegedy. It was presented at the most recent NeurIPS conference this month. You can watch the presentation and associated slides on SlidesLive here. If you read the HTC blog it should be obvious why I'm jazzed about this particular talk and associated paper. It's a big affirmation of the notion that what deep learning networks do is manifold learning. They approximate the nonlinear functional transformation to get from the super high dimensional input space to the internal manifold that the perceptual information in the input data lives on.

HTC Updates - Deep Learning #2

There are some great recent update summaries from Henry AI Labs, so let's dive into them. The first one, on data-efficient image transformers, is particularly interesting. Here's a link to a blog post on the Facebook AI work on Data-efficient Image Transformers. I guess one question in my mind is whether they are really better than typical convolutional net architectures if you build the convolutional nets with 'attention' features. And 'attention' is really another way to say 'sparse network'. In any case, this area of research (transformer architectures for deep learning image tasks) is fascinating. And there is a lot of recent activity in this area, so expect more surprises in the coming year. We'll be covering transformer architectures in much more detail here soon. Now let's jump to a previous Henry AI Labs update. So the one that really jumps out for me is the first one on Abstraction and Reasoning in Modern AI Systems by Francois Cho

Waymo and the Future of Self-Driving Cars

This is an interview between Lex Fridman and Waymo CTO Dmitri Dolgov. They talk about the origins of Waymo, their current efforts with fully driverless service in Phoenix, and then the future. It's interesting to contrast the Waymo technology stack for self-driving cars with what Tesla is doing. Their separate corporate cultures also seem to contrast.

Background Matte Generation Using Novel Neural Architecture

 Generating background mattes for video is something that tends to be used for all kinds of different applications.  There is a long history of different techniques for this, like color key which assumes the foreground objects or people are in front of a flat color background (think green screen).  Or one of the many Siggraph papers over the years focused on different variants of this (graph-cut, etc). With the deep learning neural net renaissance currently taking place, you would expect to see different neural net implementations of this.  Nvidia recently announced an implementation for example. And now the U of Washington graphics lab has come up with an interesting neural net based approach.  Take a look at the video examples below. The neural net architecture they use is interesting, and if you stretch your brain you might start to come up with other applications of the basic idea. Two different neural nets are used.  A low res one to get in the ballpark, and then a high res one th

Graph Convolutional Neural Network Practicum

 Continuing our graph convolutional network (GCN) theme from yesterday's post, Alfredo Canziani of the 2020 NYU Deep Learning course helps reinforce the notion presented yesterday, that GCNs are very closely related to self-attention and transformer networks.  Which seemed surprising when first presented, but maybe less so when you realize it's all about sparse networks. After understanding the general notation, representation and equations of GCN, we delve into the theory and code of a specific type of GCN known as Residual Gated GCN. We then look at some related PyTorch code in a Jupyter notebook.
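
The link between GCNs, self-attention and sparsity noted above can be illustrated with a small sketch in plain Python. This is not the Residual Gated GCN from the practicum and not code from the course notebook. The function name `neighbor_attention`, the toy graphs, and the use of raw features as queries, keys and values (no learned projections) are all assumptions made for illustration. The point is only that attention over a fully connected graph behaves like ordinary self-attention, while a sparser neighbor list restricts who each node can attend to.

```python
import math


def neighbor_attention(features, neighbors):
    """Self-attention where each node attends only to itself and its graph neighbors.

    features:  dict node -> list of floats
    neighbors: dict node -> list of adjacent nodes
    Queries, keys and values are the raw features (no learned projections),
    which is a simplification made for illustration.
    """
    out = {}
    for node, query in features.items():
        group = [node] + list(neighbors.get(node, []))
        # Dot-product score between this node and each node it may attend to.
        scores = [sum(q * k for q, k in zip(query, features[n])) for n in group]
        # Numerically stable softmax over the neighborhood only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        # Attention-weighted sum of the neighborhood's features.
        dim = len(query)
        out[node] = [
            sum(a * features[n][d] for a, n in zip(attn, group))
            for d in range(dim)
        ]
    return out


# Hypothetical toy data: three nodes with 2-dimensional features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
full = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}  # every node sees every node
sparse = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}          # path graph a - b - c

dense_out = neighbor_attention(features, full)
sparse_out = neighbor_attention(features, sparse)
```

With the `full` neighbor lists every node attends to every other node, as in a standard transformer layer. With the `sparse` lists node `a` never sees node `c`, so its output differs, which is the "sparse network" view of graph convolution.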

HTC Seminar #25 - Graph Convolutional Networks (GCN)

Today's HTC Seminar Series lecture is by Xavier Bresson on Graph Convolutional Networks. It covers how to implement convolution (and deep learning) on graphs rather than full grids. I knew absolutely nothing about graph convolutional networks prior to watching this lecture, and now I understand it fairly well. If you have a signal processing background, you will totally grok everything Xavier says in this really great presentation.
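
To give a rough feel for what convolution on a graph means, here is a minimal sketch in plain Python. It is not taken from Xavier's lecture. The function name `graph_conv`, the toy adjacency lists, and the choice of mean aggregation with a single scalar weight (standing in for a learned weight matrix) are all assumptions made for illustration. On a regular grid a filter combines a pixel with its fixed set of neighbors; on a graph the same idea is applied to whatever neighbors each node happens to have.

```python
def graph_conv(features, neighbors, weight=1.0):
    """One simplified graph convolution step (illustrative only).

    features:  dict node -> list of floats (the signal on each node)
    neighbors: dict node -> list of adjacent nodes
    weight:    a single scalar standing in for a learned weight matrix
    """
    out = {}
    for node, feat in features.items():
        # Include the node itself, like the center tap of a grid filter.
        group = [node] + list(neighbors.get(node, []))
        dim = len(feat)
        # Average each feature channel over the neighborhood.
        mean = [sum(features[n][d] for n in group) / len(group) for d in range(dim)]
        # Scale by the weight and apply a ReLU nonlinearity.
        out[node] = [max(0.0, weight * m) for m in mean]
    return out


# Hypothetical toy path graph: a - b - c, one feature per node.
features = {"a": [1.0], "b": [2.0], "c": [6.0]}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
smoothed = graph_conv(features, neighbors)
```

On this toy graph the step smooths the signal along the edges: `a` becomes 1.5, `b` becomes 3.0, and `c` becomes 4.0. Real GCN layers replace the scalar weight with learned matrices and often use a normalized adjacency rather than a plain mean.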

How Deep Learning Can Enhance Digital Artist's Creative Workflows

This talk by Anastasia Opara on 'Proceduralism and Deep Learning' is very thought provoking. It covers some specific topics HTC is interested in. Specifically, how can we enhance digital artists' workflows by using proceduralism and Deep Learning to enhance creativity and reduce the amount of 'busy work tedious craft' required to get from point A to point B in your artistic journey. The talk focuses on enhancing workflows for game artists. But I think the same principles also apply to more general digital art workflows. How can we use 'software 2.0' principles to train a system by showing it examples. Observations 1. The GAN synthesis example was interesting. Note that she pointed out that it doesn't always connect the lines completely. 2. I think the 'draw a line and make a 3D mountain' demo could be directly implemented using Studio Artist Geodesic or Chamfer distance interpolation approaches. So in my mind using the Pix2Pix neural net

HTC Education Series: Deep Learning with PyTorch Basics - Lesson 4

This is the 4th lesson in the HTC Deep Learning with PyTorch Basics course. It begins with the 4th lecture in the Jovian.ai course called 'Deep Learning with PyTorch: Zero to GANs'. This 4th lecture is called "Image Classification with Convolutional Neural Networks". What was covered in the lecture CIFAR10 dataset - 32x32 image dataset for image classification DataLoaders and torchvision datasets Understanding convolution Building a convolutional neural network in a nn.Sequential     nn.Conv2d     nn.MaxPool2d Training a convolutional neural network     how to avoid overfitting Saving and loading model weights Additional HTC Material 1. You should start getting more comfortable working with PyTorch DataLoader objects. They were discussed in the lecture above. The Paperspace Blog has a good recent post called "A Comprehensive Guide to the DataLoader Class in PyTorch" you can read to further educate yourself about DataLoaders and DataSets. 2. We will co

GPT-3 Roundup Discussion and Demos

Here it is in all its glory. Hot off the presses (very recently posted). An extended panel discussion concerning GPT-3. Where learned individuals will discourse. One of them (Gary Marcus) is a notorious deep learning curmudgeon, so be warned. The others are all pretty enthusiastic (the moderator a little bit less than some of the others, and the linguist is a linguist so he's always going to have issues). It's all quite fascinating. The beginning is kind of a fast paced, somewhat chaotic overview of key insights discussed at different points in a somewhat long series of different discussions, and then the rest of the video is the actual discussions and some experiments with GPT-3 by the moderator. Observations 1: Gary said something about DeepMind's Atari game system that was very misleading. He points out that if you move the location of the game paddle, then the trained system is no longer able to play the game. Implying this brittleness, and that the syst

So What's the Deal with Rust

Rust as in the programming language Rust, not the state of any metal left in Hawaii's climate for more than a week. So I bumped into a procedural texture implementation written in Rust on GitHub recently, and my immediate response was along the lines of 'what the hell is this'. Some additional research led to this post. Here's a Stack Overflow blog post on 'What is Rust and why is it so popular?' Here's the main Rust language site. Here's a talk by Steve Klabnik on 'Rust, WebAssembly, and the future of Serverless'. Quick Rust overview at the beginning. Here's a talk by Carol Nichols on 'Rust: A Language for the Next 40 Years'. She's on the Rust steering committee, so bear that in mind. Probably a little too much time on the historical railroad analogy. Here's a talk by Pavel Yosifovich on 'Rust for C++ Developers'. Pretty objective pro-con viewpoint. I will keep an open mind about Rust. However, here's

Old School Procedural Texture Extravaganza

 This is an awesome talk by procedural artist Anastasia Opara of Embark Studios titled "More Like This, Please! Single Example Texture Synthesis and Remixing".  She starts out by trying to give an intuitive feel for what is going on in texture synthesis algorithms.  We then go on a wild ride through the history of texture synthesis.  She then builds on top of that research for her unique approach. The talk also discusses how the notion of texture synthesis can be integrated into a digital artist's workflow.  Some interesting 'old school' approaches to style transfer are also discussed based on texture synthesis algorithms. The 'funky style-transfer' approach using the Levoy paper algorithm is pretty slick. So is the 3d rock synthesis example  (which she really quickly mentions in passing)  based on incorporating displacement maps into the texture synthesis procedure. Anastasia's MultiResolution Random Order algorithm is based on the Paul Harrison pdf t

HTC Updates - Deep Learning #1

Research in deep learning is increasing at an ever accelerating pace. Keeping abreast of the latest developments is a constant challenge. The HTC Updates on Deep Learning will periodically try to keep you up to date with a summary of the latest new research of interest. Let's start off with the most recent hot off the presses AI Weekly Update from Henry AI Labs. This 35 minute presentation covers 14 different new papers or lectures or blog posts. We will be presenting the Yoshua Bengio lecture mentioned in the presentation in an upcoming HTC Seminar. The vision transformer research seems particularly interesting. We also really need to get some more posts going here on Transformer architecture. We covered AlphaFold recently in an HTC seminar post. Self-Supervised Learning More reinforcement that Self-Supervised Learning is a very hot topic. So the classification variation of it mentioned above is kind of doubly interesting because of that. We recently posted an in-depth repo

HTC Seminar #24 - Self-Supervised Learning in Computer Vision

Today's HTC seminar is the week 10 guest lecture at the NYU Deep Learning 2020 course. The talk is titled "Self-supervised learning (SSL) in computer vision (CV)", and is presented by Ishan Misra of Facebook FAIR. Ishan talks about self-supervised learning (a hot topic in deep learning research currently). He dives into how 'pretext tasks' can help make SSL work, and tries to give an intuition for their underlying representations in the SSL deep learning models. The second half of the talk focuses on the shortcomings of 'pretext tasks', what their desired performance would be, and how to use Clustering and/or Contrastive Learning to get there. A specific kind of Contrastive Learning called PIRL is detailed. I had briefly glanced at some of the papers associated with this work before, and hearing Ishan explain it really helped me understand what they were doing much better. The link of this stuff with current data augmentation practices is very interes

Self Driving Cars and PyTorch and Deep Learning at Tesla

It's interesting to hear how Tesla is using the capabilities of PyTorch to do the deep learning neural network research and development required to build self driving cars. Here's another slightly longer presentation by Andrej Karpathy on 'Building the Software 2.0 Stack'. Here's another somewhat overlapping and longer presentation by Andrej on 'Tesla Autopilot and Multi-Task Learning for Perception and Prediction'. The neural net architecture they use to build their self-driving system is fascinating. As is their lack of use of lidar (everything is vision based only).

HTC Education Series: Deep Learning with PyTorch Basics - Lesson 3

This is the third lesson in the HTC Deep Learning with PyTorch Basics course. It begins with the third lecture in the Jovian.ai course called 'Deep Learning with PyTorch: Zero to GANs'. This 3rd lecture is called 'Training Deep Neural Networks on a GPU'. You will learn how to build and train a deep learning neural network with hidden layers and non-linear activation functions. You will be working with a Jupyter notebook in a cloud based system that gives you access to GPUs. What was covered in the lecture how to run Jupyter notebooks on Colab using Jovian multi-layer networks with nonlinearities rectified linear unit (ReLU) deep learning neural nets approximate arbitrary functional mappings defining a model by extending nn.Module training and verifying your model using a GPU Additional HTC Material 1: Again, our goal with this course is to present pretty basic beginning PyTorch code for building neural networks in the first lectures in this course. We then try and

Self Supervised Learning is Taking Off

The cutting edge of deep learning research in self-supervised learning algorithms is taking off. Self-supervised meaning a neural net that learns from the data itself, not from supervised labels or other pattern match-ups tagged by humans. And in some sense it's really all about data augmentation. And we move from thinking about data augmentation as just a way to endlessly expand our model's data, to thinking about it as a way to introduce perceptual clustering of the data in our model so that the training process can manipulate its energy surface to correspond to the natural perceptual classes in the data (doing this without human intervention). Actually the human intervention is the human designing the correct perceptual augmentation. SwAV is the latest paper from FAIR to generate state of the art results in self-supervised learning. With exciting potential for an alternative approach to transfer learning (alternative to a ResNet model (or equivalent)). Here

Kornia - A PyTorch Based Differentiable Computer Vision Library

Kornia is amazing. A computer vision and image processing library built as an extension of PyTorch. Where everything inside of it is differentiable. So computer vision and image processing algorithms can be directly dropped in as additional layers in a deep learning neural net system (that was built based on PyTorch). Kornia is open source, and an official part of the PyTorch ecosystem. It is modeled on OpenCV, so if you know how to work with OpenCV, it's very similarly structured. The difference is that Kornia is implemented using PyTorch, and processes tensors rather than NumPy arrays when working with images. And of course everything is differentiable, which is not true for OpenCV. Here's a quick intro video on 'Kornia: Computer Vision Library for PyTorch'. Here's a much longer deep dive video presentation into all that is the Kornia Library from 4 months ago. Here's the Kornia main site. Here's the Kornia documentation site. Here's the

Convolutional Neural Networks - learn about it from Yann LeCun

The video below is the week 3 lecture in the NYU Deep Learning course from 2020. The lecture is given by Yann LeCun, who is really the father of convolutional neural networks. It starts off a little bit strange, first with a demo by Alfredo Canziani of what a simple multi-layer linear net is doing in a visualization, and then Yann starts talking about 'when the parameter vector is the output of a function', and I was a little bit confused because I did not expect this was how it was going to start. But then it pulls together (Yann is really talking about weight sharing architectures in that beginning section), and Yann dives into a really interesting presentation on convolutional neural networks. Yann's implementation of convolutional neural networks was highly influenced by research on how the primate and human visual system works. And by the associated work of the NeoCognitron architecture developed by Fukushima in the early 80's. The slide below brought back a lot

Working with TensorBoard in PyTorch

TensorBoard is a tool designed for visualizing the results of neural net training runs. It is a part of TensorFlow, but you don't have to be running or working with TensorFlow to utilize it. PyTorch has specific hooks so you can use TensorBoard to visualize the results of your PyTorch defined and run neural net models. You could of course be using matplotlib or any other graphing library to directly do this in your PyTorch code. One nice thing about using TensorBoard instead is that you can work with interactive visualizations in a Jupyter notebook. But TensorBoard is much more than just a graph plot toolkit. TensorBoard allows us to directly compare multiple training results in a single graph plot. This can be very useful for finding the best set of hyperparameters for a model, and can help visualize problems like vanishing or exploding gradients. Here's a quick intro overview video on Visualization with TensorBoard from PyTorch Developer Day 2020. Here's a link to Te

HTC Seminar #23 - DeepMind's New AlphaFold 2 Breakthrough in Protein Folding

Today's HTC seminar provides some information on a very recent development by DeepMind in trying to solve the protein folding problem. They just announced this in the last week, so hot new news on how deep learning continues to dominate in new and very difficult problem sets. This lecture is given by Yannic Kilcher. As he points out, DeepMind has not released a specific paper on AlphaFold 2 yet, just the PR associated with the recent press release. He tries to read between the lines, based on an analysis of the AlphaFold 1 paper and what the PR about AlphaFold 2 says, to inform us on what is going on under the hood. You know it's big news when Nature talks about it. A fascinating quote from this is the following " instead of predicting relationships between amino acids, the network predicts the final structure of a target protein sequence ". They then claim that kind of prediction is a more complex system. But is that really true? Or does the improved perform

Qt 6 Released - Quick Update

Just a quick update (more detail later in a separate post). Qt 6 finally dropped today. Qt 6 is the latest evolution of the Qt (pronounced 'cute') platform (which has been around for 20 years at this point, and is a hybrid open source / commercial code base). Qt 6 is cross-platform in nature, allowing users to deploy their applications to all desktop, mobile, and embedded platforms using one technology and from a single codebase. It is scalable from low end embedded systems to mobile applications to desktop applications. It also features a unified graphics architecture that abstracts the lower level GPU specific hardware API details from an application programmer (key point in today's computing landscape). There is a blog post associated with this from Lars Knoll. Here's a link to the tech specs. Here's a link to add-on support in Qt 6.0 and future point releases. We have various HTC posts on Qt. Here's their little Qt6 promo video to get you excited.

PyTorch Developer Day 2020

PyTorch Developer Day was a virtual event held on 11/12/20. Anyone paying attention can see that the PyTorch open source community is moving full speed ahead (with assistance from Facebook AI Research (FAIR) of course). We've included some keynote highlights below to better educate everyone on what PyTorch is, and where it is going. The keynote presentation below is on 'Open Challenges in Deep Learning Systems'. The keynote presentation below is on 'State of PyTorch in 2020'. The keynote presentation below is on 'TorchVision - Towards Research to Production'. The keynote presentation below is on 'TorchAudio - PyTorch Developer Day'. The complete video index from PyTorch Developer Day 2020 is available here. And of course the big exciting news for the near term future is the ongoing effort to add APIs for hardware accelerated mobile and ARM 64 builds. So iOS Metal GPU, Android Vulkan GPU, and Android Neural Network API. I think the ARM support f

HTC Education Series: Deep Learning with PyTorch Basics - Lesson 2

 This is the second lesson in the HTC Deep Learning with PyTorch Basics course.  It begins with the second lecture in the Jovian.ai  course  called 'Deep Learning with PyTorch: Zero to GANs'.   This 2nd lecture is called 'Working with Images and Logistic Regression'.  In it we will explore how to work with images from the MNIST character recognition dataset, create training and validation sets for our model, and train a logistic regression model using softmax activation and cross-entropy loss. You can also access the video at the jovian site here .  There are several Jupyter notebooks available there associated with this lesson.  You can also access a community discussion forum there if you want to use it. What was covered in the lecture Working with your Jupyter notebook Working with images in PyTorch The famous MNIST dataset (handwritten digits 0-9) Splitting a dataset into training and validation sets (why?) Batch processing Creating a custom PyTorch model by extendi

History, Motivation, and Evolution of Deep Learning

This is the first lecture in the NYU Deep Learning course. It is presented by Yann LeCun, who won a Turing Award for his contributions to deep learning research (along with Geoffrey Hinton and Yoshua Bengio). After some 'how does our course work' info at the beginning you can skip if you want to, Yann dives into the history of deep learning, a quick overview of interesting topics they will cover in later lectures, the evolution and application of Convolutional Neural Nets (CNN), how deep learning is used at Facebook, why deep learning is hierarchical in nature, and generating and learning features and representation. The course website is here. It includes access to all the course slides, Jupyter notebooks, and YouTube videos. The last bit of the lecture on the manifold hypothesis is a particularly great section. I talk about this repeatedly here on the HTC site (one example here). I don't really think it's a hypothesis at this point, but pretty well established

GhostNet - Speeding up Computation by Reducing Redundancy

Imagine that you could restructure your neural net architecture to achieve a 40% speedup. Sounds interesting, right? GhostNets are deep learning neural net architectures that introduce additional layers that are cheap computationally (think linear layers). The idea is to generate more feature maps from the additional cheap layers. The GhostNet team noticed that there seemed to be a lot of redundancy in the feature map representations generated by a neural net architecture. Some specific feature maps just seem to be clones of other ones (ghosts). They looked at ResNet-50 architecture's first block, and trained a linear net to learn the mapping between them. They then hypothesized that you don't need to spend computation generating so many unique feature maps in each block in the neural net. Instead, you can calculate a few 'intrinsic' maps and then use cheap and fast linear operations to approximate the rest of the 'ghost' maps. In their paper they show exam

Apple New M1 Chips and Machine Learning

Apple's new M1 ARM processor based computers are interesting beasts. We recently had a post that describes what this is in more detail. I got one of the first M1 based MacBooks you could get your hands on literally a week ago, so we're talking about something very new. And it will be interesting to see how this plays out over the next few years, particularly when we look at training and deploying deep learning neural net systems. Here's an article on 'How is the Apple M1 going to affect Machine Learning'. It provides an overview of the overall M1 architecture, which includes a multi-core ARM based RISC processor, along with a multi-core GPU, and a 'Neural Engine' all on the same chip substrate (along with shared memory). It briefly discusses TensorFlow at the bottom, but does not get into PyTorch. PyTorch has announced prototype features that support hardware accelerated mobile and ARM64 builds. This is pretty cool, since they are supporting deployme

StyleGAN2 for Artistic Image Manipulation

One thing that blew me away in Jensen Huang's recent Nvidia keynote presentation in September was a demo of using GAN technology to manipulate artistic facial images. The video below gives a good demonstration of what you can do with the particular GAN system used in the demo. Note that the user manipulation of the imagery is associated with the artist manipulating latent vectors inside of the GAN model. That's a mouthful, but we'll dive into what it all means after taking a look at the demo. So how does a system like this even work? GAN stands for Generative Adversarial Network. It's a unique kind of deep learning neural net architecture. The way it works internally is that there are 2 different internal neural nets in competition with each other. They are a Generator and a Discriminator. The Generator tries to take random noise input, and turn it into an image. The Discriminator evaluates the output of the Generator and decides if it is a real image, or a fake i

HTC Seminar Series #22 - William Falcon Discusses PyTorch Lightning

Today's HTC seminar was given by William Falcon at the NYC Deep Learning Meetup in fall 2019. William is the original creator of PyTorch Lightning. PyTorch Lightning is built on top of PyTorch, is an official part of the PyTorch infrastructure, is the new standard at NeurIPS for releasing research code that is reproducible, and can be thought of as an extension of PyTorch that hides a lot of the internal details of the busywork associated with building, training, and deploying deep learning neural net architectures. If you want to learn more about PyTorch Lightning, a good place to start is here. We're going to be covering it much more extensively in future HTC posts.

Qt Design Studio and Software Development Integration

Qt (pronounced 'cute') is a cross platform application development platform. It was originally developed for C++ programming (with an emphasis on solving the desktop cross platform development conundrum), but has extended over time to support things like Python, targeting mobile and embedded devices, and QML development (QML is like CSS for application layout). Qt Design Studio is a visual WYSIWYG editor for Qt. It is really focused on supporting the graphic designer side of application development. So a graphic designer can use the tools they are comfortable with in their daily work (Photoshop, Illustrator) to conceptualize the graphic design elements of an application. They can then export those design elements and import them into Qt Design Studio. They can then lay out the graphical components of the application inside of Design Studio. Design Studio not only lets you edit an application in a visual drag and drop WYSIWYG way, you can also run previews to see it in acti