PyTorch - quick introduction

Our goal is to take a look at PyTorch, another API for building deep learning neural networks. But before we do that, it's probably useful to step back and take a brief look at some history.

Long ago, say roughly 10 years in the ancient past, there were a few different deep learning APIs most people were working with. They were split into separate camps associated with the various research groups that built them.

So Torch was an interesting beast: an open source machine learning library with a scripting language based on the exotic Lua programming language, built around implementing and manipulating Tensor classes. Torch was associated with Yann LeCun's research group at NYU.

Caffe (Convolutional Architecture for Fast Feature Embedding) is an open source deep learning framework originally developed at the University of California, Berkeley. It is written in C++, with a Python interface.

So then Facebook hired Yann LeCun, and later PyTorch was spun off from Torch as a Facebook open source project. Facebook also seemed to absorb Caffe at some point: they later announced Caffe2, and still later merged Caffe2 into PyTorch.

Another ancient open source deep learning API was Theano (initial release 2007), developed at the Université de Montréal.

Now you may remember that we have already discussed Keras in previous posts, and how Keras used to support Theano as a back end. Keras is now a part of TensorFlow 2.0, using TensorFlow as its primary back end.

So Keras and TensorFlow are sitting in the Google tent. And PyTorch and Caffe2 are sitting in the Facebook tent. They are all open source projects, but the funding comes from the tent owner, and to some extent who uses what depends on where they are working.

The Open Neural Network Exchange (ONNX) project was created by Facebook and Microsoft for converting neural network models between different frameworks. Merging Caffe2 and PyTorch was probably also intended to help eliminate incompatibilities between those two model formats.

Ok, so that was some quick history to help you understand how we got where we are today with these different systems. And you are going to find lots of pre-built pre-trained neural net models online in all of the various different model formats discussed above. So now you have some idea why the different incompatible formats and associated models exist.

PyTorch tries to improve on what Torch did (implement Tensor classes). PyTorch defines a class called Tensor (torch.Tensor) to store and operate on homogeneous multidimensional rectangular arrays of numbers. PyTorch Tensors are similar to NumPy arrays, but can also be operated on by a CUDA-capable Nvidia GPU. So a key takeaway is that PyTorch is set up to make it easy to train on GPUs and GPU clusters.
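Here's a minimal sketch of what that looks like in practice, assuming PyTorch is installed and imported as torch:

```python
import torch

# Build a 2x3 Tensor; it behaves much like a NumPy array
t = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

print(t.shape)          # torch.Size([2, 3])
print(t.mean().item())  # 3.5

# Round-trip to and from NumPy (shares memory on the CPU)
a = t.numpy()
t2 = torch.from_numpy(a)

# The same code runs on an Nvidia GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
t_gpu = t.to(device)
```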

Deep neural networks are built in PyTorch on top of a tape-based automatic differentiation system. A recorder notes which operations have been performed during the forward pass, and then replays them backward to compute the gradients with respect to the parameters. This means you write only the forward computation in ordinary Python, and PyTorch derives the backward pass for you.
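A tiny sketch of the tape in action, again assuming PyTorch is installed:

```python
import torch

# requires_grad=True asks the recorder to track operations on x
x = torch.tensor([2.0, 3.0], requires_grad=True)

# Forward pass: each operation lands on the tape
y = (x ** 2).sum()   # y = x0^2 + x1^2 = 13

# Backward pass: replay the tape in reverse to get dy/dx
y.backward()

print(x.grad)        # tensor([4., 6.]), i.e. dy/dx_i = 2 * x_i
```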

Lots of commonly used optimization methods are available, so you can just grab them pre-built and run with them.
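For example, fitting a one-parameter line with the pre-built SGD optimizer (torch.optim also provides Adam, RMSprop, Adagrad, and others) might look something like this rough sketch, assuming PyTorch is installed:

```python
import torch

# Toy data: y = 2x, which a single Linear layer can fit
xs = torch.tensor([[1.0], [2.0], [3.0]])
ys = 2.0 * xs

model = torch.nn.Linear(1, 1)
loss_fn = torch.nn.MSELoss()

# Grab a pre-built optimizer and hand it the model's parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(200):
    optimizer.zero_grad()           # clear gradients from the last step
    loss = loss_fn(model(xs), ys)   # forward pass
    loss.backward()                 # autograd computes the gradients
    optimizer.step()                # optimizer updates the parameters

print(model.weight.item())  # should end up close to 2.0
```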

PyTorch offers:
A production-ready C++ runtime environment.
The experimental TorchServe tool for deploying PyTorch models at scale.
Distributed training accessible from Python and C++.
Experimental deployment on iOS and Android mobile devices.
Native ONNX support.
A C++ front end.
Cloud support (Alibaba, AWS, Google Cloud, Microsoft Azure).

We'll be discussing PyTorch in more detail in later posts. It's important enough that we need to be comfortable using it as well as the Keras - TensorFlow pathway we have previously discussed here.
