Moving Towards Cross Platform Neural Network Deployment

Suppose you are an application developer who builds cross platform applications. You write your program so that it runs on different desktop platforms (macOS, Windows, Linux), and perhaps on mobile platforms as well (iOS, Android). Now you want to incorporate neural network models into your cross platform application. How does one do this?

If you are just writing a C++ application, Qt provides a great solution. You can write one code base and target all of the above platforms for deployment. This is great, it works, and it isolates you from platform specific distractions that would otherwise lock your application into one specific platform.

And the upcoming Qt 6 provides a great way to isolate you from the GPU specifics internal to these various platforms. These platforms can be very different as you dive deep into their internal architectures. For example, Apple has deprecated OpenGL and is focusing on its proprietary Metal API instead. OpenGL is getting long in the tooth, and Vulkan is its successor. Windows is pushing Direct3D 11.

So if you try to write code yourself to target all of them, you will sink into a morass of frustration and wasted time, because they are all so very different internally.

Qt 6 provides an abstraction layer to hide these platform specific GPU details from the application developer. So you can just write your cross platform application with one code base, and then have Qt do all of the grunt work to get everything running on the different platform APIs.

So my question is, can we come up with something similar for at least deploying trained neural networks? Maybe even doing training as well if we can take advantage of the GPUs on these different platforms.

Think about what a huge win this would be for everyone, both application developers and the people who use these cross platform applications.

I would argue that on some level neural network deployment is a mess right now.  How can we clean it up?

PyTorch, TorchScript, and JIT

One interesting approach to the problem described above is going on within the development of the PyTorch system.

TorchScript is a way to create serializable and optimizable models from PyTorch code. Any TorchScript program can be saved from a Python process and loaded in a process where there is no Python dependency.

The goal is to eventually run everything from C++ code, with no Python in the loop.
Here's some more detailed info on getting PyTorch models into C++ code via TorchScript.
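
As a minimal sketch of that save and load round trip, the snippet below traces a small model, writes it to disk, and reads it back. The model, the function name trace_save_reload, and the file name are hypothetical stand-ins, not anything from the PyTorch tutorials. It uses torch.jit.trace, which records the operations run on one example input; torch.jit.script is the alternative when the model has data dependent control flow.

```python
import os
import tempfile


def trace_save_reload(path):
    """Trace a tiny model, save it as TorchScript, reload it, compare outputs."""
    # Imported inside the function so this file still loads without PyTorch.
    import torch
    from torch import nn

    # Hypothetical stand-in for a trained model.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()
    example = torch.rand(1, 4)

    # Record the operations run on the example input as a TorchScript module.
    traced = torch.jit.trace(model, example)
    traced.save(path)  # one archive holding both the code and the weights

    # Loading needs only the saved file, not the Python code that built the model.
    reloaded = torch.jit.load(path)
    with torch.no_grad():
        return torch.allclose(model(example), reloaded(example))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(trace_save_reload(os.path.join(tmp, "tiny_net.pt")))
```

On the C++ side, the same saved file is what LibTorch opens with torch::jit::load.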

PyTorch is also working on extending this concept to hardware accelerated mobile and ARM64 platforms, using the GPU technology available on these platforms.
That means GPU execution on iOS via Apple's Metal API. Note that this should extend to the Mac desktop as Apple transitions from Intel to ARM chips.
And GPU execution on Android via the Vulkan API.
Note the similarity to the general 'how do I target different GPU architectures using one API' question I discussed at the beginning of this post.

One thing that is unclear in my mind is how the extensions to PyTorch described above affect the fastai API for deep learning neural nets (specifically the issue of pumping a fastai model through the PyTorch-TorchScript pipeline). I'm still trying to wrap my head around this, but based on some reading through the fastai user forums, it seems as if some parts of the new fastai v2 API might break this PyTorch-TorchScript pipeline.

This is very unfortunate if true, because it means that at least some deep learning models generated with the fastai API are not going to flow through the above pipeline. This specifically has to do with the callbacks introduced by the fastai v2 API. I hope that Jeremy and crew were not a little bit too clever in how they constructed the fastai v2 API.

If this is true, it means that if you care about cross platform deployment (and deployment to C++ execution), you are ultimately going to have to drop back to writing your final models in more conventional PyTorch code.
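
One thing that might be worth trying before dropping back completely is to export only the underlying PyTorch module and leave the fastai wrapper behind. The sketch below is untested against real fastai models and rests on assumptions: that the Learner exposes its network as a plain torch.nn.Module in learner.model, and that any preprocessing done by the fastai data pipeline is reproduced separately at deployment time. The function name export_wrapped_model and the stand-in learner are hypothetical.

```python
from types import SimpleNamespace


def export_wrapped_model(learner, example, path):
    """Trace only the nn.Module held by a Learner-like object and save it."""
    import torch  # imported here so this file still loads without PyTorch

    # Assumption: `learner.model` is a plain torch.nn.Module. Training-time
    # callbacks hang off the learner, not the module, so they are not traced.
    module = learner.model.cpu().eval()
    traced = torch.jit.trace(module, example)
    traced.save(path)
    return traced


if __name__ == "__main__":
    import os
    import tempfile

    import torch
    from torch import nn

    # Stand-in for a fastai Learner: any object with a `.model` attribute.
    fake_learner = SimpleNamespace(model=nn.Linear(3, 1))
    with tempfile.TemporaryDirectory() as tmp:
        export_wrapped_model(fake_learner, torch.rand(2, 3), os.path.join(tmp, "net.pt"))
```

Whether this works for a given model depends on how much of its behavior lives in the module itself versus the fastai layers around it.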

PyTorch - ONNX

Another option to get from PyTorch code to cross platform deployment (Windows, Linux, Mac) is to convert a PyTorch model into ONNX. You can then use ONNX Runtime to run your ONNX model on your platform(s) of choice. Again, we are interested in cross platform solutions.

Here's a tutorial on Exporting a Model from PyTorch to ONNX and then Running it Using ONNX Runtime.
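
The export and run steps can be sketched roughly as follows. This is a minimal example rather than the tutorial's code: the tiny model, the function name export_and_run_onnx, and the tensor names "input" and "output" are all hypothetical, and it assumes the torch, numpy, and onnxruntime packages are installed (newer PyTorch versions may also need the onnx and onnxscript packages for export).

```python
import os
import tempfile


def export_and_run_onnx(path):
    """Export a tiny PyTorch model to ONNX and check it with ONNX Runtime."""
    # Imported inside the function so this file still loads without these packages.
    import numpy as np
    import onnxruntime as ort
    import torch
    from torch import nn

    # Hypothetical stand-in for a trained model.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()
    example = torch.rand(1, 4)

    # Write the model to an .onnx file; the tensor names are arbitrary labels.
    torch.onnx.export(
        model,
        (example,),
        path,
        input_names=["input"],
        output_names=["output"],
    )

    # Run the exported file on the CPU with ONNX Runtime.
    session = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    onnx_out = session.run(None, {"input": example.numpy()})[0]

    # Compare against the original PyTorch output.
    with torch.no_grad():
        torch_out = model(example).numpy()
    return bool(np.allclose(torch_out, onnx_out, atol=1e-5))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(export_and_run_onnx(os.path.join(tmp, "tiny_net.onnx")))
```

ONNX Runtime also ships bindings for other languages, including C++, so the same .onnx file can be loaded there without Python.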