Deep Learning Deployment Options - part 1

Deployment. Where the rubber hits the road. Where you can actually take all of this great theoretical stuff and turn it into something useful. Something usable by non-wizards, that is. By regular people who will be the customers for your fabulous custom deep learning widget.

So we've talked about various higher-level APIs like Keras that make it fairly easy to specify and train a deep learning network. Keras riding on the back of TensorFlow 2.0. So after you put together your training data set, specify your deep net, and train it up so it does a good job of modeling the statistics of your data set, what do you have at that point?

You have the set of connection weights that determine your trained net. And that information is going to be living in a file somewhere laid out according to some file specification for whatever format it is stored in.
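To make the "weights in a file" idea concrete, here is a minimal sketch in Python. The file layout below (a small JSON header followed by raw float32 values) is made up purely for illustration and is not what Keras or TensorFlow actually write to disk. The names save_weights and load_weights, and the layer names, are hypothetical too.

```python
import json
import os
import struct
import tempfile


def save_weights(path, weights):
    """Write {layer_name: [floats]} to disk in a toy, made-up format."""
    # The header records each layer's name and how many values it holds.
    header = json.dumps([[name, len(vals)] for name, vals in weights.items()])
    header = header.encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))  # 4-byte header length
        f.write(header)
        for vals in weights.values():
            f.write(struct.pack("<%df" % len(vals), *vals))  # raw float32 data


def load_weights(path):
    """Read the toy format back into a dict of lists."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<I", f.read(4))
        layout = json.loads(f.read(header_len).decode("utf-8"))
        weights = {}
        for name, count in layout:
            weights[name] = list(struct.unpack("<%df" % count, f.read(4 * count)))
    return weights


# Pretend these numbers came out of training.
trained = {"dense/kernel": [0.5, -1.25, 2.0], "dense/bias": [0.125]}
path = os.path.join(tempfile.mkdtemp(), "toy_model.bin")
save_weights(path, trained)
restored = load_weights(path)
```

The point is just that anything reading the file has to know the layout in advance. Real model formats are far more elaborate, but the same idea applies, which is why the format question matters so much for deployment.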

Now as we discussed before, Keras is great for hiding all of the gory TensorFlow details.  For example, let's say you want to iterate over your data set for training.

In TensorFlow, it's going to be something like this:

def train(model, dataset, optimizer):
  for x, y in dataset:
    with tf.GradientTape() as tape:
      # training=True is only needed if there are layers with different
      # behavior during training versus inference (e.g. Dropout).
      prediction = model(x, training=True)
      loss = loss_fn(y, prediction)  # Keras losses expect (y_true, y_pred)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

While in Keras, things are way simpler:

model.compile(optimizer=optimizer, loss=loss_fn)
model.fit(dataset)

Which is so much easier to comprehend and work with.

Now if you are working with Keras, saving your trained model is as simple as

# save the network to disk
print("[INFO] serializing network to '{}'...".format(args["model"]))
model.save(args["model"])

So, pretty simple. The model is saved to disk under whatever path was passed in as args["model"].

If you wanted to load the model into Keras again and use it to make predictions, it's as simple as

from tensorflow.keras.models import load_model

# load the pre-trained network
print("[INFO] loading pre-trained network...")
model = load_model(args["model"])

This is all great if your scheme for deployment involves distributing your trained model to someone else running Keras in Python on a hot-rodded development system somewhere. And of course all of your customers fall into that category: trained wizards with souped-up development systems. Oh wait...

Now in reality, your customers are probably normal people. And if your dream of deployment involves installing everything needed to run Keras and TensorFlow on their poor little home computer, you are in for a very rude awakening.

What you should be thinking about is how you are going to take your trained model and embed it in some desktop application. Or a phone or tablet app.

So you are looking for some kind of wrapper. A simple to use wrapper that lets you load your pre-trained model into it, and then you are off and running.

Fortunately there are some very interesting solutions for this kind of thing. OpenCV's DNN module is one of them. We will be looking into it much more deeply in later posts.

So say you want to load a TensorFlow-trained model into the DNN module. You could do something like this:

import cv2
net = cv2.dnn.readNetFromTensorflow('frozen_inference_graph.pb', 'graph.pbtxt')

Why two files? TensorFlow models usually have a fairly high number of parameters. Freezing is the process of identifying and saving just the required ones (graph, weights, etc.) into a single file that you can use later. So, in other words, it’s the TF way to “export” your model. The freezing process produces a Protobuf (.pb) file.

Additionally, OpenCV requires an extra configuration file derived from the .pb: the .pbtxt.

But wait, what about my Keras model?

Exactly. You need to convert it into a TensorFlow .pb and its associated .pbtxt so that DNN can read it. DNN currently doesn't read the Keras format. Hopefully they will fix that at some point.

At this point you may have noticed that I punted on telling you exactly how to do the conversion. Like all things that should be dead simple, it's a little bit more elaborate. Fun stuff for the HTC Toolkit and later posts.

Something like the DNN module is a great solution for both desktop and mobile applications. And it's cross-platform, which is even better.

If you are a glutton for punishment, there are various platform-specific options. Like Apple's Core ML. Now they are definitely pushing it heavily at their developers right now, and it seems great when they describe it. And they are so devoted to supporting deep learning research and development as a company that their official instructions for working with CUDA on Nvidia cards on Macs is to use Boot Camp to install Windows on a Mac and run it there. Oh wait...

A few years ago they would have told you to use OpenCL to do it. And boy are they doing a good job of supporting OpenCL these days, oh wait...

My inherent biases as a developer tend to focus on desktop or mobile applications. It's how I see the world, and it definitely does color my worldview. But there are other deployment options to consider.

You could be thinking of some kind of cloud-based system. So the fun computationally intensive stuff happens off in the cloud somewhere, and the results work their way back to a local app, or some web widget in a browser, or whatever.
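Here is a rough sketch of what that round trip looks like, using nothing but the Python standard library. Everything in it is an assumption made for illustration: the predict function is a stand-in weighted sum rather than a real trained net, the /predict path and the JSON field names are made up, and the "cloud" is just a server running on a local port.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Stand-in for a real trained model: a fixed weighted sum (hypothetical).
    weights = [0.5, -1.0, 2.0]
    return sum(w * x for w, x in zip(weights, features))


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"features": [1.0, 2.0, 3.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = {"prediction": predict(payload["features"])}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def query(url, features):
    # What the local app or web widget does: send inputs, get a result back.
    data = json.dumps({"features": features}).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    # Bypass any system proxy settings, since this demo talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())["prediction"]


# Run the "cloud" side on a free local port in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = query("http://127.0.0.1:%d/predict" % server.server_port,
               [1.0, 2.0, 3.0])
server.shutdown()
server.server_close()
```

In a real deployment the heavy model would sit behind the handler and the client would be your app. Note that the client side needs nothing but the ability to make an HTTP request, which is the whole appeal. It also needs a working network connection, which is the catch.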

Living on Maui, I tend to be dubious of things like that, since the power and internet vanished last night for hours while I was originally trying to get this post together, which is why it ended up being posted a day later. So the cloud is cool, but maybe not the best thing when you live on a remote island with dodgy power and internet connectivity and you just want your custom deep learning based application to work locally when you want it to work.

Another very cool option I recently became acquainted with is embedded deployment. I was going to primarily focus on that for this post, but it's already getting too long, so we'll save embedded options for a second post. And there are some very interesting options. Mind blowing perhaps. Deep learning neural nets running on super cheap hardware. Like a $35 Raspberry Pi. Stay tuned.
