HTC Seminar Series #9: Visualizing and Understanding Recurrent Networks

Today's HTC Seminar is by Andrej Karpathy on Visualizing and Understanding Recurrent Networks.  It was presented at the Deep Learning London Meetup, and can be seen here.

The audio quality is not the greatest in this video, and I could not find it on YouTube to embed directly into this post, but stick with it and you will learn quite a bit about how to build and use recurrent neural networks, including how to use them for sequential processing of a single image (building up sentence-level descriptions of the objects in an image is one example).

The talk covers the material presented in Andrej's web tutorial on The Unreasonable Effectiveness of Recurrent Neural Networks.  This tutorial and talk were put together in 2015, ancient times when it comes to the fast pace of neural network research, but they are still a great introduction to recurrent neural networks.  And the tutorial includes code examples you can run if you are comfortable with 2015-era Torch.

We'll cover another recurrent neural network tutorial in an upcoming post, one with more recent code examples built on deep learning APIs we are more familiar with.  Note that we've discussed systems like this before here, in our ongoing discussions of GPT-2 and of neural networks for processing and modeling audio.

Andrej was just finishing up his PhD at Stanford in 2015, where he studied AI in Fei-Fei Li's lab.  He then moved on to be a research scientist at OpenAI, and then to his current day job as director of artificial intelligence at Tesla.

So why might one be interested in these exotic recurrent neural networks anyway?  Because they are great at modeling and learning the statistics of data that changes over time, like text.  Train a recurrent neural network on Shakespeare plays, then set it loose, and it will babble new text that kind of sounds like it was written by Shakespeare.
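To make that concrete, here is a minimal numpy sketch of the vanilla RNN recurrence behind such a character generator: update a hidden state from the current character, softmax the output into next-character probabilities, sample, and feed the sample back in. The sizes and weight names here are illustrative assumptions, not Karpathy's actual Torch code, and the weights are untrained, so the output is pure babble.

```python
import numpy as np

# Hedged sketch of a vanilla RNN step (names/sizes are illustrative):
#   h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
#   y_t = softmax(W_hy h_t + b_y)  -> next-character probabilities
rng = np.random.default_rng(0)
vocab_size, hidden_size = 65, 100  # e.g. ~65 distinct characters in Shakespeare
W_xh = rng.normal(0, 0.01, (hidden_size, vocab_size))
W_hh = rng.normal(0, 0.01, (hidden_size, hidden_size))
W_hy = rng.normal(0, 0.01, (vocab_size, hidden_size))
b_h = np.zeros(hidden_size)
b_y = np.zeros(vocab_size)

def step(char_ix, h):
    x = np.zeros(vocab_size)
    x[char_ix] = 1.0                                 # one-hot input character
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)           # new hidden state
    logits = W_hy @ h + b_y
    p = np.exp(logits - logits.max())
    p /= p.sum()                                     # softmax over next char
    return p, h

# Sample a short sequence by feeding each sampled character back in.
h = np.zeros(hidden_size)
ix, sampled = 0, []
for _ in range(20):
    p, h = step(ix, h)
    ix = int(rng.choice(vocab_size, p=p))
    sampled.append(ix)
```

Training would adjust the three weight matrices by backpropagation through time so that the sampled indices, mapped back to characters, start to look like the training text.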

For example,
O, if you were a feeble sight, the courtesy of your law,
Your sight and several breath, will wear the gods

One fascinating thing about the particular recurrent neural network system discussed in the tutorial is that the network is fed individual characters of the text (blank spaces are also characters) one by one, in order.  From that raw stream it learns that characters are arranged into words, how sentences are punctuated, how words are grouped into paragraphs, how a script for a play puts a character's name before the block of text that character speaks, and so on.  The recurrent network learns all of the structure associated with the text.
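The input representation really is that simple. A sketch of the preprocessing (not Karpathy's Torch code; the snippet of text is just the sample above): build a vocabulary from the distinct characters that appear, then map the text to a stream of integer indices, which is all the network ever sees.

```python
# Turn raw text into the integer character stream a char-level RNN consumes.
text = "O, if you were a feeble sight, the courtesy of your law,"

# Build the vocabulary from the characters actually present; spaces and
# punctuation are characters too, just like letters.
chars = sorted(set(text))
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

# The network sees this index sequence, learning to predict index t+1
# from everything up to index t.
encoded = [char_to_ix[ch] for ch in text]
decoded = "".join(ix_to_char[i] for i in encoded)
```

The encoding is lossless (`decoded` equals `text`), so everything the network learns about words, punctuation, and layout is inferred from the index stream alone.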

Most people would probably approach this problem by inputting words as the data elements instead of raw characters.  It's a testament to how powerful these systems are that they can learn all of the structure of the text in addition to modeling the meaning and inter-relationships of its words.

We'll be continuing this discussion of recurrent neural networks in subsequent posts.
