Posts

Showing posts from November, 2020

HTC Education Series: Deep Learning with PyTorch Basics - Lesson 1

This is the second course in the HTC Education Series. It focuses on learning PyTorch and beginning deep learning. PyTorch is the HTC-recommended base framework to learn for programming deep learning neural nets (as opposed to using Keras and TensorFlow). So we thought it would be good to put together a very basic beginners' course on getting started with PyTorch programming. It's based on the Jovian.ai course called 'Deep Learning with PyTorch: Zero to GANs'. There is a video lecture associated with each lesson, and also an executable Jupyter notebook for each lesson, hosted on Jovian. There are several different options for running your Jupyter notebooks, but we recommend using Colab; Jovian has things set up to make this very easy. We will be working through the Jovian course lessons in this HTC Education Series course as they come out. A new lesson will be presented here each Monday until the course is finished. One nice thing about…
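To give a flavor of where the course begins, here is a minimal sketch (our own illustration, not code from the Jovian lessons) of the two ideas at the heart of PyTorch: tensors, and automatic differentiation of computations built from them.

```python
import torch

# Create a tensor, plus two parameter-like values that track gradients
x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor(4.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

# A tiny linear model, y = w * x + b, reduced to a scalar
y = (w * x + b).sum()

# Autograd computes dy/dw and dy/db for us
y.backward()
print(w.grad)  # tensor(6.) == sum(x)
print(b.grad)  # tensor(3.) == number of elements in x
```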

A Conversation with Elon Musk

This is a podcast of a conversation between Lex Fridman and Elon Musk. They discuss a wide variety of topics, including artificial intelligence, self-driving cars, the Neuralink brain-computer interface, and the rise and fall of civilizations. Let's listen in. If you enjoyed the above conversation, there's another one we can listen in on from a year ago, which focuses a little more heavily on Tesla Autopilot.

Creative Adversarial Networks (CAN) - Generating Art by Learning About Styles and then Deviating from Style Norms

What is art? What is creativity? What is 'aesthetically appealing'? These are fascinating questions that perhaps tell us more about what it is like to be human. Which poses the following question: is it possible to remove the human completely from the art-making process and still generate what humans would consider art? Here's a link to a blog article on this topic. It points out something interesting, which is that conventional GAN systems might be great at creating new images of a particular type (faces, landscapes, furniture), but in some sense they aren't really being creative. At least not in a certain artistic sense, because they aren't breaking away from the representations they were trained to generate. The developers of these GAN systems would probably reply, of course not, because that would defeat the whole point of our trained GAN system, which is to generate things that look like the type of images it was trained on. But in some sense, the whole…

Moving Towards Cross Platform Neural Network Deployment

Suppose you are an application developer who develops cross-platform applications. So you write your program in a way that allows it to run on different desktop platforms (Mac, Windows, Linux), and perhaps also targets mobile platforms (iOS, Android). And now you want to incorporate neural network models into your cross-platform application. How does one do this? Now if you are just writing a C++ application, Qt provides a great solution for you, because you can write one code base and target all of the above platforms for deployment of your application. This is great: it works, and it isolates you from the platform-specific distractions that would otherwise tie your application to one specific platform, unable to escape it. And the upcoming Qt 6 provides a great way to isolate you from the GPU specifics internal to these various platforms, platforms which can all be very different as you dive deep into their internal architectures. For example…
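On the neural net side, one common framework-neutral route (our own illustration here, not something from the post) is to export a trained model to the ONNX interchange format, which portable inference runtimes on all of the platforms above can load. A minimal sketch, assuming PyTorch and torchvision are installed:

```python
import torch
import torchvision

# Illustrative only: export a pretrained model to ONNX, a portable format
# that runtimes on macOS, Windows, Linux, iOS, and Android can load.
model = torchvision.models.resnet18(pretrained=True)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)  # example input shape
torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["image"], output_names=["logits"])
```

The exported file can then be loaded by an ONNX-capable runtime from the same C++ code base on each target platform.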

Sketch to Art Style Transfer Techniques - How to Bring the Artist into the Process

How can artists become more directly involved in working with and controlling different deep learning based style transfer techniques? This is an important consideration, as many deep learning systems and algorithms are developed by computer scientists or engineers, who oftentimes have very different sensibilities than people trained with a traditional art background (artists). Several different approaches were presented and discussed at Siggraph 2020. Let's take a look at two of them that were presented at the Real-Time Live demo session. Both of these are from sections of a much longer video that presents a wide variety of different live demos, and while the rest of the longer video does not directly relate to today's post topic, feel free to watch them if you are interested. The first live demo is called 'Interactive Video Stylization Using Few-Shot Patch-Based Training' by Ondrej Texler. The second live demo is called 'Sketch-To-Art: Synthesizing Stylized Art…'

HTC Seminar Series #21 - From Research to Production with PyTorch

Today's HTC seminar is presented by Jeff Smith of Facebook. Jeff discusses some of the latest features from PyTorch - the TorchScript JIT compiler, distributed data parallel training, TensorBoard integration, new APIs, and more. He talks about some projects coming out of the PyTorch ecosystem like BoTorch, Ax, and PyTorch BigGraph. He also presents some of the use cases and industries where people are successfully taking PyTorch models to production. This presentation digs into a key question: how do you deploy your deep learning neural net system into actual production? How do you get it into an actual product that actual people can use? Here's a tutorial on 'Loading a TorchScript Model in C++'.
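The Python half of that tutorial's workflow is short enough to sketch here (a minimal sketch under the usual assumptions, not the tutorial's exact code): trace a model into TorchScript and serialize it, so that C++ code can later load it with torch::jit::load().

```python
import torch
import torchvision

# Python side of the TorchScript workflow: trace a model and serialize it.
# The saved file can then be loaded from C++ via torch::jit::load().
model = torchvision.models.resnet18(pretrained=True)
model.eval()

example = torch.randn(1, 3, 224, 224)   # example input used for tracing
scripted = torch.jit.trace(model, example)
scripted.save("resnet18_traced.pt")
```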

Computational Video Editing for Dialog Driven Scenes

Non-linear editing is of particular interest to me (given my background as one of the original developers of the Pro Tools digital audio editing system). So when I watched the video below and read the associated paper, I got very excited. Let's take a look at a demo and then we can comment further. So we're looking at an experimental system to edit dialog-driven scenes in a video. Conventionally this kind of editing is done in a non-linear video editing system; Avid, Final Cut, or Premiere would be common systems used for almost everything you watch today. All of which have their humble beginnings back in the golden olden times when we were doing that original Pro Tools development effort. And as they say in the video demo above, hand editing of this kind of scene by a human editor using one of these non-linear editing systems is kind of a pain. Time consuming, tedious. So developing semi-automated systems that can do a lot of the manual grunt work that is needed to create…

HTC Education Series: Getting Started with Deep Learning - Lesson 9

Our journey into getting started with deep learning continues. Today, we head upstream into uncharted territory. What I mean by that is that the first 8 lessons covered the complete fastai v2 Part 1 course, which we have now finished. We're all anxiously awaiting Jeremy and crew coming out with their v2 Part 2. What to do until that happens? There are still a ton of cool things to learn. And as it turns out, there are several previous years of the fastai course we can mine for additional information on these topics. Along with our additional HTC-specific material that we always include in our lessons (which is not included in the fastai course). Today's first lecture is one by Jeremy of fastai from last year: Lesson 7 from the 2019 fastai course. We'll be covering things like 'skip connections', and the fabulous U-Net architecture, which uses skip connections internally to dramatically improve image segmentation results usin…
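For a feel of what a skip connection is, here's a minimal sketch of our own (a residual-style block, not the lecture's notebook code): the input bypasses the convolutions and is added back to their output, giving fine detail and gradients a short path through the network.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """A minimal residual block: the input 'skips' past the convolutions
    and is added back onto their output."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # the skip connection

block = ResBlock(16)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```

U-Net's skip connections concatenate encoder features onto decoder features rather than adding them, but the motivation is the same: give high-resolution detail a direct path past the middle of the network.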

The Neuroscience of Optimal Performance

This is a little bit different from our normal HTC posts, but pretty interesting. Lex Fridman interviews Andrew Huberman. Andrew is a neuroscientist in the Department of Neurobiology at Stanford University School of Medicine. Part of this talk is about how to train your brain to achieve optimal performance (something I think everyone would be interested in understanding more about). They also discuss other topics including overcoming fears, the human visual system, Elon Musk's Neuralink, and the science of consciousness. Let's dive in. If you go to the YouTube page for the video, there are various links to more information on Andrew. There is a specific HTC post on Elon Musk's Neuralink system if you are interested. Here's another talk by Andrew on Good Mental Health and Tools for the Brain. This is a very non-technical talk, and would be good for your typical Maui resident.

How Computer Graphics Expertise Will Further the State of the Art in Machine Learning

Today's presentation is a talk by Martin Wicke of Google on 'How Computer Graphics Expertise Will Further the State of the Art in Machine Learning'. It was presented at Siggraph 2019. Martin is in the TensorFlow group at Google. The session focuses on the success of deep learning in solving many problems that had long defied solution. As such, machine learning has become an essential tool in many disciplines, something that is evidenced by recent increases in the number of papers that use related tools at SIGGRAPH. Martin discusses knowledge transfer in the opposite direction: What insights can experts in computer graphics bring to the field of machine learning? How can knowledge about geometry, rendering, simulation, or perception be used to further the state of the art in machine learning?

Datasets for Understanding the 3D World: Introducing the Objectron Dataset

Understanding objects in 3D space is a challenging task. Part of the problem is that many of the existing real-world datasets one can use to train deep learning nets are based on 2D images. There need to be more datasets focused on capturing the 3D structure of objects. At the same time, you'd like to organize this data so that it can easily be used as input to machine learning algorithms (like deep learning neural nets). One approach to doing this is to create object-centric video clips, and then use them to build your dataset. That is the objective of Google's new Objectron dataset. You can read all about it here. Objectron is a collection of short video clips that capture object views from different angles. Each video clip is accompanied by AR session metadata that includes camera poses and sparse point clouds. The data also contain manually annotated 3D bounding boxes for each object, which describe the object's position, orientation, and dimensions.
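To make that bounding-box annotation concrete, here's a hypothetical helper of our own (not the Objectron API) that turns the same position/orientation/dimensions parameterization into the 8 corner points of a box:

```python
import numpy as np

def box_corners(center, dimensions, rotation):
    """Hypothetical helper (not the Objectron API): compute the 8 corners
    of a 3D bounding box from its center, (w, h, d) dimensions, and a
    3x3 rotation matrix."""
    w, h, d = dimensions
    # Corner sign pattern of a unit box centered at the origin, one row per corner
    signs = np.array([[sx, sy, sz] for sx in (-1, 1)
                                   for sy in (-1, 1)
                                   for sz in (-1, 1)])
    local = 0.5 * signs * np.array([w, h, d])
    return local @ rotation.T + np.asarray(center)

corners = box_corners(center=[0, 0, 2], dimensions=[0.3, 0.2, 0.1],
                      rotation=np.eye(3))
print(corners.shape)  # (8, 3)
```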

What's Next for fastai, with Jeremy Howard

There's a very interesting and informative recent interview with fastai's Jeremy Howard by Sam Charrington of TWIML AI. I first heard it on the TWIML AI Podcast, which you can access here. It's also available in a slightly longer format as a video if you want to see everyone's smiling faces in the Zoom interview. Jeremy gets into several really interesting topics, which we're going to briefly talk about below. We'll expand out to provide a little more info for some of these topics. But you should really watch the interview, or listen to the podcast, to hear the whole thing. The interview starts off with some info on Jeremy's background, and then dives into a conversation focused on the fastai v2 release and its associated book 'Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD'. The future of fastai is then discussed. As we alluded to in a recent HTC Seminar Series post that was an interview with Chris Lattner,…

HTC Seminar Series #20 - Top 10 User Interface Trends 2020

Today's seminar is a change of pace. We will be looking at the top 10 UI trends of 2020. For each UI trend, the presenters overview what it looks like, and why. Then an example is shown of how to use Qt Design Studio to build the UI trend using Qt and QML. This webinar provides an overview of the Top 10 User Interface Trends that modern UI designers have been developing and implementing during 2020. The trends discussed are based on observations and research from the end of 2019 and throughout 2020, and they are applicable to all types of user interfaces: mobile, web, desktop, and embedded. As a demonstration, the speakers guide attendees on how to apply each unique design trend using Qt Design Studio. They do try to pitch a paid course at the end, one that has already taken place. Speakers: Dr. Antti Aaltonen is the Head of User Experience at the Qt Company and leads the design of tools for UI/UX designers. He has always enjoyed working…

OpenAI Microscope - Visualizing Neural Networks

So OpenAI has a new tool they have made available called Microscope. It lets you visualize every significant layer and neuron in 8 different deep learning neural networks used for processing images. They are AlexNet, AlexNet (Places), Inception V1, Inception V1 (Places), VGG19, Inception V3, Inception V4, and ResNet v2 50. The OpenAI Microscope is based on two concepts: a location in a model, and a technique. Metaphorically, the location is where you point the microscope, and the technique is what lens you affix to it. The models are composed of a graph of "nodes" (the neural network layers), which are connected to each other through "edges." Each node contains hundreds of "units", which are roughly analogous to neurons. Most of the techniques are useful only at a specific resolution. For instance, feature visualization can only be pointed at a "unit", not its parent "node". Check it out for ResNet v2 50. This tool is amazing! Currently they offer DeepDream and Caricature visualizatio…
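The general idea behind feature visualization is activation maximization: optimize an input image so that a chosen unit fires strongly. Here's a bare-bones sketch of our own (illustrative only, not OpenAI's Microscope implementation, which adds regularizers and transformations to get clean images; the layer and channel indices are arbitrary):

```python
import torch
import torchvision

# Optimize a random image to excite one channel of one VGG19 layer.
model = torchvision.models.vgg19(pretrained=True).features.eval()
target_layer, target_channel = 20, 7  # arbitrary choices for illustration

img = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)

for _ in range(100):
    opt.zero_grad()
    x = img
    for i, layer in enumerate(model):  # run forward up to the target layer
        x = layer(x)
        if i == target_layer:
            break
    loss = -x[0, target_channel].mean()  # maximize the mean activation
    loss.backward()
    opt.step()
```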

HTC Education Series: Getting Started with Deep Learning - Lesson 8

This lesson starts off with a great lecture by Jeremy of fastai on Natural Language Processing (NLP), using deep learning nets and the fastai API. So we'll be covering tokenization of text data. We'll be doing a deep dive into how to code recurrent neural nets (RNNs) using the fastai API. We'll cover methods like LSTM, which were developed to prevent vanishing and exploding gradients in RNNs. You can also watch this first fastai video lecture on the fastai course site here. The advantage of that is that you can access on that site a searchable transcript, interactive notebooks, setup guides, questionnaires, etc. What is covered in this lecture? (See the sketch after this list for the model pieces in code.)
- Natural language processing with the fastai API
  - recurrent neural networks
- The advantages of starting with a pre-trained model
  - Wikipedia text language model
- Fine-tuning the pre-trained model on data more directly related to your desired task
  - IMDb movie review text model
- Text pre-processing
  - word, sub-word, and character based
- Putting…
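A minimal sketch of our own of the pieces this lesson names (the vocabulary size and dimensions are made up for illustration): token ids go through an embedding, an LSTM, and a linear head that predicts next-token logits.

```python
import torch
import torch.nn as nn

class TinyLanguageModel(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                # tokens: (batch, seq_len)
        out, _ = self.lstm(self.emb(tokens))  # out: (batch, seq_len, hidden)
        return self.head(out)                 # next-token logits per step

model = TinyLanguageModel()
logits = model(torch.randint(0, 1000, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 1000])
```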

Qt 6.0 Beta Released

The upcoming Qt 6 is very exciting. The improvements they have made really make it shine as a single application framework for developing cross-platform applications that can run on Mac, Windows, Linux, iOS, and Android. You can also target embedded systems. Qt is a C++ development framework. It includes a declarative user interface markup language called QML. Qt 6 really builds up the C++ integration with QML in exciting new ways (bringing the best of QML to C++). Latest information update: Qt 6 has hit another significant milestone. The Qt 6.0 Beta is released, putting ease of mind into everyone counting on a Qt 6 release in early 2021. Here is Lars Knoll's Nov 2020 keynote talk on 'Qt 6 - What's Cooking'. It details the latest up-to-the-minute status of the imminent Qt 6. For more information on what improvements Qt 6 will bring, Lars gave a longer keynote at the end of 2019 that is very informative. This longer keynote presentation is titled 'Qt 6 will bring massive improvements to…'

Generative Adversarial Networks for Image Synthesis and Transformation

We're continuing our exploration of generative neural models with this presentation by Jan Kautz from Nvidia on 'Generative Adversarial Networks for Image Synthesis and Translation'. This presentation is from November 2019. It's a good overview of some of the different GAN approaches for generative synthesis we've recently been looking into. I find the use of 'translation' to be confusing, since it typically refers to an affine transformation, so the post is titled using our preferred 'transformation' terminology. We've got a GAN double header today. Our second presentation is from DeepMind, and is a Deep Learning Lecture on the topic of 'Generative Adversarial Networks'. Mihaela Rosca presents Part 1, 'Overview', and Part 2, 'Evaluating GANs'. Jeff Donahue then presents The GAN Zoo, which has a Part 3.1 on 'Image Synthesis with GANs: MNIST to ImageNet', and a Part 3.2 on 'GANs for Representation Learning…'
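Before diving into the zoo of variants, it helps to have the core GAN recipe in mind. Here's a toy sketch of our own (made-up sizes, 2-D toy data, not any of the presented systems): a generator maps noise to samples, a discriminator scores real versus fake, and the two are trained against each other.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) + torch.tensor([3.0, 3.0])  # toy "real" data
    fake = G(torch.randn(64, 8))

    # Discriminator step: push real toward 1, fake toward 0
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # Generator step: fool the discriminator into outputting 1 on fakes
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()
```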

Big Sur has Arrived - Apple's Transition to ARM Processor Macs

Apple released their new macOS 11 Big Sur operating system update today. So note right away that we went from macOS 10.15 to macOS 11: a full version bump from 10 to 11, with the old OS X branding fully retired in favor of macOS. Fun fact: it reports itself internally as 10.16 in some Mac APIs, to avoid potential problems associated with code badly tied to version numbers. Big Sur is the latest wave associated with a fundamental underlying change in the personal computer world. A wave that is a lot bigger than just going from 10 to 11 in marketing speak. Because you can also order new Macs from Apple this week that use Apple's own ARM RISC processors rather than Intel CISC processors. The same ARM RISC technology that iOS devices like iPhones and iPads have been using for quite a while now. So now we have 3 different models of Macs for sale that use Apple's new M1 processor. This is a big deal, and is the reason why the marketing speak went from 10 to 11 (Spinal Tap…

Lecture on Visualization, Deep Dream, Neural Style, and Adversarial Examples

2016 must have been a very good year, when Andrej Karpathy taught Stanford CS231n. We're presenting his 2016 Lecture 9 on 'Visualization, Deep Dream, Neural Style, and Adversarial Examples'. Enjoy it as one of the varied points of view that different deep learning experts bring to this material. This lecture complements what we have been covering in the HTC course on Getting Started with Deep Learning, and I might move a copy of it directly into the HTC lesson's Extra Material section somewhere (maybe as a replacement for the RNN lecture this week, or an extra thing to check out).

Recent Advances in Unsupervised Image-to-Image Transformation

Today's talk is by Xun Huang from Cornell University (and Nvidia) on 'Recent Advances in Unsupervised Image-to-Image Translation', and was given at Microsoft Research in Sept 2019. Note that for the title of the post we're using the HTC-preferred terminology of 'transformation' for this subject area, since 'translation' is typically thought of as an affine transformation. The project page for the talk is available here. Here's a link to some of Xun's publications. He also has a personal page with a lot of info on it. Abstract for this talk: Unsupervised image-to-image translation aims to map an image drawn from one distribution to an analogous image in a different distribution, without seeing any example pairs of analogous images. For example, given an image of a landscape taken in the summer, one may want to know what it would look like in the winter. There is not just a single answer. One could imagine many possibilities due to differences in weather, timing, l…

Semantic Image Synthesis with Spatially Adaptive Normalization

A lot of work has been done developing deep learning nets that take an image as input and output a series of tags for objects present in the image. Less studied, but perhaps more interesting, is the notion of a deep learning net that takes a textual description of an image or scene and then outputs an image that matches the description. So let's take a deep dive into a recent paper that takes a textual description of some imaginary image and then generates an artificial image that looks like the description. And this particular approach adds an additional constraint, a segmentation map constraint to be precise. And being precise is kind of the whole point of this. The paper describing this 'Semantic Image Synthesis with Spatially-Adaptive Normalization' can be found here. A project page called 'Semantic Image Synthesis with SPADE' can be found here. Here's a short video presentation of 'GauGAN: Semantic Image Synthesis…'
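The core trick in the paper is the spatially-adaptive normalization layer itself. Here's a stripped-down sketch of the idea (our own illustration with made-up layer sizes, not the paper's exact architecture): normalize the activations with a parameter-free BatchNorm, then modulate them with a per-pixel scale and shift predicted from the segmentation map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpadeNorm(nn.Module):
    def __init__(self, channels, seg_channels, hidden=64):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)  # no learned affine
        self.shared = nn.Sequential(
            nn.Conv2d(seg_channels, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, x, seg):
        # Resize the segmentation map to match the feature resolution,
        # then predict a per-pixel scale (gamma) and shift (beta) from it.
        seg = F.interpolate(seg, size=x.shape[2:], mode="nearest")
        h = self.shared(seg)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)

layer = SpadeNorm(channels=32, seg_channels=10)
out = layer(torch.randn(4, 32, 16, 16), torch.randn(4, 10, 64, 64))
print(out.shape)  # torch.Size([4, 32, 16, 16])
```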

HTC Education Series: Getting Started with Deep Learning - Lesson 7

Another exciting lesson in HTC's 'Getting Started with Deep Learning' course begins. Following the usual presentation layout of our lesson material, we will start with the fastai Part 1 2020 Lesson 7 lecture. This lecture continues the extremely fascinating discussion started last week about how to generate a latent space associated with a collaborative filtering model. We then move on to the topic of tabular data. We will take a diversion into random forests, where we will learn more about random forests than one might have expected, this being a lead-up to the notion of using neural networks for tabular data. You can also watch this first fastai video lecture on the fastai course site here. The advantage of that is that you can access on that site a searchable transcript, interactive notebooks, setup guides, questionnaires, etc. What is covered in this lecture? (See the sketch after this list for the collaborative filtering and weight decay pieces in code.)
- regularization
- weight decay (L2 regularization)
- we need to wrap with nn.Parameter() in PyTorch to mak…
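As a concrete anchor for the collaborative filtering and weight decay topics, here's a minimal dot-product model sketch of our own (made-up user/movie counts, not the lecture's notebook code), with L2 regularization applied via the optimizer's weight_decay parameter:

```python
import torch
import torch.nn as nn

class DotProductCF(nn.Module):
    """Each user and each movie gets a learned latent factor vector;
    the predicted rating is their dot product."""
    def __init__(self, n_users=100, n_movies=50, n_factors=20):
        super().__init__()
        self.user_factors = nn.Embedding(n_users, n_factors)
        self.movie_factors = nn.Embedding(n_movies, n_factors)

    def forward(self, users, movies):
        return (self.user_factors(users) * self.movie_factors(movies)).sum(dim=1)

model = DotProductCF()
# weight_decay is the L2 regularization penalty the lecture discusses
opt = torch.optim.Adam(model.parameters(), lr=5e-3, weight_decay=0.1)

users = torch.randint(0, 100, (32,))
movies = torch.randint(0, 50, (32,))
print(model(users, movies).shape)  # torch.Size([32])
```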

Running Jupyter Notebooks on Gradient

In previous posts we have discussed Jupyter notebooks. We have also discussed Colab notebooks, which are Jupyter notebooks hosted by Google. Gradient is an alternative to Colab for running Jupyter notebooks in the cloud. Like Colab, they offer a free tier. They also have additional paid tiers with added features for professionals. Let's watch the Welcome to Gradient video for a quick overview of what Gradient is all about. Gradient actually has way more features than we are going to use in our HTC 'Getting Started with Deep Learning' course. But it's fascinating to learn about them, and someday they may come in handy for you. Here's a link to the Quick Start documents for working with Paperspace Gradient. There's a table of contents on the left side you can use to move around to different topics of interest. Again, this document discusses many additional Gradient features we will not use in the HTC Deep Learning course. You can use this document for yo…