Automated Video Meeting AI

One of the things we're interested in here at HTC is different approaches to building AI Bots. As a society, we seem to be diving headfirst into virtual meetings and telepresence as a normal part of daily life, which immediately leads you to start thinking about how AI Bots are going to integrate (or hijack) themselves into your daily stream.

As a joke the other day, I mentioned how a certain government leader could be accurately emulated in an AI Bot by modeling their repetitive speech patterns, rambling and ever-changing story line, and so on, together with a set of procedural image transformations to animate the visual part of the AI Bot. I was thinking of morphing transformations between different states in a random state machine.
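To make the random state machine idea a little more concrete, here is a minimal TypeScript sketch. The pose names and the transition table are purely hypothetical, and the morphing between images is left out entirely. Only the random walk over states is shown.

```typescript
// Hypothetical pose names; the post does not specify any.
type Pose = "neutral" | "nodding" | "gesturing" | "frowning";

// Allowed transitions between poses (an illustrative choice, not from the post).
const transitions: Record<Pose, Pose[]> = {
  neutral: ["nodding", "gesturing", "frowning"],
  nodding: ["neutral", "gesturing"],
  gesturing: ["neutral", "frowning"],
  frowning: ["neutral"],
};

// Pick the next pose uniformly at random from the allowed successors.
// `rand` must return a number in [0, 1); it defaults to Math.random.
function nextPose(current: Pose, rand: () => number = Math.random): Pose {
  const options = transitions[current];
  const index = Math.min(Math.floor(rand() * options.length), options.length - 1);
  // Fall back to the current pose if the lookup ever comes back empty.
  return options[index] ?? current;
}

// Walk the state machine for `steps` transitions and return every pose visited,
// including the starting one.
function walk(start: Pose, steps: number, rand: () => number = Math.random): Pose[] {
  const path: Pose[] = [start];
  let current = start;
  for (let i = 0; i < steps; i++) {
    current = nextPose(current, rand);
    path.push(current);
  }
  return path;
}
```

Calling `walk("neutral", 10)` returns a random sequence of eleven poses. An animation layer could then morph from each pose image to the next one in the sequence.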

Well, lo and behold, I wake up the next morning and someone has already done it. Matt Reed put together a digital twin to attend Zoom conference calls in his absence.

It was actually pretty straightforward. He just recorded a few video shots of himself using QuickTime Player (Mac only, but you get the idea) and then saved them as a set of movie files for the AI Bot.

He then used ManyCam to set up a virtual webcam. The virtual webcam could then log into Zoom as a video conference attendee.

Reed had already been working with Artyom.js, a speech recognition and text-to-speech library, for another project. So, he knew it had the capability to listen for key phrases and speak back anything you program it to.

"It's basically a library to build your own custom voice assistant using standard web technologies," he says. "The best part is, you don't have to give it a wake word like 'Alexa' or 'Siri' to get it started. It listens for just the command phrases like 'How are you?' or 'Are you OK?' or 'Could you send that?'—which triggers a chain of commands like cycling through the stills of my face and speaking back the response text."
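The phrase-to-response loop Reed describes can be sketched in a few lines of TypeScript. This is only an illustration of the idea and does not use the Artyom.js API: the transcript is passed in as a plain string, and the response texts, still file names, and function names are all made up for the example. Only the trigger phrases come from the quote.

```typescript
// A command pairs trigger phrases with the text to speak back.
interface Command {
  phrases: string[];
  response: string;
}

// Trigger phrases are taken from the quote; the responses are hypothetical.
const commands: Command[] = [
  { phrases: ["how are you", "are you ok"], response: "Doing fine, thanks." },
  { phrases: ["could you send that"], response: "Sure, I will send it after the call." },
];

// Lowercase and strip punctuation so "Are you OK?" matches "are you ok".
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9\s]/g, "").replace(/\s+/g, " ").trim();
}

// Return the first command with a phrase that occurs in the transcript, if any.
function matchCommand(transcript: string, list: Command[] = commands): Command | undefined {
  const heard = normalize(transcript);
  return list.find((c) => c.phrases.some((p) => heard.includes(normalize(p))));
}

// Hypothetical file names for the recorded stills.
const stills: string[] = ["still-1.png", "still-2.png", "still-3.png"];
let stillIndex = 0;

// Handle one transcript. On a match, advance to the next still and return
// what to show and what to say. With no match, return undefined and change nothing.
function handle(transcript: string): { still: string; say: string } | undefined {
  const command = matchCommand(transcript);
  if (!command) return undefined;
  stillIndex = (stillIndex + 1) % stills.length;
  return { still: stills[stillIndex] ?? "", say: command.response };
}
```

For example, `handle("Hey Matt, are you OK?")` returns the next still together with the canned reply, while a sentence containing none of the phrases returns `undefined`. In a real bot the transcript would come from the speech recognizer, and the `say` text would go to the text-to-speech engine. The matching here is a plain substring test, so a real bot would probably need something stricter.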

Want to build your own Zoombot? You can get a copy of it here and customize it.
