OpenAI CLIP - Connecting Text and Images

January 18, 2021

So the big news in deep learning AI this last week was the announcement of OpenAI's DALL-E and the associated companion work on the CLIP algorithm. We already have one post on DALL-E, which is a generative model architecture for creating an image from a textual description.

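CLIP is a deep learning model trained with a contrastive objective function to match an image with a textual description of what is in it. That is pretty slick in itself. But the resulting model can be turned into an arbitrary zero-shot classifier for new tasks. It's like transfer learning, but slightly different: instead of fine-tuning on labeled examples for the new task, you write the candidate classes as short text prompts and let the model pick the prompt that best matches the image.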
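To make the contrastive objective and the zero-shot trick a bit more concrete, here is a minimal sketch in plain Python. It is not OpenAI's code: the function names are made up for illustration, the tiny 3-dimensional vectors stand in for the outputs of real image and text encoders, and the temperature of 0.07 is just an assumed placeholder value. The idea is that matching image/text pairs should score higher than every mismatched pair in the batch, and that classification reduces to picking the text prompt whose embedding is closest to the image embedding.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_matrix(image_embs, text_embs, temperature=0.07):
    """Cosine similarity of every image with every text, divided by a temperature."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy, assuming row k of each list is a matching pair."""
    logits = similarity_matrix(image_embs, text_embs, temperature)
    n = len(logits)
    # Image -> text: each image should pick its own caption out of the batch.
    loss_i = -sum(math.log(softmax(logits[k])[k]) for k in range(n)) / n
    # Text -> image: each caption should pick its own image out of the batch.
    columns = [list(col) for col in zip(*logits)]
    loss_t = -sum(math.log(softmax(columns[k])[k]) for k in range(n)) / n
    return (loss_i + loss_t) / 2

def zero_shot_classify(image_emb, prompt_embs, temperature=0.07):
    """Return (best_index, probabilities) over text prompts, one prompt per class."""
    probs = softmax(similarity_matrix([image_emb], prompt_embs, temperature)[0])
    return probs.index(max(probs)), probs

if __name__ == "__main__":
    # Hypothetical embeddings for the prompts "a photo of a dog / cat / car".
    prompt_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
    # Hypothetical embedding of a new, unlabeled image.
    image_emb = [0.9, 0.2, 0.1]
    best, probs = zero_shot_classify(image_emb, prompt_embs)
    print("predicted class index:", best)
    print("class probabilities:", probs)
```

Notice that nothing is retrained to get a classifier for a new task. You only swap in a different list of prompts, which is what makes the approach feel like transfer learning without the fine-tuning step.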

Yannic Kilcher gives us the lowdown on the CLIP algorithm. Let's check it out.

Here's a link to the 'Learning Transferable Visual Models from Natural Language Supervision' paper.

Here's a link to the PyTorch CLIP code.
