Transformers for Image Recognition

In this video, Yannic walks us through a new paper, currently under review, called 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'. In the process, he schools us on what is really going on inside the Transformer architecture.

That alone makes the video worth watching. Like most things, the Transformer is a lot easier to understand once you see what is really going on.


