Pretrained Transformers as Universal Computation Engines

An interesting analysis of a recent paper on using frozen pretrained transformers as a fixed prior architecture for function approximation. So what set of basis functions did it learn?

