Pretrained Transformers as Universal Computation Engines

An interesting analysis of a recent paper on using frozen pretrained transformers as a fixed prior architecture for function approximation. So what basis function set did it learn?
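The core idea can be illustrated with a much simpler stand-in. Below is a minimal sketch, not the paper's actual setup: a fixed random nonlinear projection plays the role of the frozen transformer body (a fixed basis-function set), and only a linear readout is trained, analogous to fine-tuning just the input/output layers. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_features(x, W, b):
    # Fixed nonlinear basis: W and b are never trained,
    # standing in for the frozen pretrained layers.
    return np.tanh(x @ W + b)

# A toy target function to approximate.
X = rng.uniform(-1, 1, size=(256, 1))
y = np.sin(3 * X).ravel()

# Frozen "body": random projection into a wide feature space.
W = rng.normal(size=(1, 128))
b = rng.normal(size=128)
Phi = frozen_features(X, W, b)

# Trainable "head": least-squares linear readout over the fixed basis.
w_out, *_ = np.linalg.lstsq(Phi, y, rcond=None)
pred = Phi @ w_out
mse = np.mean((pred - y) ** 2)
```

If the fixed basis is rich enough, the linear readout alone fits the target well, which is exactly why the question of *which* basis the frozen transformer provides is the interesting one.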


