Transformers for Applications in Audio, Speech, and Music
This presentation is a seminar on Transformers for Applications in Audio, Speech, and Music by Prateek Verma of Stanford University (CCRMA research). It is a lecture in Stanford's very good Transformers United course, which covers recent (as of fall 2021) developments in transformer architectures.
There are some very good takeaways from this presentation: how to beat the WaveNet architecture; how to use k-means clustering to convert continuous embeddings into a discrete representation, which transformers seem to love processing; how to incorporate ideas from wavelets into the transformer mix; and the advantages of a signal-specific, learned, adaptive front end to the transformer system.
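To make the k-means point concrete, here is a minimal sketch of the general idea, not the pipeline from the talk. It assumes each audio frame has already been turned into a continuous embedding vector; the toy 2-D data, the function names (`kmeans`, `tokenize`), and the codebook size are all hypothetical choices for illustration. It is written in plain Python to stay dependency-free; in practice a library implementation such as scikit-learn's `KMeans` would be the more usual route.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids (the 'codebook')."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_centroid(v, centroids)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[i] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids


def tokenize(vectors, centroids):
    """Replace each continuous vector with the id of its nearest centroid."""
    return [nearest_centroid(v, centroids) for v in vectors]


# Toy stand-in for frame embeddings: two well-separated 2-D blobs.
# Real embeddings would be higher-dimensional and come from a trained model.
data_rng = random.Random(1)
frames = (
    [[data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)] for _ in range(50)]
    + [[data_rng.gauss(5.0, 0.1), data_rng.gauss(5.0, 0.1)] for _ in range(50)]
)

codebook = kmeans(frames, k=2)       # hypothetical codebook size
tokens = tokenize(frames, codebook)  # one integer token per frame
```

The resulting `tokens` list is a sequence of integer ids, which is the same kind of input a transformer receives for text, so the usual embedding table and sequence-modeling machinery can be applied to it.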