HTC Seminar Series #32: How do we Inject Inductive Bias into a Deep Learning Model
We're mixing up the HTC seminar series formula a little bit in this post.
Let's start off with a good podcast with Max Welling you can listen to here. It covers gauge equivariant CNNs, group equivariant CNNs, Bayesian generative modeling with VAEs, compressing a neural net for deployment, and more.
The closing discussion about modeling versus merely interpolating data is fascinating. The laws of physics have only a few parameters, and the stuff we care about lives in the world those parameters generate. So real-world data effectively lives on a low-dimensional manifold. We ideally want models that actually capture this structure, as opposed to just overfitting to a huge amount of data.
Historically, we started by building models for our systems.
Then we moved to just training our systems on data, because that did better than the more limited hand-built models. But these data-driven systems are missing the underlying low-dimensional model of the world, the manifold the data lives on. They just try to interpolate the data.
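To make the "low-dimensional manifold" point concrete, here's a toy sketch (my own illustration, not from the podcast): data generated by a single latent parameter, embedded in a 50-dimensional ambient space, still has tiny intrinsic dimension, which the singular value spectrum reveals.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 points generated by a single latent parameter t,
# embedded linearly (via a random 2 x 50 basis) after a
# nonlinear lift onto a circle
t = rng.uniform(0, 2 * np.pi, 1000)
basis = rng.normal(size=(2, 50))
data = np.column_stack([np.cos(t), np.sin(t)]) @ basis

# The ambient dimension is 50, but the centered data spans only
# a 2-D linear subspace (the circle itself is a 1-D manifold
# sitting inside that plane)
s = np.linalg.svd(data - data.mean(axis=0), compute_uv=False)
effective_rank = int((s > 1e-8).sum())
print(effective_rank)  # 2
```

A model that exploits this structure needs to learn a handful of parameters; a model that only interpolates has to cover the 50-dimensional ambient space with data.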
Here's a link to Welling's rebuttal to Rich Sutton's 'The Bitter Lesson', titled 'Do we still need models or just more data and compute?'.
This is all a great lead-in to the following lecture by Max Welling titled 'Learning Equivariant and Hybrid Message Passing on Graphs', presented at MIT CSAIL in May 2020. The podcast above is from 2019.
I started this post off thinking it was going to be titled 'Group Theory and Deep Learning'. Obviously we strayed a little bit (not entirely). What Max is talking about above relates to this work of his below.
The paper titled 'A General Theory of Equivariant CNNs on Homogeneous Spaces' can be found here.
Here's another paper titled 'Learning the Irreducible Representations of Commutative Lie Groups'.
The core idea: use group theory and symmetry to redefine neural net architectures, so that the symmetries are built in rather than learned from data.
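The property these architectures are built around is equivariance: transforming the input and then applying the layer gives the same result as applying the layer and then transforming the output. A minimal sketch (my own toy example, for the translation group): a 1D convolution with circular padding commutes with shifts.

```python
import numpy as np

def circular_conv(x, w):
    """1D correlation with circular (wrap-around) padding."""
    n = len(x)
    return np.array([sum(w[k] * x[(i + k) % n] for k in range(len(w)))
                     for i in range(n)])

x = np.arange(8, dtype=float)   # a toy signal
w = np.array([1.0, -2.0, 1.0])  # a toy filter

# Equivariance: convolving a shifted signal equals
# shifting the convolved signal
lhs = circular_conv(np.roll(x, 3), w)
rhs = np.roll(circular_conv(x, w), 3)
assert np.allclose(lhs, rhs)
```

Group equivariant CNNs generalize exactly this guarantee from the translation group to larger groups (rotations, reflections), and gauge equivariant CNNs push it further onto curved spaces.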
Extends to the concept of manifolds (pay attention here).
The upshot, in their view: we can now run these nets on geometric manifolds like spheres, 3D objects, etc.
But of course human visual perception lives on manifolds as well. So what can you rethink when you view it that way?