HTC Seminar #27 - Priors for System 2 Knowledge Representation

Today's HTC Seminar is by Yoshua Bengio on 'Priors for System 2 Knowledge Representation', and was presented at the ICML conference in July 2020.

The talk is on SlidesLive and can be seen here.


1.  So what are system 1 and system 2 (not to be confused with the software 1.0 and software 2.0 metaphors)? If you ask the all-knowing entity 'the googler', you get the following (at least you do today):

System 1 is the brain's automatic, intuitive, and unconscious thinking mode. It requires little energy or attention, but it is often bias prone. ... System 2 is a slow, controlled, and analytical method of thinking where reason dominates. Unlike system 1, it requires energy and attention to think through all the choices.

So Yoshua is taking this concept of cognition, described by Daniel Kahneman in his book 'Thinking, Fast and Slow', and applying it to deep learning. System 1 is learning the data manifold; system 2 is more like traditional rule-based AI.

Here's a quick look into a conversation with Daniel Kahneman if you are interested in understanding more about this concept.

2.  So what are priors?

No, it's not referring to your previous criminal record.

If you are a Bayesian, then a prior is a probability distribution over a set of candidate distributions. It expresses how strongly you believe, before seeing any data, that each candidate is the distribution generating the data.
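To make the Bayesian definition a bit more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the three candidate coin biases, the uniform prior over them, the 8 heads and 2 tails, and the function name `posterior`. Each candidate bias stands for one possible data-generating distribution, and the prior is just the belief assigned to each before any flips are seen.

```python
def posterior(prior, heads, tails):
    """Bayes update of a discrete prior over candidate coin biases.

    prior maps each candidate P(heads) to the belief assigned to it.
    """
    # Unnormalized posterior: prior belief times likelihood of the data.
    unnorm = {
        p: belief * (p ** heads) * ((1 - p) ** tails)
        for p, belief in prior.items()
    }
    total = sum(unnorm.values())
    # Normalize so the beliefs sum to 1 again.
    return {p: weight / total for p, weight in unnorm.items()}


# Hypothetical setup: three candidate distributions, equally believed a priori.
prior = {0.25: 1 / 3, 0.5: 1 / 3, 0.75: 1 / 3}

# Hypothetical observations: 8 heads and 2 tails.
post = posterior(prior, heads=8, tails=2)

for p in sorted(post):
    print(f"P(heads)={p}: prior={prior[p]:.3f} posterior={post[p]:.3f}")
```

After seeing mostly heads, the belief shifts toward the candidate with the highest probability of heads. A different prior (say, one that strongly favors a fair coin) would shift less on the same data, which is the sense in which a prior encodes knowledge you bring to the problem.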

A broader way to think about it in the context of deep learning is that it's the 'knowledge' encoded within the structure of your system. So if you are doing data augmentation, you are pre-building priors into the system based on your human knowledge of which data augmentations seem perceptually relevant.
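As a rough illustration of augmentation acting as a prior, here is a small sketch in plain Python. The helper names `hflip` and `augment` and the tiny 2x3 'image' are hypothetical. The only assumption being encoded is that mirroring an image left to right does not change its label, which is exactly the kind of human knowledge you are handing to the model.

```python
def hflip(image):
    """Mirror an image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def augment(dataset):
    """Return the dataset plus a mirrored copy of every example.

    Keeping the label unchanged for the mirrored copy is the prior:
    we are asserting that the label is invariant to horizontal flips.
    """
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((hflip(image), label))
    return augmented


# Hypothetical toy dataset: one 2x3 'image' with a made-up label.
dataset = [([[1, 2, 3], [4, 5, 6]], "cat")]
augmented = augment(dataset)

for image, label in augmented:
    print(label, image)
```

Whether a given augmentation is a good prior depends on the task. Flipping is harmless for most photos of animals, but it would be the wrong assumption for something like reading text or telling left from right.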

If you are using a pre-trained neural network in your system in some way, that is a prior in your system.

What is fascinating is that even if you use a randomly-initialized neural network, it can be considered a hand-crafted prior that can lead to reasonable results on some problems.

Check out this paper for an example of what I mean by this.
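Here is a toy sketch of that idea in plain Python. To be clear, this is not the method from any particular paper, just a simple stand-in: a hidden layer whose weights are drawn at random and then frozen, with only a linear readout trained on top. The XOR data, the layer size, the seed, the learning rate and all function names are assumptions picked for illustration.

```python
import math
import random


def make_random_features(n_inputs, n_hidden, seed=0):
    """Build a fixed, randomly initialized tanh hidden layer (never trained)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    biases = [rng.gauss(0, 1) for _ in range(n_hidden)]

    def features(x):
        return [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]

    return features


def mse(readout, feats, targets):
    """Mean squared error of the linear readout on the random features."""
    return sum(
        (sum(r * f for r, f in zip(readout, phi)) - y) ** 2
        for phi, y in zip(feats, targets)
    ) / len(targets)


def train_readout(feats, targets, lr=0.05, steps=500):
    """Fit only the linear readout with batch gradient descent."""
    readout = [0.0] * len(feats[0])
    n = len(targets)
    for _ in range(steps):
        grad = [0.0] * len(readout)
        for phi, y in zip(feats, targets):
            err = sum(r * f for r, f in zip(readout, phi)) - y
            for j, f in enumerate(phi):
                grad[j] += 2 * err * f / n
        readout = [r - lr * g for r, g in zip(readout, grad)]
    return readout


# Hypothetical toy problem: XOR, which a linear model on raw inputs cannot fit.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 0]

features = make_random_features(n_inputs=2, n_hidden=8)
feats = [features(x) for x in inputs]

initial_loss = mse([0.0] * 8, feats, targets)
readout = train_readout(feats, targets)
final_loss = mse(readout, feats, targets)

print(f"loss before training the readout: {initial_loss:.4f}")
print(f"loss after training the readout:  {final_loss:.4f}")
```

The random layer is never updated, so whatever structure it provides comes purely from the choice of architecture and initialization. How well this works depends heavily on the problem and on the draw of random weights, so treat it as an intuition pump rather than a recipe.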

