Curve Detectors

Let's take a look at a recent Distill article that analyzes curve detectors in the InceptionV1 neural network.

What do we even mean by curve detectors, anyway?  You could of course reread yesterday's HTC blog post.  But in the discussion associated with today's highlighted curve detector article, we are referring to 'curve neurons' in the InceptionV1 feature space.

For example, here are curve neurons in layer 3b.

We are looking at feature visualizations: synthesized input images that maximally excite each of these neurons.
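As a toy sketch of how such a feature visualization is produced: the real method runs regularized gradient ascent on an input image through the full network, but the core idea can be shown with a single hypothetical linear "neuron" (the weights `w` and all hyperparameters here are made up for illustration).

```python
import numpy as np

# Toy "neuron": a fixed linear filter on an 8x8 image patch. Real feature
# visualization does the same gradient ascent on the input, but through a
# full network and with extra regularization.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))           # hypothetical neuron weights
x = rng.normal(size=(8, 8)) * 0.01    # start from near-zero noise

for _ in range(200):
    activation = np.sum(w * x)        # neuron response to current image
    grad = w                          # d(activation)/dx for a linear neuron
    x += 0.1 * grad / (np.linalg.norm(grad) + 1e-8)  # normalized ascent step
    x /= max(np.linalg.norm(x), 1.0)  # keep the "image" bounded

# After optimization, x points along w: the visualization recovers the
# pattern the neuron is tuned to.
```

For a real network the gradient changes at every step and the optimized image needs priors (jitter, frequency penalties) to stay interpretable, but the loop structure is the same.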

Curve detectors are activated by input signals of a certain orientation.

So if you have a bunch of them, you can cover all of the different orientations, and you can get everywhere in between via steering (reread the recent HTC material on steerable oriented filters).
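The "everywhere in between" point can be made concrete with the classic steerable-filter construction: for Gaussian-derivative filters, an oriented filter at any angle is just a cosine/sine mixture of two fixed basis filters, so a filter bank only needs the two bases. A minimal numpy sketch (filter size and sigma are arbitrary choices):

```python
import numpy as np

def gaussian_derivative_filters(size=9, sigma=2.0):
    """Two basis filters: x- and y-derivatives of a Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return -xx * g, -yy * g           # d/dx and d/dy of the Gaussian

gx, gy = gaussian_derivative_filters()

def steered_filter(theta):
    """Oriented filter at angle theta, synthesized from the two bases.
    No new convolution is needed: responses steer the same way, i.e.
    R(theta) = cos(theta) * R_x + sin(theta) * R_y."""
    return np.cos(theta) * gx + np.sin(theta) * gy

# A filter steered to 45 degrees is an equal mixture of the two bases.
f45 = steered_filter(np.pi / 4)
```

Because the steering identity holds for the filter responses too, you can convolve an image with just `gx` and `gy` once and then read off the response at any orientation by the same cosine/sine combination.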

You can find the curve detector article here.

You can find a guided overview of early vision in the InceptionV1 architecture here.


1: The article claims it is surprising that curve detectors are meaningful features. Why? I don't think it's surprising at all.  Curves are a natural characteristic of the information manifold that real-world images and objects live on.

However, maybe we should be less focused on this whole notion of 'feature detectors' and more focused on the specific properties of natural images, and on how the model (a neural net architecture in this case) encodes that set of natural properties.

We can expand on that further by saying that we really should be focused on the properties of the information manifold that humans can actually perceive (for any task associated with analyzing or representing objects or images humans are going to look at).  This paradigm has been successfully utilized in applications like image or video compression as well as stochastic screening algorithms for digital printing.

2: So is the dynamic spline parameterization they show better or worse than using an object segmentation neural net to segment the object, and then using a conventional Bézier spline fitter to build the exact curve from the segmentation boundary?
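For reference, the "conventional Bézier spline fitter" half of that alternative is a small least-squares problem. A minimal sketch, assuming ordered boundary points and a single cubic segment with uniform parameterization (a real fitter would use chord-length parameterization and split the boundary into multiple segments):

```python
import numpy as np

def bernstein_cubic(t):
    """Cubic Bernstein basis evaluated at parameters t, shape (n, 4)."""
    t = np.asarray(t, dtype=float)
    return np.stack([(1 - t)**3,
                     3 * t * (1 - t)**2,
                     3 * t**2 * (1 - t),
                     t**3], axis=-1)

def fit_cubic_bezier(points):
    """Least-squares fit of one cubic Bezier to ordered (x, y) points,
    e.g. a run of pixels from a segmentation boundary."""
    t = np.linspace(0.0, 1.0, len(points))   # uniform parameterization
    ctrl, *_ = np.linalg.lstsq(bernstein_cubic(t), points, rcond=None)
    return ctrl                              # 4 control points, shape (4, 2)

# Sanity check: exact samples from a known curve recover its control points.
true_ctrl = np.array([[0, 0], [1, 2], [3, 2], [4, 0]], dtype=float)
pts = bernstein_cubic(np.linspace(0, 1, 50)) @ true_ctrl
fitted = fit_cubic_bezier(pts)
```

The interesting trade-off versus the learned parameterization is that this pipeline needs an explicit, clean boundary first, while the neuron-based representation comes "for free" from the features.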

3: Is the combing phenomenon they describe real, or an artifact of the feature visualization algorithm?  Or an artifact of the underlying CNN architecture implementation?  I don't know, I'm just asking.  You can see it very clearly in the feature visualization output of the very recent multi-modal neuron paper.

On a re-read through the paper, I also just noticed that it's very prevalent in the 'curve extraction' algorithm they showed off. Hmm...

4: Don't let my cranky observations above make you think I didn't like the paper, because I think it's an awesome read (as are most of the Distill publications).

