Hacker News

Valid-only convolution (in the MATLAB sense) by itself reduces the dimensionality of the input; for images, each plane goes from (h x w) to (h - kh + 1) x (w - kw + 1).
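A quick sketch of that shape arithmetic in NumPy (the function name `conv2d_valid` and the sample sizes are illustrative, not from any particular library):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution of one (h x w) plane with a (kh x kw) kernel."""
    kh, kw = kernel.shape
    # Flip the kernel so this is true convolution (MATLAB conv2 convention).
    flipped = kernel[::-1, ::-1]
    # windows has shape (h - kh + 1, w - kw + 1, kh, kw)
    windows = sliding_window_view(image, (kh, kw))
    return np.einsum('ijkl,kl->ij', windows, flipped)

img = np.random.rand(8, 10)   # h = 8, w = 10
ker = np.random.rand(3, 3)    # kh = kw = 3
out = conv2d_valid(img, ker)
# out.shape is (h - kh + 1, w - kw + 1) = (6, 8)
```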

You can think of a convnet as a series of feature transformations, each consisting of a normalization/whitening stage, a filter bank that projects into a higher dimension (on an overcomplete basis), a non-linear operation in the higher-dimensional space, and then possibly pooling back down to a lower-dimensional space.
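Those stages can be sketched as a toy NumPy pipeline. All function names and sizes here are illustrative assumptions, not any framework's API:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def normalize(x):
    # Crude whitening: zero mean, unit variance.
    return (x - x.mean()) / (x.std() + 1e-8)

def filter_bank(x, kernels):
    # x: (inPlanes, h, w); kernels: (outPlanes, inPlanes, kh, kw).
    # Projects 3 planes into 8: an overcomplete basis.
    # (Cross-correlation form, as most convnet code uses.)
    kh, kw = kernels.shape[2:]
    windows = sliding_window_view(x, (kh, kw), axis=(1, 2))
    return np.einsum('ihwkl,oikl->ohw', windows, kernels)

def relu(x):
    # Pointwise non-linearity in the higher-dimensional space.
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    # x: (planes, h, w) with h, w even; pools back to a lower dimension.
    p, h, w = x.shape
    return x.reshape(p, h // 2, 2, w // 2, 2).max(axis=(2, 4))

x = np.random.rand(3, 9, 9)          # 3 input planes
k = np.random.rand(8, 3, 4, 4)       # 8 output planes
y = max_pool2x2(relu(filter_bank(normalize(x), k)))
# shapes: (3, 9, 9) -> (8, 6, 6) -> (8, 6, 6) -> (8, 3, 3)
```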

The “filter bank” (aka convolution) and non-linearity produce a non-linear embedding of the input in a higher dimension; in convnets, the “filter bank” itself is learned. Classes or features are easier to separate in the higher-dimensional space. There are still-developing ideas on putting all of this on firmer mathematical ground (connections to wavelet theory and the like), but for the most part, it just works "really well".

For an image network, at each layer there are (input planes x output planes) convolution kernels of size (kh x kw).

Each output plane `j` is a sum over all input planes `i` individually convolved using the filter (i, j); the reduction dimension is the input plane.




see https://github.com/facebook/fbcunn/blob/master/test/Referenc...

for a loop nest that shows what the forward pass of a 2-d image convnet convolution module does. That's gussied up with convolution stride and padding and a bunch of C++11 mumbo jumbo, but you should be able to see what it is doing.
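Stripped of stride, padding, and the C++11 machinery, that loop nest comes down to something like the following sketch (names like `conv_forward` are my own; this is the plain "each output plane sums over all input planes" structure described above):

```python
import numpy as np

def conv_forward(inp, filt):
    # inp:  (nInputPlanes, h, w)
    # filt: (nInputPlanes, nOutputPlanes, kh, kw)
    n_in, h, w = inp.shape
    _, n_out, kh, kw = filt.shape
    out = np.zeros((n_out, h - kh + 1, w - kw + 1))
    for j in range(n_out):                    # each output plane j
        for i in range(n_in):                 # reduce over input planes i
            for y in range(h - kh + 1):       # valid output rows
                for x in range(w - kw + 1):   # valid output cols
                    # cross-correlation form, as most convnet code uses
                    out[j, y, x] += (inp[i, y:y+kh, x:x+kw] * filt[i, j]).sum()
    return out
```

Each `out[j]` is the sum over `i` of input plane `i` filtered by kernel `(i, j)`, exactly the reduction over the input-plane dimension.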



