Friday, March 23, 2018

Convolutional neural network course Coursera - Week 1

Edge detection with a convolutional filter.
The image is n x n and the filter is f x f (f is usually odd).

Valid convolution: the original n x n image is not padded. With an f x f filter you get an (n - f + 1) x (n - f + 1) output image, which shows where the edges are.
Same convolution: the original image is padded with p pixels on each side so that the output has the same size as the input, and border pixels contribute to the output more equally. With stride 1 this means n + 2p - f + 1 = n, so p = (f - 1)/2.
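To make the two cases concrete, here is a minimal NumPy sketch. The function name `conv2d_valid`, the 6x6 test image and the vertical edge filter are illustrative assumptions, not taken from the course code. As is usual in deep learning, the "convolution" here does not flip the filter.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide kernel over image with no padding and stride 1."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the window and the filter, then sum
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# Hypothetical 6x6 image: bright left half, dark right half
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
# Hypothetical 3x3 vertical edge filter
vertical_edge = np.array([[1, 0, -1]] * 3, dtype=float)

valid = conv2d_valid(image, vertical_edge)             # no padding: 4x4
same = conv2d_valid(np.pad(image, 1), vertical_edge)   # p = (3 - 1) / 2 = 1: 6x6

print(valid.shape, same.shape)
print(valid[0])  # non-zero entries mark the vertical edge
```

The valid output is non-zero only in the middle columns, where the window straddles the bright/dark boundary.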

Strided convolution: the filter moves s positions at a time instead of one. The output size along each dimension is floor((n + 2p - f)/s) + 1.
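The output size formula is easy to check with a small helper. This is only a sketch; the names `conv_output_size` and `same_padding` are made up for illustration.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def same_padding(f):
    """Padding that keeps the size unchanged for stride 1 and odd f."""
    return (f - 1) // 2

print(conv_output_size(6, 3))                     # valid convolution
print(conv_output_size(6, 3, p=same_padding(3)))  # same convolution
print(conv_output_size(7, 3, s=2))                # strided convolution
```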

Convolutions over 3D volumes:

For example, an RGB image.
If the image is 6x6x3 and the filter is 3x3x3, you get a 4x4 output. The first 9 numbers of the filter detect edges in the red channel, the next 9 in the green channel, and the last 9 in the blue channel.

Multiple Filters
What if you want to use multiple filters at the same time? For example, to detect vertical and horizontal edges together, or edges at various angles?
In the above example, if you apply two 3x3x3 filters, the output is 4x4x2.
In general the output is (n - f + 1) x (n - f + 1) x (number of filters).
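A rough NumPy sketch of convolving a volume with several filters follows. The function name `conv_volume`, the random inputs and the (number of filters, f, f, channels) filter layout are assumptions chosen for illustration.

```python
import numpy as np

def conv_volume(image, filters):
    """Valid convolution of an (n, n, c) image with (k, f, f, c) filters."""
    n_h, n_w, n_c = image.shape
    num_filters, f_h, f_w, f_c = filters.shape
    assert f_c == n_c, "filter depth must match the number of image channels"
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1, num_filters))
    for k in range(num_filters):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # Each filter sums over all channels, giving one number per position
                out[i, j, k] = np.sum(image[i:i + f_h, j:j + f_w, :] * filters[k])
    return out

rng = np.random.default_rng(0)
image = rng.random((6, 6, 3))       # 6x6 RGB image
filters = rng.random((2, 3, 3, 3))  # two 3x3x3 filters
output = conv_volume(image, filters)
print(output.shape)  # (4, 4, 2)
```

Each filter produces one 4x4 slice, and the slices are stacked along the last axis.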

How to tune parameters
One thing to take away: as you go deeper in a neural network, you typically start off with larger images, e.g. 39 by 39. The height and width stay the same for a while and then gradually trend down; in the example they go from 39 to 37 to 17 to 7. The number of channels generally increases; in the example it goes from 3 to 10 to 20 to 40. You see this general trend in many other convolutional neural networks as well.
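The sizes quoted above can be reproduced with the output size formula. The per-layer filter sizes, strides and padding below are assumptions picked so that the numbers match (3x3 with stride 1, then two 5x5 layers with stride 2, all without padding); the notes themselves only give the resulting sizes and channel counts.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# (filter size, stride, padding, number of filters); assumed values
layers = [(3, 1, 0, 10), (5, 2, 0, 20), (5, 2, 0, 40)]

size, channels = 39, 3
shapes = [(size, size, channels)]
for f, s, p, n_filters in layers:
    size = conv_output_size(size, f, p, s)
    channels = n_filters  # one output channel per filter
    shapes.append((size, size, channels))

for shape in shapes:
    print(shape)
```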

Similar to the convolutional layer, there is a pooling layer:
For example, max pooling: if a feature is detected anywhere in a window, preserve it.
It has hyperparameters but no parameters (nothing for gradient descent to learn).
Hyperparameters -> f, s (filter size, stride)
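Max pooling can be sketched in a few lines of NumPy. The function name `max_pool2d` and the 4x4 input are made up for illustration; f = 2 and s = 2 are just common example values.

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Keep the maximum of each f x f window, moving s positions at a time."""
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)
pooled = max_pool2d(x)
print(pooled)  # [[9. 2.] [6. 3.]]
```

There is nothing to learn here: f and s are fixed up front, and the layer only selects values.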

Similarly, average pooling: each window is replaced by its average instead of its maximum.
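A matching sketch for average pooling, again with a hypothetical function name (`avg_pool2d`) and a made-up 4x4 input:

```python
import numpy as np

def avg_pool2d(x, f=2, s=2):
    """Replace each f x f window by its mean, moving s positions at a time."""
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].mean()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)
averaged = avg_pool2d(x)
print(averaged)  # [[3.75 1.25] [3.75 2.  ]]
```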
