First convolutional layer:
- apply convolution with a 5×5 patch and 32 features -> 24x24x32
- apply 2×2 max-pooling -> 12x12x32
Second convolutional layer:
- apply convolution with a 5×5 patch and 64 features -> 8x8x64
- apply 2×2 max-pooling -> 4x4x64
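(My arithmetic here assumes unpadded convolutions: an n×n input convolved with a k×k filter at stride 1 gives an (n - k + 1)×(n - k + 1) output, and 2×2 max-pooling halves each side, so 28 - 5 + 1 = 24, 24/2 = 12, 12 - 5 + 1 = 8, and 8/2 = 4.)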
But the tutorial says "Now that the image size has been reduced to 7×7", while my calculation gives 4×4. Did I miss some concept? I am new to CNNs, so this may be a beginner question. Thanks.
Asked By : LKS
Answered By : Wandering Logic
The tutorial answers this itself: "How do we handle the boundaries? What is our stride size? In this example, we're always going to choose the vanilla version. Our convolutions use a stride of one and are zero-padded so that the output is the same size as the input."
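(For a 5×5 filter at stride 1, "same size as the input" means zero-padding (5 - 1)/2 = 2 pixels on each border, which is where the 32x32 and 18x18 sizes below come from.)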
So the steps are:
- zero-padding the 28x28x1 image to 32x32x1
- applying a 5x5x1x32 convolution to get 28x28x32
- max-pooling down to 14x14x32
- zero-padding the 14x14x32 tensor to 18x18x32
- applying a 5x5x32x64 convolution to get 14x14x64
- max-pooling down to 7x7x64
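Assuming the tutorial in question is TensorFlow's MNIST example (the quoted text matches it), here is a minimal sketch that just checks these shapes; the zero-filled tensors stand in for real weights and are purely illustrative:

```python
import tensorflow as tf  # TensorFlow 2.x

x = tf.zeros([1, 28, 28, 1])  # one 28x28 grayscale image, NHWC layout

# First layer: 5x5 kernel, 1 input channel, 32 output channels; 'SAME' zero-pads.
w1 = tf.zeros([5, 5, 1, 32])
h1 = tf.nn.conv2d(x, w1, strides=1, padding='SAME')            # (1, 28, 28, 32)
p1 = tf.nn.max_pool2d(h1, ksize=2, strides=2, padding='SAME')  # (1, 14, 14, 32)

# Second layer: 5x5 kernel, 32 input channels, 64 output channels.
w2 = tf.zeros([5, 5, 32, 64])
h2 = tf.nn.conv2d(p1, w2, strides=1, padding='SAME')           # (1, 14, 14, 64)
p2 = tf.nn.max_pool2d(h2, ksize=2, strides=2, padding='SAME')  # (1, 7, 7, 64)

print(p2.shape)  # (1, 7, 7, 64)
```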
They probably have an option to turn the zero-padding off. In other frameworks I've used, zero-padding is not the default. (In several of the frameworks I've used, zero-padding isn't even possible.)
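In TensorFlow that option is the padding argument of tf.nn.conv2d: 'SAME' zero-pads as in the walkthrough above, while 'VALID' applies no padding and reproduces the 24x24 from the question. A quick comparison, again with placeholder zero tensors:

```python
import tensorflow as tf

x = tf.zeros([1, 28, 28, 1])
w = tf.zeros([5, 5, 1, 32])

same  = tf.nn.conv2d(x, w, strides=1, padding='SAME')   # zero-padded: (1, 28, 28, 32)
valid = tf.nn.conv2d(x, w, strides=1, padding='VALID')  # no padding:  (1, 24, 24, 32)
```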
Best Answer from Stack Exchange
Question Source : http://cs.stackexchange.com/questions/49658