Blog

Why can a 5×5 Convolution kernel be replaced with two 3×3 kernels?

July 26, 2021 by Author

Table of Contents

1 Why can a 5×5 Convolution kernel be replaced with two 3×3 kernels?
2 Does Convolution reduce size?
3 What is flatten layer?
4 How to improve the speed of 5×5 convolution?
5 What happened to the 5×5 convolutional layer in GoogLeNet?

Why can a 5×5 Convolution kernel be replaced with two 3×3 kernels?

For example, at one layer there’s a 5×5 kernel takes a face to activate. When replacing it by two 3×3 kernels, the lower layer 3×3 kernel only covers part of the face and only activates at one spatial pattern, i.e. eye, ear, mouth.

Does Convolution reduce size?

So each Convolution results in reduction in the size. Same Convolution uses padding such that the size of the matrix is preserved.

What is 3×3 convolution layer?

Blog. In image processing, a kernel, convolution matrix, or mask is a small matrix. It is used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between a kernel and an image.

Why do we use flatten in CNN?

Rectangular or cubic shapes can’t be direct inputs. And this is why we need flattening and fully-connected layers. Flattening is converting the data into a 1-dimensional array for inputting it to the next layer. We flatten the output of the convolutional layers to create a single long feature vector.

What is flatten layer?

Flattening is merging all visible layers into the background layer to reduce file size. The image on the left shows the Layers panel (with three layers) and file size before flattening.

How to improve the speed of 5×5 convolution?

Factorize 5×5 convolution to two 3×3 convolution operations to improve computational speed. Although this may seem counterintuitive, a 5×5 convolution is 2.78 times more expensive than a 3×3 convolution. So stacking two 3×3 convolutions infact leads to a boost in performance.

How do you calculate the size of a 5×5 convolution layer?

The convolution kernel size needed for a depthwise convolutional layer is n_ depthwise = c * (k² * 1 ²). Uncoupling those 2 reduces the number of weights needed: n_separable = c * (k² * 1 ²) + 1 ² * c². Considering a 5×5 convolutional layer, k² is smaller than c > 128. That let us with a ratio of approximately the kernel surface: 9 or 25.

What happened to the 5×5 convolutional layer in GoogLeNet?

In the later versions, the 5×5 convolutional layer of the first version of GoogleNet has been replaced by 2 stacked 3×3 convolutional layers, copying VGG16. What happened here? The number of parameters grows quadratically with kernel size.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.