
[Deep Learning] Convolutional Neural Network (CNN)

paka_corn 2023. 11. 7. 02:00

What is a Convolution? 

: a standard operation, long used in compression, signal processing, computer vision, and image processing 

 

Convolution = Filtering = Feature Extraction 
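To make this concrete, here is a minimal sketch (PyTorch is assumed; the step signal and the hand-picked difference kernel are only illustrative): convolving the signal with a small filter is literally a filtering step that extracts a feature, in this case "where does the signal change?".

```python
# Minimal sketch: convolution as filtering (PyTorch assumed, toy data).
import torch
import torch.nn.functional as F

signal = torch.tensor([[[0., 0., 0., 1., 1., 1.]]])  # (batch=1, channels=1, length=6)
kernel = torch.tensor([[[-1., 1.]]])                  # (out_ch=1, in_ch=1, kernel_size=2)

out = F.conv1d(signal, kernel)
print(out)  # tensor([[[0., 0., 1., 0., 0.]]]) -> reacts only where the signal changes
```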

 

 

 

Main difference with the MLP 

1) Local Connection

: Local connections can capture local patterns better than fully-connected models 

-> the network searches for local patterns everywhere by sliding the same kernel over the input

-> it therefore has the chance to react to patterns that appear in different positions

 

2) Weight Sharing 

: unlike MLPs, which employ different weights for different neurons, a CNN reuses the same kernel weights at every position
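A small sketch of both points at once (PyTorch assumed; the input length of 100 and the kernel size of 3 are illustrative): because the same 3-weight kernel is shared across every position, the convolutional layer needs only a handful of parameters, while a fully-connected layer on the same input needs thousands.

```python
# Minimal sketch: local connection + weight sharing vs. a fully-connected layer.
import torch.nn as nn

fc   = nn.Linear(100, 100)             # 100*100 weights + 100 biases = 10100 parameters
conv = nn.Conv1d(1, 1, kernel_size=3)  # 3 shared weights + 1 bias   = 4 parameters

print(sum(p.numel() for p in fc.parameters()))    # 10100
print(sum(p.numel() for p in conv.parameters()))  # 4
```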

 

 

 

Kernel Size

: the weights of the filter w are learned from data

- the length of the convolving filter is the kernel size k 
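A quick sketch (PyTorch assumed; k = 5 chosen arbitrarily) showing that the kernel size fixes how many weights the filter has, and that those weights are registered as trainable parameters:

```python
# Minimal sketch: the kernel size k determines the number of learned weights.
import torch.nn as nn

conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5)
print(conv.weight.shape)          # torch.Size([1, 1, 5]) -> k = 5 weights
print(conv.weight.requires_grad)  # True -> learned from data
```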

 

 

Input Channels

- In general, the convolution takes a multi-channel input, filters it with a set of filters, and returns a multi-channel output

- The multi-channel input is processed by a filter of dimensionality kernel_size x input channels

- The convolutional framework is flexible enough to manage multi-channel inputs as well.

-> filter size for a multi-channel input = kernel_size * input channels 
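A minimal sketch (PyTorch assumed; 3 input channels and k = 5 are illustrative) confirming that one filter spans all input channels, i.e. holds kernel_size * input_channels weights:

```python
# Minimal sketch: a single filter over a multi-channel input.
import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=3, out_channels=1, kernel_size=5)
print(conv.weight.shape)   # torch.Size([1, 3, 5]) -> 5 * 3 = 15 weights in the filter

x = torch.randn(1, 3, 50)  # (batch, in_channels=3, length=50)
print(conv(x).shape)       # torch.Size([1, 1, 46]) -> one output channel
```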

 

 

Output Channels 

: the number of different feature maps generated through the convolution operation.

 

- Each output channel is responsible for detecting and storing various features.

- In CNN, we want each filter to react to different local patterns 

- All these outputs produced by the convolution are gathered in a single matrix with dimensionality output_channels x output_length

- To capture multiple patterns, we have to process the input with many different filters

 

-> Multiple output channels can be used by the model to learn more complex patterns.
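A minimal sketch (PyTorch assumed; 8 output channels chosen arbitrarily): each output channel comes from its own filter and stores its own feature map, and all the maps are stacked along the channel dimension.

```python
# Minimal sketch: out_channels = 8 means 8 filters -> 8 feature maps.
import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=3, out_channels=8, kernel_size=5)
x = torch.randn(1, 3, 50)

print(conv.weight.shape)  # torch.Size([8, 3, 5]) -> 8 filters, each 3 x 5
print(conv(x).shape)      # torch.Size([1, 8, 46]) -> 8 feature maps of length 46
```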

 

 

 

Stride 

: quantifies the amount of movement (step size) by which we slide a filter over the input

- hyperparameter of the convolution layer 

- if the stride factor is bigger than 1, the effect is to compress the input 
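A minimal sketch (PyTorch assumed) of the compression effect: with stride = 2 the filter skips every other position, so the output is roughly half as long.

```python
# Minimal sketch: stride > 1 compresses the input.
import torch
import torch.nn as nn

x = torch.randn(1, 1, 100)

print(nn.Conv1d(1, 1, kernel_size=3, stride=1)(x).shape)  # torch.Size([1, 1, 98])
print(nn.Conv1d(1, 1, kernel_size=3, stride=2)(x).shape)  # torch.Size([1, 1, 49])
```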

 

 

 

Dilated Convolution 

- a technique that expands the filter by inserting holes between its consecutive elements

- this is done to cover a larger area of the input without adding more weights
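A minimal sketch (PyTorch assumed): with dilation = 2 the kernel keeps its 3 weights but spans 5 input positions, so it sees a wider area at no extra parameter cost.

```python
# Minimal sketch: dilation inserts holes between kernel elements.
import torch
import torch.nn as nn

x = torch.randn(1, 1, 100)

plain   = nn.Conv1d(1, 1, kernel_size=3, dilation=1)
dilated = nn.Conv1d(1, 1, kernel_size=3, dilation=2)

print(plain.weight.shape, plain(x).shape)      # torch.Size([1, 1, 3]) torch.Size([1, 1, 98])
print(dilated.weight.shape, dilated(x).shape)  # torch.Size([1, 1, 3]) torch.Size([1, 1, 96])
```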

 

 

Stacking Convolutional Layers

- we can stack multiple convolutional layers to form a deep convolutional network.

- optionally, normalization is applied right after the convolution (layernorm or batch norm) 

- then, a non-linearity is applied (ReLU or LeakyReLU)

=> after non-linearity, the set of features = feature maps 

 

- After stacking multiple convolutional layers, we apply a flatten operation that stacks all the output channels into a single big vector

- Finally, we apply a linear transformation (fully-connected layer) on top of this vector

=> Feature Extraction 
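A minimal sketch of such a stack (PyTorch assumed; the channel counts, the input length of 100, and the 10 output classes are illustrative choices, not from the original post):

```python
# Minimal sketch: conv -> batch norm -> ReLU blocks, then flatten + linear layer.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=3),   # convolution (feature extraction)
    nn.BatchNorm1d(16),                # optional normalization after the convolution
    nn.ReLU(),                         # non-linearity -> feature maps
    nn.Conv1d(16, 32, kernel_size=3),
    nn.BatchNorm1d(32),
    nn.ReLU(),
    nn.Flatten(),                      # stack all output channels into one big vector
    nn.Linear(32 * 96, 10),            # final linear transformation (e.g. a classifier)
)

x = torch.randn(8, 1, 100)             # (batch, channels, length)
print(model(x).shape)                  # torch.Size([8, 10])
```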

 

 

 

Receptive Field 

: the region of the input space that affects a particular unit of the network

 

 

- Shows which specific area of the input image a particular neuron is looking at.

- To put it simply, the receptive field of a neuron represents the area in the input that the neuron uses for its computations.

- Smaller receptive fields enable neurons to detect finer and more detailed features.

 

=> Therefore, in CNNs, the receptive field is a crucial concept that explains how the model perceives and understands various parts of an image. 

 

 

* The receptive field depends on different factors

1) Kernel Size
2) Number of Layers
3) Stride Factor
4) Dilation Factor
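A small sketch of the usual receptive-field recursion over exactly these four factors (the helper function and the three-layer configuration are illustrative, not from the original post): each layer grows the receptive field by (kernel_size - 1) * dilation times the product of the strides of all earlier layers.

```python
# Minimal sketch: receptive field of a stack of 1D convolutions.
def receptive_field(layers):
    """layers: list of (kernel_size, stride, dilation) tuples, first layer first."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump   # growth contributed by this layer
        jump *= s                  # step between adjacent units, in input positions
    return rf

# kernel 3 / stride 1, kernel 3 / stride 2, kernel 3 / dilation 2
print(receptive_field([(3, 1, 1), (3, 2, 1), (3, 1, 2)]))  # 13
```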

 

 

 

Parameters 

- the number of parameters in a 1D convolutional layer

=> kernel size x input channels x output channels 

 

- the number of parameters in a 2D convolutional layer

=> kernel size (x) x kernel size (y) x input channels x output channels 
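A quick check of both formulas with PyTorch's own parameter count (bias is disabled here because the formulas above do not include the extra bias term per output channel):

```python
# Minimal sketch: verifying the parameter-count formulas.
import torch.nn as nn

conv1d = nn.Conv1d(in_channels=3, out_channels=8, kernel_size=5, bias=False)
conv2d = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=(5, 5), bias=False)

print(sum(p.numel() for p in conv1d.parameters()))  # 5 * 3 * 8     = 120
print(sum(p.numel() for p in conv2d.parameters()))  # 5 * 5 * 3 * 8 = 600
```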

 

 

 

Pooling 

: helps to make feature maps approximately invariant to small translations of the input 

 

- pooling is often applied after the non-linearity (activation function) 

- the size of the sliding window and its stride factor are hyperparameters of the pooling operation 

 

(ex) Max Pooling, Avg Pooling
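A minimal sketch (PyTorch assumed; window size 2 and stride 2 chosen arbitrarily): both pooling variants halve the length of every feature map while keeping the number of channels.

```python
# Minimal sketch: max pooling and average pooling over 1D feature maps.
import torch
import torch.nn as nn

x = torch.randn(1, 8, 100)  # 8 feature maps of length 100

print(nn.MaxPool1d(kernel_size=2, stride=2)(x).shape)  # torch.Size([1, 8, 50])
print(nn.AvgPool1d(kernel_size=2, stride=2)(x).shape)  # torch.Size([1, 8, 50])
```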

 

 
