

[Week2] DL Basic - CNN [Day3]

*Convolution

  • Continuous convolution


  • Discrete convolution


  • 2D image convolution
  • 2D convolution in action
  • RGB Image Convolution
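
As a quick sanity check on the discrete 2D convolution above, here is a minimal NumPy sketch (NumPy assumed; the kernel flip is omitted, so strictly this computes the cross-correlation that deep learning "convolution" layers actually use):

    import numpy as np

    def conv2d(image, kernel):
        """Valid-mode 2D convolution (no kernel flip, no padding)."""
        H, W = image.shape
        kH, kW = kernel.shape
        out = np.zeros((H - kH + 1, W - kW + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # multiply the kernel element-wise with the patch it covers and sum
                out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
        return out

    image = np.arange(25, dtype=float).reshape(5, 5)   # 5x5 input
    kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging filter
    print(conv2d(image, kernel).shape)                 # (3, 3): 5 - 3 + 1 = 3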

 

 

 

*Convolutional Neural Networks

  • Convolution and pooling layers : feature extraction
  • Fully connected layer : decision making (e.g., classification) 
    • The trend is to minimize the fully connected part (parameter dependency: with many parameters, training becomes harder and generalization performance drops)
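
A minimal sketch of this split, written in PyTorch (framework and layer sizes are my own choices for illustration): convolution and pooling layers extract features, and a single small fully connected layer makes the decision.

    import torch
    import torch.nn as nn

    class SimpleCNN(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            # feature extraction: convolution + pooling layers
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),   # 32x32 -> 16x16
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),   # 16x16 -> 8x8
            )
            # decision making: keep the fully connected part as small as possible
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x):
            x = self.features(x)
            return self.classifier(x.flatten(1))

    model = SimpleCNN()
    print(sum(p.numel() for p in model.parameters()))   # most parameters still sit in the FC layer
    print(model(torch.randn(1, 3, 32, 32)).shape)        # torch.Size([1, 10])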

 

 

 

*Stride

 

 

 

 

 

*Padding
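
The stride and padding figures are not reproduced here, but their effect on the output size follows the usual formula output = floor((input + 2*padding - kernel) / stride) + 1; a small plain-Python check with illustrative numbers:

    def conv_output_size(input_size, kernel_size, stride=1, padding=0):
        # standard formula: floor((n + 2p - k) / s) + 1
        return (input_size + 2 * padding - kernel_size) // stride + 1

    print(conv_output_size(7, 3, stride=1, padding=0))  # 5: a 3x3 kernel shrinks the input
    print(conv_output_size(7, 3, stride=1, padding=1))  # 7: padding=1 keeps the size
    print(conv_output_size(7, 3, stride=2, padding=1))  # 4: stride=2 roughly halves it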

 

 

 

*Convolution Arithmetic

  • Padding (1),  Stride (1),  3 x 3 Kernel
  • What is the number of parameters of this model?
    • The answer is 3 x 3 x 128 x 64 = 73,728
  • Exercise
    number of parameters : 2048 * 2 x 1000 ≈ 4M
    • To reduce the number of parameters, the dense layers must be shrunk
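
These counts can be verified directly; a small PyTorch sketch (the 128 -> 64 channel sizes come from the example above, the dense-layer sizes from the exercise, biases ignored):

    import torch.nn as nn

    # 3x3 convolution, 128 input channels -> 64 output channels
    conv = nn.Conv2d(128, 64, kernel_size=3, stride=1, padding=1, bias=False)
    print(sum(p.numel() for p in conv.parameters()))   # 73728 = 3*3*128*64

    # dense layer from the exercise: 2048*2 inputs -> 1000 outputs
    fc = nn.Linear(2048 * 2, 1000, bias=False)
    print(sum(p.numel() for p in fc.parameters()))     # 4096000, roughly 4M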

 

 

 

 

*Why 1x1 Convolution?

  • Dimension reduction
  • To reduce the number of parameters while increasing the depth
  • e.g., bottleneck architecture
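
A small PyTorch sketch of the idea (channel counts chosen for illustration): a 1x1 convolution only changes the channel dimension, so it can squeeze the depth before an expensive 3x3 convolution while keeping the overall input/output shape the same.

    import torch
    import torch.nn as nn

    x = torch.randn(1, 256, 28, 28)   # 256-channel feature map

    # direct 3x3 convolution on 256 channels
    direct = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)

    # bottleneck: 1x1 reduces channels, 3x3 works on the reduced depth, 1x1 restores them
    bottleneck = nn.Sequential(
        nn.Conv2d(256, 64, kernel_size=1, bias=False),   # dimension reduction
        nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
        nn.Conv2d(64, 256, kernel_size=1, bias=False),
    )

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(direct))        # 589824
    print(count(bottleneck))    # 69632: far fewer parameters, deeper network
    print(bottleneck(x).shape)  # torch.Size([1, 256, 28, 28]): same shape as the input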

 

 

 

 

 

*Modern CNN

 

*AlexNet

  • Key ideas
    • ReLU activation
      1. Preserves properties of linear models
      2. Easy to optimize with gradient descent
      3. Good generalization
      4. Overcome the vanishing gradient problem
    • GPU implementation (2 GPUs)
    • Local response normalization, Overlapping pooling
    • Data augmentation
    • Dropout
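
The vanishing-gradient point can be seen numerically: the sigmoid's gradient shrinks toward zero as the input grows, while ReLU's gradient stays 1 for any positive input. A tiny PyTorch check (values chosen for illustration):

    import torch

    x = torch.tensor([0.5, 3.0, 8.0], requires_grad=True)

    # sigmoid: gradient sigma(x) * (1 - sigma(x)) vanishes for large inputs
    torch.sigmoid(x).sum().backward()
    print(x.grad)   # tensor([0.2350, 0.0452, 0.0003])

    x.grad = None
    # ReLU: gradient is exactly 1 for every positive input
    torch.relu(x).sum().backward()
    print(x.grad)   # tensor([1., 1., 1.])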

 

 

 

 

*VGGNet

  • Increasing depth with 3 x 3 convolution filters (with stride 1)
    • Receptive field : the size of the input region a unit sees, which grows as the kernel gets larger
    • Why 3 x 3 convolution? Two stacked 3 x 3 layers cover the same 5 x 5 receptive field with fewer parameters and an extra nonlinearity (see the sketch after this list)


  • 1 x 1 convolution for fully connected layers
  • Dropout (p=0.5)
  • VGG16, VGG19
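
The parameter argument behind 3 x 3 filters, checked in PyTorch (channel count chosen for illustration): two stacked 3 x 3 convolutions see the same 5 x 5 receptive field as a single 5 x 5 convolution, with fewer parameters.

    import torch.nn as nn

    C = 128   # number of channels, illustrative

    conv5 = nn.Conv2d(C, C, kernel_size=5, padding=2, bias=False)
    conv3x2 = nn.Sequential(
        nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False),
        nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False),
    )

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(conv5))    # 409600 = 5*5*128*128
    print(count(conv3x2))  # 294912 = 2*3*3*128*128 -> fewer parameters, more nonlinearity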

 

 

 

*GoogLeNet

  • network-in-network (NiN) with inception blocks
  • Inception blocks

    • Mixing in 1 x 1 convolutions appropriately reduces the number of parameters through dimension reduction
      1 x 1 convolution enables roughly a 30% reduction in the number of parameters
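
A minimal inception-style block in PyTorch (channel sizes are illustrative, not the exact GoogLeNet numbers): 1x1 convolutions reduce the channel dimension before the expensive 3x3 and 5x5 branches, and the branch outputs are concatenated.

    import torch
    import torch.nn as nn

    class InceptionBlock(nn.Module):
        def __init__(self, in_ch):
            super().__init__()
            self.b1 = nn.Conv2d(in_ch, 64, kernel_size=1)                        # 1x1 branch
            self.b2 = nn.Sequential(nn.Conv2d(in_ch, 32, kernel_size=1),         # reduce, then 3x3
                                    nn.Conv2d(32, 64, kernel_size=3, padding=1))
            self.b3 = nn.Sequential(nn.Conv2d(in_ch, 16, kernel_size=1),         # reduce, then 5x5
                                    nn.Conv2d(16, 32, kernel_size=5, padding=2))
            self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),        # pool, then 1x1
                                    nn.Conv2d(in_ch, 32, kernel_size=1))

        def forward(self, x):
            # every branch keeps the spatial size; outputs are concatenated along channels
            return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

    block = InceptionBlock(192)
    print(block(torch.randn(1, 192, 28, 28)).shape)   # torch.Size([1, 192, 28, 28])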


 

 

*ResNet

  • Deeper neural networks are hard to train
    • It is not overfitting, but training does not improve just because the layers get deeper
  • Add an identity map (skip connection)

  • Bottleneck architecture
    • dimension reduction
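
A minimal residual block sketch in PyTorch (channel sizes illustrative): the input is added back onto the block's output through the identity skip connection.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualBlock(nn.Module):
        """Basic residual block: output = F(x) + x."""
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            out = F.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return F.relu(out + x)   # skip connection: add the identity map back

    block = ResidualBlock(64)
    print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 64, 32, 32])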

 

*DenseNet

  • DenseNet uses concatenation instead of addition


    • Dense Block
      • concatenates the feature maps of all preceding layers
      • The number of channels increases geometrically
    • Transition Block
      • BatchNorm -> 1x1 Conv -> 2x2 AvgPooling
      • Dimension reduction
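
A minimal dense block plus transition block sketch in PyTorch (growth rate and channel counts are illustrative): each layer's output is concatenated with all preceding feature maps, and the transition block squeezes the channels back down.

    import torch
    import torch.nn as nn

    class DenseBlock(nn.Module):
        """Each layer sees the concatenation of all preceding feature maps."""
        def __init__(self, in_ch, growth=32, num_layers=3):
            super().__init__()
            self.layers = nn.ModuleList([
                nn.Conv2d(in_ch + i * growth, growth, kernel_size=3, padding=1)
                for i in range(num_layers)
            ])

        def forward(self, x):
            for layer in self.layers:
                x = torch.cat([x, layer(x)], dim=1)   # concatenate, don't add
            return x

    def transition(in_ch, out_ch):
        # BatchNorm -> 1x1 Conv (dimension reduction) -> 2x2 average pooling
        return nn.Sequential(nn.BatchNorm2d(in_ch),
                             nn.Conv2d(in_ch, out_ch, kernel_size=1),
                             nn.AvgPool2d(2))

    x = torch.randn(1, 64, 32, 32)
    out = DenseBlock(64)(x)                  # 64 -> 64 + 3*32 = 160 channels
    print(out.shape)                         # torch.Size([1, 160, 32, 32])
    print(transition(160, 80)(out).shape)    # torch.Size([1, 80, 16, 16])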

 

 

 

 

*Computer Vision Applications

 

*Semantic Segmentation

 

 

  • Fully Convolutional Network
    • Convolutionalization

      • Left : 4 x 4 x 16 x 10 = 2,560
      • Right : 4 x 4 x 16 x 10 = 2,560
    • Transforming fully connected layers into convolution layers enables a classification net to output a heat map
    • Deconvolution (conv transpose)
      • After convolutionalization the number of parameters stays the same, but the spatial dimensions shrink
      • Therefore, deconvolution is applied to grow the spatial dimensions back
    • Result
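
A small PyTorch sketch of convolutionalization and deconvolution (sizes taken from the 4 x 4 x 16 x 10 example above): the dense layer and the 4x4 convolution have the same 2,560 parameters, the convolution's output loses spatial size, and a transposed convolution grows it back.

    import torch
    import torch.nn as nn

    feat = torch.randn(1, 16, 4, 4)   # a 4x4 feature map with 16 channels

    # left: fully connected layer over the flattened feature map
    fc = nn.Linear(4 * 4 * 16, 10, bias=False)
    # right: the same computation as a 4x4 convolution (convolutionalization)
    conv = nn.Conv2d(16, 10, kernel_size=4, bias=False)

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(fc), count(conv))    # 2560 2560: identical parameter counts
    print(conv(feat).shape)          # torch.Size([1, 10, 1, 1]): spatial dimensions shrink

    # deconvolution (transposed convolution) enlarges the spatial dimensions again
    deconv = nn.ConvTranspose2d(10, 10, kernel_size=4, stride=4)
    print(deconv(conv(feat)).shape)  # torch.Size([1, 10, 4, 4])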

 

 

*Detection

  • R-CNN

 

  • SPPNet
    • R-CNN has to pass every bbox in the image through the CNN separately
    • SPPNet runs the CNN only once


  • Fast R-CNN
    • Proposes bboxes via selective search
    • Runs the CNN once (same as SPPNet)
    • For each region, get a fixed length feature from ROI pooling (see the sketch below)
    • Two outputs : class and bounding-box regressor 
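
A small sketch of the ROI pooling step (torchvision assumed; coordinates and sizes are illustrative): regions of different sizes are pooled into the same fixed-size feature.

    import torch
    from torchvision.ops import roi_pool

    feature_map = torch.randn(1, 256, 50, 50)   # the CNN runs once over the whole image

    # two proposed regions as (batch_index, x1, y1, x2, y2) in feature-map coordinates
    rois = torch.tensor([[0., 5., 5., 20., 20.],
                         [0., 10., 15., 45., 40.]])

    # ROI pooling turns each differently sized region into a fixed 7x7 feature
    pooled = roi_pool(feature_map, rois, output_size=(7, 7))
    print(pooled.shape)   # torch.Size([2, 256, 7, 7])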


  • Faster R-CNN
    • Replaces the selective search step with a Region Proposal Network
    • Region Proposal Network



    • 9 : Three different region sizes (128, 256, 512) with three different ratios (1:1, 1:2, 2:1)
    • 4 : four bounding box regression parameters
    • 2 : box classification (whether to use it or not)
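
Putting those numbers together, at every spatial location the RPN predicts 9 anchors x (4 box-regression parameters + 2 objectness scores) = 54 values. A simplified single-head sketch in PyTorch (backbone channel count and feature-map size are illustrative):

    import torch
    import torch.nn as nn

    num_anchors = 9   # 3 sizes (128, 256, 512) x 3 ratios (1:1, 1:2, 2:1)

    rpn_head = nn.Sequential(
        nn.Conv2d(512, 512, kernel_size=3, padding=1), nn.ReLU(),
        # per location: 9 anchors * (4 box regression params + 2 objectness scores)
        nn.Conv2d(512, num_anchors * (4 + 2), kernel_size=1),
    )

    feat = torch.randn(1, 512, 38, 50)   # backbone feature map
    print(rpn_head(feat).shape)          # torch.Size([1, 54, 38, 50])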

 

 

  • YOLO
    • No explicit bounding box sampling (compared with Faster R-CNN) -> speed up
    • Given an image, YOLO divides it into an SxS grid
    • Each cell predicts B bounding boxes (B=5)
      • box refinement (x / y / w / h)
      • confidence (of objectness)
    • Each cell predicts C class probabilities
    • In total, it becomes a tensor with SxSx(B*5+C) size
      • SxS : Number of cells of the grid
      • B*5 : B bounding boxes with offsets(x,y,w,h) and confidence
      • C : Number of classes
    • Result
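
The output tensor size can be checked with concrete numbers; S=7 and C=20 are assumptions here (typical YOLO v1 / Pascal VOC values), while B=5 is taken from the notes above.

    S, B, C = 7, 5, 20   # S and C assumed; B = 5 from the notes

    # each cell: B boxes * (x, y, w, h, confidence) + C class probabilities
    per_cell = B * 5 + C
    print(per_cell)            # 45
    print((S, S, per_cell))    # (7, 7, 45): the full output tensor shape
    print(S * S * per_cell)    # 2205 values in total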