*Convolution
- Continuous convolution
- Discrete convolution
- 2D image convolution
- 2D convolution in action
- RGB Image Convolution
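As a concrete illustration of discrete 2D convolution, here is a minimal NumPy sketch (not from the original notes) that slides a 3 x 3 kernel over a toy image; like most deep learning libraries, it actually computes cross-correlation (no kernel flip).

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation (what DL frameworks call 'convolution')."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5 x 5 "image"
kernel = np.ones((3, 3)) / 9.0                     # 3 x 3 averaging (blur) filter
print(conv2d(image, kernel).shape)                 # (3, 3): no padding, stride 1
```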
*Convolutional Neural Networks
- Convolution and pooling layers : feature extraction
- Fully connected layer : decision making (e.g., classification)
- The trend is to minimize the fully connected part (parameter dependency: with too many parameters, training becomes difficult and generalization performance drops)
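A minimal sketch of this two-part structure, assuming PyTorch and toy layer sizes of my own choosing: convolution and pooling layers extract features, and a fully connected layer makes the final classification decision.

```python
import torch
import torch.nn as nn

# Feature extraction: convolution + pooling layers
features = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
# Decision making: a single fully connected (dense) layer for 10 classes
classifier = nn.Linear(32 * 8 * 8, 10)

x = torch.randn(1, 3, 32, 32)        # e.g. a CIFAR-sized input
h = features(x)                      # -> (1, 32, 8, 8)
logits = classifier(h.flatten(1))    # -> (1, 10)
print(logits.shape)
```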
*Stride
*Padding
*Convolution Arithmetic
- Padding (1), Stride (1), 3 x 3 Kernel
- What is the number of parameters of this model?
- The answer is 3 x 3 x 128 x 64 = 73,728 (see the check below this list)
- Exercise
- To reduce the number of parameters, the dense (fully connected) layers have to be cut down
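The 73,728 figure can be verified directly; a small PyTorch check (my own sketch) is below. A convolution layer's parameter count depends only on the kernel size and the input/output channel counts, not on stride, padding, or the spatial size of the input.

```python
import torch.nn as nn

# 3 x 3 kernel, 128 input channels, 64 output channels, no bias term
conv = nn.Conv2d(in_channels=128, out_channels=64, kernel_size=3,
                 stride=1, padding=1, bias=False)
n_params = sum(p.numel() for p in conv.parameters())
print(n_params)  # 3 * 3 * 128 * 64 = 73,728
```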
*Why 1x1 Convolution?
- Dimension reduction
- To reduce the number of parameters while increasing the depth
- e.g., bottleneck architecture
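A rough sketch of the bottleneck idea (PyTorch; the 256/64 channel sizes are my own choice, not from the notes): a 1 x 1 convolution first shrinks the channel dimension, so the 3 x 3 convolution in the middle is applied to far fewer channels.

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Direct 3 x 3 convolution on 256 channels
direct = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)

# Bottleneck: 1 x 1 reduce -> 3 x 3 -> 1 x 1 expand
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1, bias=False),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(64, 256, kernel_size=1, bias=False),
)

print(n_params(direct))      # 3*3*256*256 = 589,824
print(n_params(bottleneck))  # 16,384 + 36,864 + 16,384 = 69,632
```

The depth increases (three layers instead of one) while the parameter count drops by roughly a factor of eight.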
*Modern CNN
*AlexNet
- Key ideas
- ReLU activation
- Preserves properties of linear models
- Easy to optimize with gradient descent
- Good generalization
- Overcome the vanishing gradient problem
- GPU implementation (2 GPUs)
- Local response normalization, Overlapping pooling
- Data augmentation
- Dropout
*VGGNet
- Increasing depth with 3 x 3 convolution filters (with stride 1)
- Receptive field : the size of the input region that one output value depends on; it grows as the kernel gets larger
- Why 3 x 3 convolution? Two stacked 3 x 3 layers cover the same 5 x 5 receptive field as a single 5 x 5 layer, but with fewer parameters (see the comparison after this list)
- 1 x 1 convolution for fully connected layers
- Dropout (p=0.5)
- VGG16, VGG19
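On the 3 x 3 question, a quick PyTorch comparison (the channel count of 128 is chosen only for illustration): two stacked 3 x 3 convolutions, which see the same 5 x 5 receptive field as one 5 x 5 convolution, use noticeably fewer parameters.

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

c = 128  # channel count chosen for illustration
one_5x5 = nn.Conv2d(c, c, kernel_size=5, padding=2, bias=False)
two_3x3 = nn.Sequential(
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
)
print(n_params(one_5x5))  # 5*5*128*128 = 409,600
print(n_params(two_3x3))  # 2 * 3*3*128*128 = 294,912
```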
*GoogLeNet
- network-in-network (NiN) with inception blocks
- Inception blocks
- Mixing in 1 x 1 convolutions reduces the channel dimension (dimension reduction), which cuts the number of parameters
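A rough sketch of an inception-style block (PyTorch; the branch widths here are illustrative, not GoogLeNet's exact configuration): 1 x 1 convolutions placed before the 3 x 3 and 5 x 5 branches reduce the channel depth first, and the branch outputs are concatenated along the channel dimension.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 64, kernel_size=1)   # 1x1 branch
        self.branch3 = nn.Sequential(                         # 1x1 reduce -> 3x3
            nn.Conv2d(in_ch, 32, kernel_size=1),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(                         # 1x1 reduce -> 5x5
            nn.Conv2d(in_ch, 16, kernel_size=1),
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
        )
        self.pool = nn.Sequential(                            # pool -> 1x1 projection
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, 32, kernel_size=1),
        )

    def forward(self, x):
        # Concatenate the four branches along the channel dimension
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.pool(x)], dim=1)

x = torch.randn(1, 128, 28, 28)
print(InceptionBlock(128)(x).shape)  # torch.Size([1, 192, 28, 28])
```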
*ResNet
- Deeper neural networks are hard to train
- This is not an overfitting problem; even as the layers get deeper, training performance does not improve
- Add an identity map (skip connection)
- Bottleneck architecture
- dimension reduction
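A minimal residual block sketch (PyTorch; this is the basic two-layer variant rather than the bottleneck version): the skip connection adds the identity x to the block output, so the convolutions only need to learn the residual f(x).

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic block: out = relu(x + f(x)), where f is two 3x3 conv layers."""
    def __init__(self, ch):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(ch),
            nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.f(x) + x)   # identity map (skip connection)

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # same shape as the input
```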
*DenseNet
- DenseNet uses concatenation instead of addition
- Dense Block
- concatenates the feature maps of all preceding layers
- The number of channels increases geometrically
- Transition Block
- BatchNorm -> 1x1 Conv -> 2x2 AvgPooling
- Dimension reduction
- Dense blocks and transition blocks alternate to build the full network
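A sketch of a dense block followed by a transition block (PyTorch; the growth rate, layer count, and channel sizes are my own choices): each layer's output is concatenated with all previous feature maps, so the channel count keeps growing, and the transition block's 1 x 1 convolution plus average pooling brings both the channel count and the spatial size back down.

```python
import torch
import torch.nn as nn

growth_rate, n_layers, in_ch = 12, 4, 24

# Dense block: every layer sees the concatenation of all previous feature maps
layers = nn.ModuleList([
    nn.Conv2d(in_ch + i * growth_rate, growth_rate, kernel_size=3, padding=1)
    for i in range(n_layers)
])

x = torch.randn(1, in_ch, 16, 16)
for layer in layers:
    x = torch.cat([x, layer(x)], dim=1)     # concatenation, not addition
print(x.shape)                              # channels: 24 + 4*12 = 72

# Transition block: BatchNorm -> 1x1 Conv -> 2x2 AvgPooling (dimension reduction)
transition = nn.Sequential(
    nn.BatchNorm2d(72), nn.Conv2d(72, 36, kernel_size=1), nn.AvgPool2d(2)
)
print(transition(x).shape)                  # (1, 36, 8, 8)
```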
*Computer Vision Applications
*Semantic Segmentation
- Fully Convolutional Network (FCN)
- Convolutionalization
- Left (fully connected version) : 4 x 4 x 16 x 10 = 2,560 parameters
- Right (convolutionalized version) : 4 x 4 x 16 x 10 = 2,560 parameters, i.e., the parameter count is unchanged
- Transforming fully connected layers into convolution layers enables a classification net to output a heat map (see the sketch at the end of this section)
- Deconvolution (conv transpose)
- Convolutionalization keeps the number of parameters the same, but the spatial dimension of the output shrinks
- Therefore deconvolution (transposed convolution) is applied to grow the spatial dimension back
- Result
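A sketch of convolutionalization using the 4 x 4 x 16 → 10 sizes from the Left/Right example above (PyTorch): the dense layer and the equivalent 4 x 4 convolution have exactly the same 2,560 parameters, but the convolutional form also accepts larger inputs and then produces a spatial heat map of class scores.

```python
import torch
import torch.nn as nn

# "Left": flatten a 4 x 4 x 16 feature map into a dense layer with 10 outputs
dense = nn.Linear(4 * 4 * 16, 10, bias=False)
# "Right": the same operation written as a 4 x 4 convolution, 16 -> 10 channels
conv = nn.Conv2d(16, 10, kernel_size=4, bias=False)

print(sum(p.numel() for p in dense.parameters()))  # 2,560
print(sum(p.numel() for p in conv.parameters()))   # 2,560

# The convolutional form also runs on a larger input and outputs a heat map
big = torch.randn(1, 16, 10, 10)
print(conv(big).shape)   # (1, 10, 7, 7): one 10-class score per spatial location
```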
*Detection
- R-CNN
- SPPNet
- R-CNN has to run the CNN separately on every candidate bounding box in the image
- SPPNet runs the CNN only once over the whole image
- Fast R-CNN
- Bounding-box proposals via selective search
- Runs the CNN only once (same idea as SPPNet)
- For each region, get a fixed length feature from ROI pooling
- Two outputs : class and bounding-box regressor
- Faster R-CNN
- Replaces the selective-search stage with a Region Proposal Network
- Region Proposal Network
- 9 : Three different region sizes (128, 256, 512) with three different ratios (1:1, 1:2, 2:1)
- 4 : four bounding box regression parameters
- 2 : box classification (whether the box contains an object or not)
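A rough sketch of the RPN head's output shape (PyTorch; the backbone channel count and feature-map size are illustrative, and the classification and regression outputs are merged into one convolution here, whereas the actual Faster R-CNN uses two sibling 1 x 1 heads): for each spatial location it predicts, per anchor, 4 box-regression values and 2 objectness scores, giving 9 x (4 + 2) = 54 channels.

```python
import torch
import torch.nn as nn

n_anchors = 9              # 3 sizes (128, 256, 512) x 3 ratios (1:1, 1:2, 2:1)
out_per_anchor = 4 + 2     # 4 box-regression params + 2 objectness scores

rpn_head = nn.Sequential(
    nn.Conv2d(512, 512, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(512, n_anchors * out_per_anchor, kernel_size=1),
)

feat = torch.randn(1, 512, 38, 50)   # a backbone feature map (size illustrative)
print(rpn_head(feat).shape)          # (1, 54, 38, 50): 54 values per location
```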
- YOLO
- No explicit bounding box sampling (compared with Faster R-CNN) -> speed up
- Given an image, YOLO divides it into an S x S grid
- Each cell predicts B bounding boxes (B=5)
- box refinement (x / y / w / h)
- confidence (of objectness)
- Each cell predicts C class probabilities
- In total, it becomes a tensor with SxSx(B*5+C) size
- SxS : Number of cells of the grid
- B*5 : B bounding boxes with offsets(x,y,w,h) and confidence
- C : Number of classes
- Result
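A quick arithmetic check of the output tensor size; S = 7 and C = 20 are assumed values for illustration (they match the original YOLO paper), while B = 5 follows these notes.

```python
S, B, C = 7, 5, 20            # grid size and class count assumed; B = 5 as in the notes
out = (S, S, B * 5 + C)       # each box contributes (x, y, w, h, confidence) = 5 values
print(out)                    # (7, 7, 45)
```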