MobileNetV2: Inverted Residuals and Linear Bottlenecks

*Techniques for Small Deep Neural Networks

-> Needs revision

Depthwise Separable Convolution = Depthwise Convolution + Pointwise Convolution (1x1 convolution)
- Depthwise Convolution : convolves each channel of the input separately, using one w x h x 1 kernel (depth 1) per channel
- Pointwise Convolution : convolves with a 1 x 1 x d kernel (width 1, height 1) to combine the channels (identical to a 1x1 conv)
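The two steps above can be sketched in plain Python on nested lists. This is only an illustrative sketch, assuming stride 1, no padding, and no bias; the function names and the toy input are hypothetical and not taken from any library.

```python
def depthwise_conv(x, kernels):
    """Depthwise step: x is [d][h][w], kernels is [d][k][k].
    Each channel is convolved with its own k x k kernel (stride 1, no padding)."""
    d, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    out_h, out_w = h - k + 1, w - k + 1
    return [
        [
            [
                sum(x[c][i + a][j + b] * kernels[c][a][b]
                    for a in range(k) for b in range(k))
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        for c in range(d)
    ]


def pointwise_conv(x, weights):
    """Pointwise step: x is [d][h][w], weights is [d_out][d].
    A 1x1 convolution that mixes channels at every spatial position."""
    d, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [sum(weights[o][c] * x[c][i][j] for c in range(d)) for j in range(w)]
            for i in range(h)
        ]
        for o in range(len(weights))
    ]


def depthwise_separable_conv(x, kernels, weights):
    """Depthwise convolution followed by pointwise convolution."""
    return pointwise_conv(depthwise_conv(x, kernels), weights)


# Toy input (hypothetical): 2 channels of size 3x3, filled with 1s and 2s.
x = [[[1.0] * 3 for _ in range(3)], [[2.0] * 3 for _ in range(3)]]
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]  # one 3x3 kernel per channel
weights = [[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]]              # 2 -> 3 output channels
y = depthwise_separable_conv(x, kernels, weights)
print(y)
```

The depthwise step never mixes channels, and the pointwise step never looks at neighbouring pixels, which is why the two together are cheaper than one full convolution.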
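Computation (h x w feature map, k x k kernel, d' input channels, d'' output channels)
- Standard Convolution : h x w x d' x d'' x k x k
- Depthwise Separable Convolution : h x w x d' x k x k + h x w x d' x d'' = h x w x d' x (k^2 + d'')
- Reduction in Computations : (h x w x d' x (k^2 + d'')) / (h x w x d' x d'' x k^2) = 1/d'' + 1/k^2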
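As a rough numeric check of the computation comparison, the sketch below counts multiply-adds for both layer types. It assumes the usual cost model: h x w x d' x d'' x k x k for a standard convolution, and h x w x d' x (k^2 + d'') for a depthwise separable one. The layer sizes are hypothetical and chosen only for illustration.

```python
def standard_conv_cost(h, w, d_in, d_out, k):
    """Multiply-adds of a standard k x k convolution on an h x w x d_in input."""
    return h * w * d_in * d_out * k * k


def depthwise_separable_cost(h, w, d_in, d_out, k):
    """Multiply-adds of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = h * w * d_in * k * k
    pointwise = h * w * d_in * d_out
    return depthwise + pointwise


# Hypothetical layer: 56 x 56 feature map, 64 -> 128 channels, 3 x 3 kernel.
h, w, d_in, d_out, k = 56, 56, 64, 128, 3
std = standard_conv_cost(h, w, d_in, d_out, k)
sep = depthwise_separable_cost(h, w, d_in, d_out, k)
ratio = sep / std
expected_ratio = 1 / d_out + 1 / k ** 2  # closed-form reduction factor

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

With a 3 x 3 kernel the ratio is dominated by the 1/k^2 term, so the separable layer needs roughly 8 to 9 times fewer multiply-adds.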
Width Multiplier (α): Thinner Models
- A constant multiplied by the number of input and output channels of each layer (1, 0.75, 0.5, 0.25)
Resolution Multiplier (ρ): Reduced Representation
- A constant multiplied by the height and width of the input image
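The effect of the two multipliers on cost can be sketched by scaling the channel counts by α and the spatial size by ρ in the depthwise separable cost h x w x d' x (k^2 + d''). The function name, the rounding to integers, and the layer sizes below are assumptions made for illustration.

```python
def scaled_separable_cost(h, w, d_in, d_out, k, alpha=1.0, rho=1.0):
    """Multiply-adds of a depthwise separable layer after applying
    the width multiplier alpha (channels) and resolution multiplier rho (h, w)."""
    h_s, w_s = round(rho * h), round(rho * w)
    d_in_s, d_out_s = round(alpha * d_in), round(alpha * d_out)
    return h_s * w_s * d_in_s * (k * k + d_out_s)


# Hypothetical layer: 56 x 56 feature map, 64 -> 128 channels, 3 x 3 kernel.
base = scaled_separable_cost(56, 56, 64, 128, 3)

for alpha in (1.0, 0.75, 0.5, 0.25):
    cost = scaled_separable_cost(56, 56, 64, 128, 3, alpha=alpha)
    print(f"alpha={alpha}: {cost / base:.3f} of the baseline cost")

half_res = scaled_separable_cost(56, 56, 64, 128, 3, rho=0.5)
print(f"rho=0.5: {half_res / base:.3f} of the baseline cost")
```

The cost falls roughly with α^2 (the pointwise term dominates) and exactly with ρ^2 when the scaled sizes are whole numbers.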
MobileNetV1: Architecture

++ TODO: add the table of results for the width multiplier and the table of results for the resolution multiplier
It would be good to add the formulas as well
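Information loss caused by ReLU
- ReLU layer with few channels -> information is lost
- ReLU layer with many channels -> information is preserved
- This is the reason for combining ReLU with linear bottlenecks in the narrow -> wide -> narrow structure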
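The channel-count effect can be illustrated with a small experiment. This is a simplified proxy, not the experiment from the paper: 2D points are embedded into n channels by a random linear map T, ReLU is applied, and a point is counted as recoverable when at least two channels stay positive, since two surviving linear equations (with rows that are almost surely independent) are enough to solve for a 2D input. All names and parameters here are hypothetical.

```python
import math
import random


def fraction_recoverable(n_channels, n_points=2000, seed=0):
    """Fraction of 2D unit-circle points whose ReLU(T x) output still
    has at least two active (positive) channels."""
    rng = random.Random(seed)
    # Random linear embedding T: 2 dimensions -> n_channels dimensions.
    T = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_channels)]
    recoverable = 0
    for _ in range(n_points):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x0, x1 = math.cos(angle), math.sin(angle)
        # A channel survives ReLU only if its pre-activation is positive.
        active = sum(1 for a, b in T if a * x0 + b * x1 > 0.0)
        if active >= 2:
            recoverable += 1
    return recoverable / n_points


for n in (2, 3, 5, 15, 30):
    print(n, fraction_recoverable(n))
```

With only two channels, a large share of the inputs lose at least one coordinate to ReLU and cannot be reconstructed; with many channels, almost every input keeps enough active channels.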

*Linear Bottleneck

Two assumptions
1. If the manifold of interest remains non-zero volume after ReLU transformation, it corresponds to a linear transformation.
2. ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space.
Based on these assumptions, using linear bottlenecks is expected to pass information through the network effectively.
Every layer uses Batch Normalization, with ReLU6 as the activation function.
The exception is the projection layer, which does not use ReLU6: when mapping to a lower dimension, a non-linear activation function causes information loss.

*Inverted Residuals

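Standard residual block : wide -> narrow -> wide
Inverted residual block : narrow -> wide -> narrow
In a standard residual block, the skip connection is there to reduce the information lost while the feature map is compressed to a lower dimension.
The inverted residual block assumes that the narrow layers already hold the necessary information in compressed form, so the skip connection between them can carry that information, and it is also more memory-efficient.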
Bottleneck Residual Blocks
- 1x1 "Expansion" Layer expands the number of channels (narrow -> wide)
- 3x3 Depthwise Convolution (wide -> wide)
- 1x1 "Projection" Layer (wide -> narrow)
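The three layers of the block can be written down as a small specification. The sketch below only tracks channel counts and activations rather than running real convolutions. It assumes an expansion factor t (hypothetical parameter name) and assumes the skip connection is used only when the stride is 1 and the input and output channel counts match, a condition these notes do not spell out.

```python
def relu6(x):
    """ReLU6 activation: clamp the input to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


def bottleneck_block_layers(d_in, d_out, t, stride=1):
    """Describe one bottleneck residual block as (name, in_ch, out_ch, post-op)
    tuples, plus whether a skip connection is added around the block."""
    d_mid = t * d_in  # channel count after expansion
    layers = [
        ("1x1 expansion conv", d_in, d_mid, "BN + ReLU6"),
        ("3x3 depthwise conv", d_mid, d_mid, "BN + ReLU6"),
        ("1x1 projection conv", d_mid, d_out, "BN only (linear, no ReLU6)"),
    ]
    # Assumption: the residual addition needs matching shapes at both ends.
    use_skip = stride == 1 and d_in == d_out
    return layers, use_skip


layers, use_skip = bottleneck_block_layers(d_in=24, d_out=24, t=6)
for name, c_in, c_out, post in layers:
    print(f"{name}: {c_in} -> {c_out}, {post}")
print("skip connection:", use_skip)
```

Note that the skip connection joins the two narrow ends of the block, which is what makes the residual "inverted".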
Operations (h x w feature map, d' input channels, d'' output channels, expansion factor t, kernel size k)
- h x w x d' x t x d' + h x w x t x d' x k x k + h x w x t x d' x d'' = h x w x d' x t x (d' + k^2 + d'')
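The operation count can be checked term by term. The sketch below assumes the cost model of one multiply-add per kernel weight per output position: the 1x1 expansion costs h x w x d' x (t x d'), the k x k depthwise convolution costs h x w x (t x d') x k^2, and the 1x1 projection costs h x w x (t x d') x d''. The block sizes are hypothetical.

```python
def bottleneck_cost(h, w, d_in, d_out, t, k=3):
    """Multiply-adds of the three layers of a bottleneck residual block."""
    d_mid = t * d_in
    expansion = h * w * d_in * d_mid    # 1x1 conv: d_in -> t * d_in
    depthwise = h * w * d_mid * k * k   # k x k depthwise conv on t * d_in channels
    projection = h * w * d_mid * d_out  # 1x1 conv: t * d_in -> d_out
    return expansion, depthwise, projection


# Hypothetical block: 28 x 28 feature map, 32 -> 32 channels, t = 6, k = 3.
h, w, d_in, d_out, t, k = 28, 28, 32, 32, 6, 3
expansion, depthwise, projection = bottleneck_cost(h, w, d_in, d_out, t, k)
total = expansion + depthwise + projection
factored = h * w * d_in * t * (d_in + k * k + d_out)

print(expansion, depthwise, projection, total, total == factored)
```

The depthwise term carries only the k^2 factor, so the two 1x1 convolutions account for most of the cost.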