Types and Characteristics of Activation Functions

exponential 2021. 1. 9. 19:08

1. Sigmoid

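Advantages	Disadvantages
Differentiable everywhere. Output lies between 0 and 1.	1. The gradient approaches 0 as the input moves far from 0: saturated neurons 'kill' the gradients. 2. Sigmoid outputs are not zero-centered, which slows down training. 3. exp() is relatively expensive to compute.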
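The saturation problem in the table can be checked numerically. The following is a minimal sketch in plain Python; the function names `sigmoid` and `sigmoid_grad` are chosen here for illustration and are not from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow in exp() for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x < 0, so exp(x) is in (0, 1)
    return z / (1.0 + z)


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s(x) * (1 - s(x)). Its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


if __name__ == "__main__":
    # The gradient shrinks quickly as x moves away from 0 (saturation).
    for x in (0.0, 2.0, 5.0, 10.0):
        print(f"x={x:5.1f}  sigmoid={sigmoid(x):.6f}  grad={sigmoid_grad(x):.6f}")
```

Since the gradient never exceeds 0.25, stacking many sigmoid layers multiplies several small factors together during backpropagation, which is one way to see why saturated units slow learning.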

2. tanh

Advantages	Disadvantages
Output can be negative: it ranges from -1 to 1 and is zero-centered.	Still kills gradients when saturated.
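Both properties in the table can be seen with a few lines of plain Python. This is only a sketch; `tanh_grad` and `mean_tanh` are illustrative helper names, not library functions.

```python
import math


def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2. Equals 1 at x = 0 and decays toward 0."""
    t = math.tanh(x)
    return 1.0 - t * t


def mean_tanh(xs):
    """Average tanh output over a list of inputs."""
    return sum(math.tanh(x) for x in xs) / len(xs)


if __name__ == "__main__":
    symmetric_inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    # Zero-centered: symmetric inputs give an average output of about 0.
    print("mean tanh output:", mean_tanh(symmetric_inputs))
    # Saturation: the gradient is already tiny at |x| = 5.
    for x in (0.0, 2.0, 5.0):
        print(f"x={x:4.1f}  tanh={math.tanh(x):.6f}  grad={tanh_grad(x):.6f}")
```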

3. ReLU (the best default option)

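Advantages	Disadvantages
Fast to compute. Trivial derivative. Does not saturate in the positive region. Converges much faster than sigmoid or tanh in practice. More biologically plausible than sigmoid.	The gradient is 0 for negative inputs, so a neuron can 'die' and stop updating, and the function is not differentiable at exactly 0. Output is not zero-centered.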
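A minimal sketch of ReLU and its gradient in plain Python makes the trade-off concrete. The value used for the gradient at exactly 0 is a convention assumed here (0), since the function has a kink at that point.

```python
def relu(x: float) -> float:
    """ReLU: passes positive inputs through unchanged, clamps the rest to 0."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Gradient of ReLU: 1 for x > 0, 0 for x < 0.

    At x == 0 the derivative is undefined; returning 0 there is an
    assumed convention for this sketch.
    """
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 100.0):
        print(f"x={x:6.1f}  relu={relu(x):6.1f}  grad={relu_grad(x):.1f}")
```

The gradient stays at 1 no matter how large the positive input is (no saturation), but it is exactly 0 for every negative input, which is where the 'dying' behavior comes from.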

4. ELU: a slight modification of the ReLU formula that keeps a nonzero gradient for inputs at or below 0

Advantages	Disadvantages
All the benefits of ReLU. Mean output is closer to zero. The negative saturation regime, compared with Leaky ReLU, adds some robustness to noise.	Requires computing exp().
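The following sketch assumes the usual ELU definition, f(x) = x for x > 0 and alpha * (exp(x) - 1) otherwise, with alpha = 1.0 as an assumed default. The names `elu` and `elu_grad` are illustrative.

```python
import math


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    # expm1(x) computes exp(x) - 1 accurately for x close to 0.
    return x if x > 0 else alpha * math.expm1(x)


def elu_grad(x: float, alpha: float = 1.0) -> float:
    """Gradient of ELU: 1 for x > 0, alpha * exp(x) otherwise (never exactly 0
    for moderate negative x, unlike ReLU)."""
    return 1.0 if x > 0 else alpha * math.exp(x)


if __name__ == "__main__":
    for x in (-100.0, -2.0, -0.5, 0.0, 3.0):
        print(f"x={x:7.1f}  elu={elu(x):9.6f}  grad={elu_grad(x):.6f}")
```

For very negative inputs the output saturates at -alpha, which is the negative saturation regime mentioned in the table.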

5. Leaky ReLU

Advantages	Disadvantages
Does not saturate. Computationally efficient. Converges much faster than sigmoid/tanh. Will not 'die'.	-
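A minimal sketch of Leaky ReLU follows. The parameter name `negative_slope` and its default of 0.01 are assumptions for illustration; the post does not specify a value.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: identity for x > 0, a small linear slope otherwise."""
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x: float, negative_slope: float = 0.01) -> float:
    """Gradient of Leaky ReLU: 1 for x > 0, negative_slope otherwise.

    Because the gradient is never 0, a unit cannot get permanently stuck
    the way a plain ReLU unit can.
    """
    return 1.0 if x > 0 else negative_slope


if __name__ == "__main__":
    for x in (-10.0, -1.0, 0.5, 10.0):
        print(f"x={x:6.1f}  leaky_relu={leaky_relu(x):7.3f}  grad={leaky_relu_grad(x):.2f}")
```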

6. Maxout 'Neuron'

Advantages	Disadvantages
1. Does not have the basic form of a dot product followed by a nonlinearity. 2. Generalizes ReLU and Leaky ReLU. 3. Linear regime: does not saturate and does not 'die'.	Doubles the number of parameters per neuron.
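To illustrate the "generalizes ReLU and Leaky ReLU" point, here is a sketch of a single maxout unit that takes the maximum over k affine functions of the input. The function name `maxout` and the argument layout (a list of weight vectors plus a list of biases) are assumptions made for this example.

```python
def maxout(x, weights, biases):
    """Single maxout unit: max over k of (w_k . x + b_k).

    x:       list of input floats
    weights: list of k weight vectors, each the same length as x
    biases:  list of k floats
    """
    return max(
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    )


if __name__ == "__main__":
    # With pieces (1 * x) and (0 * x), maxout reduces to ReLU on a scalar input.
    relu_weights, zero_biases = [[1.0], [0.0]], [0.0, 0.0]
    # With pieces (1 * x) and (0.01 * x), it reduces to Leaky ReLU.
    leaky_weights = [[1.0], [0.01]]
    for x in (-2.0, 3.0):
        print(x, maxout([x], relu_weights, zero_biases), maxout([x], leaky_weights, zero_biases))
```

Each unit here stores two weight vectors instead of one, which is where the doubled parameter count in the table comes from.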
