A simpler GELU activation function approximation
The Endeavour 2025-03-06
Summary:
The GELU (Gaussian Error Linear Units) activation function was proposed in [1]. It is defined as GELU(x) = x Φ(x), where Φ is the cumulative distribution function (CDF) of a standard normal random variable. As you might guess, the motivation for the function involves probability; see [1] for details. The GELU function is not too far from the more familiar ReLU, […]
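As a rough illustration of the definition x Φ(x), here is a minimal Python sketch that evaluates the exact GELU using the standard library's error function and prints it next to ReLU. The function names `std_normal_cdf`, `gelu`, and `relu` are chosen here for illustration and do not come from the post; the sketch covers only the exact definition, not the approximation the post goes on to discuss.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of a standard normal random variable, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x). (Illustrative name, not from the post.)"""
    return x * std_normal_cdf(x)


def relu(x: float) -> float:
    """ReLU: max(0, x), shown for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    # Print GELU and ReLU side by side at a few sample points.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  GELU={gelu(x):+.6f}  ReLU={relu(x):+.6f}")
```

The two functions agree at 0 and nearly agree for large |x|, since Φ(x) approaches 1 for large positive x and 0 for large negative x; they differ most near the origin, where GELU is smooth and dips slightly below zero.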
The post A simpler GELU activation function approximation first appeared on John D. Cook.