How Leaky ReLU Helps Neural Networks Learn Better

When training deep neural networks, choosing the right activation function can make a huge difference. One of the most common choices, ReLU (Rectified Linear Unit), is popular for its simplicity and speed. But it's not without flaws. Enter Leaky ReLU, a small tweak that solves a big problem.

The Problem with Vanilla ReLU

ReLU is defined as:

f(x) = max(0, x)

Which means:

If x > 0, the output is x
If x ≤ 0, the output is 0

While this works great most of the time, it comes with a major drawback: the dead neurons problem.

Dead Neurons Problem

If a neuron gets stuck in the negative range, it always outputs 0. Because the function is flat there, the gradient is also 0, so no learning signal reaches the neuron: it's "dead." Over time, a portion of your network can become inactive, especially in deep architectures.

How Leaky ReLU Fixes This

Leaky ReLU modifies the function like so:

f(x) = x    if x > 0
f(x) = αx   if x ≤ 0

where α is a small constant (commonly around 0.01). This tiny slope for negative values (instead of zero) helps keep the neuron alive and gradients flowing...
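To make the difference concrete, here is a minimal NumPy sketch of both functions (the function names and the example inputs are illustrative, and α = 0.01 is just a typical default):

```python
import numpy as np

def relu(x):
    """Vanilla ReLU: all negative inputs collapse to 0 (zero gradient there)."""
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs keep a small slope alpha instead of dying."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))        # negatives become 0
print(leaky_relu(x))  # negatives become alpha * x, e.g. -2.0 -> -0.02
```

Note that for any negative input, leaky_relu still returns a nonzero value with a nonzero slope, which is exactly what lets the gradient keep flowing during backpropagation.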