Softmax function

Softmax converts real numbers into a probability distribution

Image: Unknown author, CC BY 4.0, via Wikimedia Commons

The softmax function takes a vector of real numbers, applies the exponential function to each element, and then normalizes the results by dividing each one by the sum of all the exponentials: softmax(z)_i = exp(z_i) / (exp(z_1) + ... + exp(z_n)). Because every exponential is positive, the outputs are strictly positive and sum to one, so they form a valid probability distribution. Larger inputs receive larger probabilities, and the ordering of the inputs is preserved. In neural networks, softmax is typically applied to the final layer of a classifier to convert the raw output scores (logits) into a probability for each class.
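The definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name `softmax` and its list-based interface are choices made for this example, and it assumes a non-empty input. Subtracting the maximum score before exponentiating is a common numerical-stability trick that is not part of the description above; it leaves the result unchanged because the shift cancels in the ratio.

```python
import math

def softmax(scores):
    """Return softmax probabilities for a non-empty list of real-valued scores."""
    # Shifting by the maximum does not change the result, but it keeps
    # exp() from overflowing when the scores are large.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    # Dividing by the total makes the outputs sum to one.
    return [e / total for e in exps]

probs = softmax([3.0, 1.0, 0.5])
print(probs)       # three positive values, largest first
print(sum(probs))  # approximately 1.0
```

Without the shift, an input such as [1000.0, 1000.0] would raise an OverflowError in `math.exp`; with it, the function returns [0.5, 0.5].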

Example

Given the vector [2, 1, 0.1], the softmax function first computes the exponentials: exp(2) ≈ 7.389, exp(1) ≈ 2.718, exp(0.1) ≈ 1.105. It then divides each value by the sum of the exponentials: 7.389 + 2.718 + 1.105 = 11.212. The resulting softmax probabilities are approximately [0.659, 0.242, 0.099], which sum to one.
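The arithmetic in this example can be checked step by step with a short script. This is only a sketch for verifying the numbers; the variable names are arbitrary, and it follows the plain definition without any stability shift because the inputs are small.

```python
import math

scores = [2.0, 1.0, 0.1]

# Step 1: exponentiate each score.
exps = [math.exp(s) for s in scores]

# Step 2: sum the exponentials to get the normalizing constant.
total = sum(exps)

# Step 3: divide each exponential by the total.
probs = [e / total for e in exps]

print([round(e, 3) for e in exps])   # [7.389, 2.718, 1.105]
print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]
```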

Understanding the softmax function is crucial for interpreting neural network outputs in classification tasks, as it provides a clear probability distribution over possible outcomes.
