Kullback–Leibler divergence

Kullback–Leibler divergence formula: D_KL(P||Q) = ∑_x∈X P(x) log(P(x)/Q(x))

The Kullback–Leibler (KL) divergence measures how much a probability distribution P differs from a second, reference distribution Q defined over the same set of outcomes X. For each outcome x, the probability P(x) is multiplied by the logarithm of the ratio P(x)/Q(x), and these terms are summed over all x in X. The base of the logarithm sets the unit: base 2 gives bits, and the natural logarithm gives nats.
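
The sum described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name kl_divergence, the dict-based representation of a distribution, and the coin example are assumptions made here. It uses the natural logarithm (so the result is in nats) and the usual conventions that a term with P(x) = 0 contributes nothing and that the divergence is infinite when P(x) > 0 but Q(x) = 0.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(P||Q) in nats for two discrete distributions.

    p and q are dicts mapping each outcome x to its probability
    (a hypothetical representation chosen for this sketch).
    """
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # convention: a term with P(x) = 0 contributes 0
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # P puts mass on an outcome that Q rules out
        total += px * math.log(px / qx)
    return total


# Illustrative inputs: a fair coin compared against a heavily biased one.
fair = {"H": 0.5, "T": 0.5}
biased = {"H": 0.9, "T": 0.1}
print(kl_divergence(fair, biased))
```

Dividing the result by math.log(2) converts it from nats to bits.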

The KL divergence is always non-negative and equals zero if and only if P and Q are identical. It is not symmetric, however: D_KL(P||Q) generally differs from D_KL(Q||P), so it is not a distance in the metric sense. Its non-negativity still makes it a useful tool for comparing distributions and assessing the performance of probabilistic models, and it is widely used in machine learning, information theory, and statistics.
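
The non-negativity claim follows from Jensen's inequality applied to the concave logarithm. The derivation below is a standard argument sketched here for completeness (it is not part of the original text), with the sums restricted to outcomes where P(x) > 0:

```latex
\begin{aligned}
-D_{\mathrm{KL}}(P \,\|\, Q)
  &= \sum_{x \in X,\; P(x) > 0} P(x) \log \frac{Q(x)}{P(x)} \\
  &\le \log \sum_{x \in X,\; P(x) > 0} P(x) \, \frac{Q(x)}{P(x)} \\
  &= \log \sum_{x \in X,\; P(x) > 0} Q(x) \;\le\; \log 1 = 0 .
\end{aligned}
```

Equality in both steps requires Q(x)/P(x) to be constant on the support of P and Q to place all of its mass there, which forces P = Q.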

Example

Suppose we have two discrete probability distributions P and Q over the same set X = {a, b, c}. Let P(a) = 0.2, P(b) = 0.5, P(c) = 0.3, Q(a) = 0.1, Q(b) = 0.4, Q(c) = 0.5. The KL divergence D_KL(P||Q) is calculated as follows: D_KL(P||Q) = 0.2 * log(0.2/0.1) + 0.5 * log(0.5/0.4) + 0.3 * log(0.3/0.5). With the natural logarithm, the three terms are approximately 0.1386, 0.1116, and −0.1532, giving D_KL(P||Q) ≈ 0.097 nats (about 0.140 bits).
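
The arithmetic for this example can be checked term by term with a short script. The variable names are illustrative and the natural logarithm is assumed, so the total is in nats. Note that the term for c is negative because P(c) < Q(c); individual terms may be negative even though the sum never is.

```python
import math

# Distributions taken from the example above.
P = {"a": 0.2, "b": 0.5, "c": 0.3}
Q = {"a": 0.1, "b": 0.4, "c": 0.5}

# One term P(x) * log(P(x) / Q(x)) per outcome.
terms = {x: P[x] * math.log(P[x] / Q[x]) for x in P}

d_nats = sum(terms.values())      # natural log, so the unit is nats
d_bits = d_nats / math.log(2)     # same quantity expressed in bits

for x, t in terms.items():
    print(f"{x}: {t:+.4f}")
print(f"D_KL(P||Q) = {d_nats:.4f} nats = {d_bits:.4f} bits")
```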

In model evaluation, P typically plays the role of the true distribution and Q the approximating distribution. D_KL(P||Q) then quantifies how far the approximation is from the truth, and smaller values indicate a more accurate model.

Related concepts

Jensen–Shannon divergence

Jensen–Shannon divergence formula: D_JS(P||Q) = 1/2 * D_KL(P||M) + 1/2 * D_KL(Q||M), where M = 1/2 * (P + Q) is the average of the two distributions
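
As a rough sketch of this formula, the code below takes M to be the pointwise average (P + Q)/2, which is the usual definition. The function names and the dict representation are assumptions made for illustration. Because M(x) > 0 wherever P(x) > 0 or Q(x) > 0, neither KL term can divide by zero.

```python
import math


def kl_divergence(p, q):
    """D_KL(P||Q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, with M = (P + Q) / 2."""
    outcomes = set(p) | set(q)
    # Extend both distributions to the union of their outcomes.
    p_full = {x: p.get(x, 0.0) for x in outcomes}
    q_full = {x: q.get(x, 0.0) for x in outcomes}
    m = {x: 0.5 * (p_full[x] + q_full[x]) for x in outcomes}
    return 0.5 * kl_divergence(p_full, m) + 0.5 * kl_divergence(q_full, m)


P = {"a": 0.2, "b": 0.5, "c": 0.3}
Q = {"a": 0.1, "b": 0.4, "c": 0.5}
print(js_divergence(P, Q), js_divergence(Q, P))
```

Unlike the KL divergence, this quantity is symmetric in P and Q and is bounded above by log 2, a value reached when the two distributions share no outcomes.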

KL divergence is always ≥ 0 and equals 0 only when P = Q exactly

Entropy (information theory)

Entropy formula: H(X) = −∑_x∈X p(x) log(p(x))
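
A small sketch of the entropy formula follows. The function name, the dict representation, and the optional base parameter are illustrative assumptions; outcomes with zero probability are skipped, following the convention that 0 log 0 = 0.

```python
import math


def entropy(p, base=math.e):
    """Shannon entropy of a discrete distribution given as {outcome: prob}.

    base=math.e gives nats; base=2 gives bits.
    """
    return -sum(px * math.log(px, base) for px in p.values() if px > 0)


fair_coin = {"H": 0.5, "T": 0.5}
certain = {"H": 1.0, "T": 0.0}
print(entropy(fair_coin, base=2))  # entropy of a fair coin, in bits
print(entropy(certain, base=2))    # entropy of a deterministic outcome
```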

Cross-entropy

Cross-entropy loss equation: H(p, q) = −∑_x∈X p(x) log(q(x))
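
Cross-entropy connects back to the KL divergence through the identity H(p, q) = H(p) + D_KL(p||q). The identity is a standard result that the text above does not state; the sketch below checks it numerically on the distributions from the example. Function names are illustrative, and the code assumes q(x) > 0 wherever p(x) > 0.

```python
import math


def entropy(p):
    """H(p) in nats."""
    return -sum(px * math.log(px) for px in p.values() if px > 0)


def cross_entropy(p, q):
    """H(p, q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return -sum(px * math.log(q[x]) for x, px in p.items() if px > 0)


def kl_divergence(p, q):
    """D_KL(p||q) in nats; same assumption on q as above."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


P = {"a": 0.2, "b": 0.5, "c": 0.3}
Q = {"a": 0.1, "b": 0.4, "c": 0.5}

lhs = cross_entropy(P, Q)
rhs = entropy(P) + kl_divergence(P, Q)
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

Since H(p) does not depend on q, minimizing the cross-entropy over q is equivalent to minimizing D_KL(p||q), which is one reason cross-entropy is a common training loss.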

Mutual information

Mutual information formula: I(X;Y) = ∑_x∈X ∑_y∈Y p(x,y) log(p(x,y)/(p(x)p(y)))
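
The double sum can be sketched as follows, assuming the joint distribution is given as a dict keyed by (x, y) pairs (a representation chosen here for illustration, as is the function name). The marginals p(x) and p(y) are obtained by summing the joint. Read this way, mutual information is the KL divergence between the joint distribution and the product of its marginals.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """I(X;Y) in nats from a joint distribution {(x, y): p(x, y)}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), pxy in joint.items():
        px[x] += pxy  # marginal p(x)
        py[y] += pxy  # marginal p(y)
    return sum(
        pxy * math.log(pxy / (px[x] * py[y]))
        for (x, y), pxy in joint.items()
        if pxy > 0  # zero-probability pairs contribute nothing
    )


# Two illustrative joints over a pair of binary variables.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
identical = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(independent))
print(mutual_information(identical))
```

For independent variables the joint equals the product of the marginals, so every log term vanishes; when Y is a copy of a fair binary X, the result equals the entropy of X.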

Cross-entropy H(p,q) = -Σ p(x) log q(x) measures how well q approximates p
