Kullback–Leibler divergence formula: D_KL(P||Q) = ∑_x∈X P(x) log(P(x)/Q(x))
Image: CC BY-SA 3.0, via Wikimedia Commons
The Kullback–Leibler (KL) divergence measures how much a probability distribution P differs from a second, reference probability distribution Q. For discrete distributions it is computed by summing over all possible outcomes x in the set X, where each term multiplies the probability of the outcome under P by the logarithm of the ratio of P(x) to Q(x). The measure is not symmetric: D_KL(P||Q) generally differs from D_KL(Q||P).
The KL divergence is always non-negative and equals zero if and only if P and Q are identical. This property makes it a useful tool for comparing distributions and assessing the performance of probabilistic models. It is commonly used in various fields such as machine learning, information theory, and statistics.
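The properties above can be checked numerically. The following is a minimal sketch, assuming distributions are given as dictionaries mapping outcomes to probabilities; the function name `kl_divergence` and the coin distributions are illustrative and not part of the original text. It uses the natural logarithm, so results are in nats.

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) in nats; p and q map outcomes to probabilities."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # convention: a term with P(x) = 0 contributes 0
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # P puts mass where Q has none
        total += px * math.log(px / qx)
    return total

# Hypothetical distributions over a coin flip.
p = {"heads": 0.5, "tails": 0.5}
q = {"heads": 0.9, "tails": 0.1}

same = kl_divergence(p, p)     # 0.0, identical distributions
forward = kl_divergence(p, q)  # about 0.511
reverse = kl_divergence(q, p)  # about 0.368, so the measure is not symmetric
print(same, forward, reverse)
```

Both directions are positive, but they disagree, which is why the order of the arguments matters when reporting a KL value.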
Example
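Suppose we have two discrete probability distributions P and Q over the same set X = {a, b, c}. Let P(a) = 0.2, P(b) = 0.5, P(c) = 0.3, Q(a) = 0.1, Q(b) = 0.4, Q(c) = 0.5. The KL divergence D_KL(P||Q) can be calculated as follows: D_KL(P||Q) = 0.2 * log(0.2/0.1) + 0.5 * log(0.5/0.4) + 0.3 * log(0.3/0.5). With the natural logarithm this is approximately 0.1386 + 0.1116 − 0.1532 ≈ 0.0970 nats. The third term is negative because P(c) < Q(c), but the total is still non-negative.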
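The arithmetic in this example can be reproduced in a few lines. This sketch uses the values of P and Q given above; the variable names (`terms`, `kl_nats`, `kl_bits`) are illustrative, and the natural logarithm is an assumption, since the text does not fix the log base.

```python
import math

P = {"a": 0.2, "b": 0.5, "c": 0.3}
Q = {"a": 0.1, "b": 0.4, "c": 0.5}

# One term per outcome: P(x) * log(P(x) / Q(x)).
terms = {x: P[x] * math.log(P[x] / Q[x]) for x in P}

kl_nats = sum(terms.values())    # natural log gives nats
kl_bits = kl_nats / math.log(2)  # divide by ln(2) to convert to bits

for x, t in terms.items():
    print(f"{x}: {t:+.4f}")
print(f"D_KL(P||Q) = {kl_nats:.4f} nats = {kl_bits:.4f} bits")
```

The total comes out at roughly 0.097 nats (about 0.140 bits). The term for c is negative on its own; only the sum is guaranteed to be non-negative.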
In practice, Q is usually a model's approximating distribution and P is the true (or empirical) distribution. D_KL(P||Q) then quantifies how far the approximation is from the truth: the smaller the value, the more accurately the model reproduces P.
Jensen–Shannon divergence
Jensen–Shannon divergence formula: D_JS(P||Q) = 1/2 * D_KL(P||M) + 1/2 * D_KL(Q||M), where M = 1/2 * (P + Q) is the average of the two distributions
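A short sketch of this formula, taking M to be the pointwise average (P + Q)/2, which is the standard definition. Distributions are assumed to be lists of probabilities over the same ordered outcomes; the helper names `kl` and `js_divergence` are illustrative.

```python
import math

def kl(p, q):
    """D_KL(p||q) in nats; terms with p(x) = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats."""
    # M is positive wherever p or q is, so kl(., m) never divides by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.2, 0.5, 0.3]
q = [0.1, 0.4, 0.5]

js_pq = js_divergence(p, q)
js_qp = js_divergence(q, p)                          # same value: symmetric
js_disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])  # maximum, log(2) nats
print(js_pq, js_qp, js_disjoint)
```

Unlike KL divergence, the result is symmetric in its arguments and stays finite (at most log 2 in nats) even when the two distributions share no outcomes.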
KL divergence measures the difference between two distributions P and Q; it is always non-negative and zero if and only if P equals Q exactly
Entropy (information theory)
Entropy formula: H(X) = −∑_x∈X p(x) log(p(x))
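As a quick illustration of the entropy formula, the sketch below computes H(X) in bits (base-2 logarithm, an assumption since the formula does not fix the base). The function name `entropy` and the sample distributions are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a list of probabilities."""
    # Outcomes with p(x) = 0 are skipped (0 * log 0 is taken as 0).
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

fair_coin = entropy([0.5, 0.5])    # 1 bit
four_way = entropy([0.25] * 4)     # 2 bits
certain = entropy([1.0, 0.0])      # 0 bits: no uncertainty
biased_coin = entropy([0.9, 0.1])  # less than 1 bit
print(fair_coin, four_way, certain, biased_coin)
```

Entropy is largest for a uniform distribution and drops to zero when one outcome is certain.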
Cross-entropy
Cross-entropy formula: H(p, q) = −∑_x∈X p(x) log(q(x))
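Cross-entropy ties directly back to KL divergence through the identity H(p, q) = H(p) + D_KL(p||q). The sketch below checks this numerically for the example distributions used earlier on this page; the function names are illustrative, and natural logarithms are assumed.

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # Assumes q(x) > 0 wherever p(x) > 0.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]
q = [0.1, 0.4, 0.5]

h_p = entropy(p)
h_pq = cross_entropy(p, q)
gap = h_pq - h_p  # the extra cost of using q in place of p

print(gap, kl(p, q))  # the two values agree up to rounding
```

Because H(p) does not depend on q, minimizing cross-entropy over q is equivalent to minimizing D_KL(p||q), which is why cross-entropy is a common training loss.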
Mutual information
Mutual information formula: I(X;Y) = ∑_x∈X ∑_y∈Y p(x,y) log(p(x,y)/(p(x)p(y)))
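The mutual information formula can be sketched as follows, assuming the joint distribution is given as a dictionary mapping (x, y) pairs to probabilities. The marginals p(x) and p(y) are obtained by summing the joint; the function name and the two sample joints are illustrative.

```python
import math

def mutual_information(joint):
    """I(X;Y) in nats from a dict mapping (x, y) to p(x, y)."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal p(x)
        py[y] = py.get(y, 0.0) + p  # marginal p(y)
    return sum(
        p * math.log(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Two independent fair bits: the joint factorizes, so I(X;Y) = 0.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Two perfectly correlated fair bits: I(X;Y) = log(2) nats, i.e. 1 bit.
correlated = {(0, 0): 0.5, (1, 1): 0.5}

mi_independent = mutual_information(independent)
mi_correlated = mutual_information(correlated)
print(mi_independent, mi_correlated)
```

Mutual information is itself a KL divergence: it measures how far the joint p(x, y) is from the product of its marginals p(x)p(y).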
Cross-entropy H(p,q) = -Σ p(x) log q(x) measures how well q approximates p