Log-loss penalizes confident incorrect predictions more heavily
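As a minimal sketch of this card, the snippet below compares the binary log-loss of two wrong predictions for the same true label. The function name `log_loss` and the probabilities 0.4 and 0.01 are illustrative assumptions, not values from the source.

```python
import math

def log_loss(y_true, p_pred):
    """Binary log-loss for one example; p_pred is the predicted P(y = 1)."""
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))

# The true label is 1 in both cases; only the model's confidence differs.
hesitant_wrong = log_loss(1, 0.4)    # leans the wrong way, but only slightly
confident_wrong = log_loss(1, 0.01)  # almost certain of the wrong answer

print(f"hesitant:  {hesitant_wrong:.3f}")
print(f"confident: {confident_wrong:.3f}")
```

The loss grows without bound as the probability assigned to the true label approaches zero, which is why the confident mistake costs several times more than the hesitant one.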
Cross-entropy equals negative log-likelihood for classification
With one-hot true labels, the cross-entropy between the labels and the predicted probabilities reduces to -log of the probability assigned to the correct class, which is exactly the negative log-likelihood of the labels under the model
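A small check of the equivalence, assuming a hypothetical three-class prediction and a one-hot true label (both made up for illustration): the cross-entropy sum collapses to a single term, the negative log of the probability given to the true class.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum p(x) log q(x), skipping terms where p(x) = 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

predicted = [0.7, 0.2, 0.1]  # hypothetical model output over 3 classes
true_class = 0
one_hot = [1.0, 0.0, 0.0]    # one-hot encoding of true_class

ce = cross_entropy(one_hot, predicted)
nll = -math.log(predicted[true_class])

print(ce, nll)
```

Averaging this quantity over a dataset gives the usual classification training loss.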
Entropy H = -Σ p(x) log₂ p(x) measures average surprise in bits
Entropy H = -Σ p(x) log₂ p(x) quantifies uncertainty in a system
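To make the formula concrete, here is a short sketch that evaluates H for three coins. The helper name `entropy_bits` and the example distributions are assumptions chosen for illustration.

```python
import math

def entropy_bits(p):
    """H = -sum p(x) log2 p(x), with the convention 0 * log 0 = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

fair_coin = entropy_bits([0.5, 0.5])    # maximum uncertainty for two outcomes
biased_coin = entropy_bits([0.9, 0.1])  # mostly predictable
certain = entropy_bits([1.0, 0.0])      # no uncertainty at all

print(f"fair: {fair_coin:.3f} bits, biased: {biased_coin:.3f} bits")
```

The fair coin carries exactly 1 bit per flip, the biased coin less than half a bit, and the certain outcome none.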
Cross-entropy H(p,q) = -Σ p(x) log q(x) measures how well q approximates p
Cross-entropy H(p,q) = -Σ p(x) log q(x) quantifies approximation quality between distributions p and q
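The sketch below evaluates H(p, q) in bits for one true distribution p and two candidate approximations. All three distributions are hypothetical. It illustrates that H(p, q) is smallest when q matches p, where it equals the entropy H(p).

```python
import math

def cross_entropy_bits(p, q):
    """H(p, q) = -sum p(x) log2 q(x), skipping terms where p(x) = 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]        # hypothetical true distribution
exact_q = [0.5, 0.25, 0.25]  # q identical to p
poor_q = [0.8, 0.1, 0.1]     # q that overweights the first outcome

h_exact = cross_entropy_bits(p, exact_q)  # equals the entropy of p
h_poor = cross_entropy_bits(p, poor_q)    # strictly larger

print(f"H(p, p) = {h_exact:.3f} bits, H(p, poor_q) = {h_poor:.3f} bits")
```

The gap between the two values is the KL divergence from p to the poorer approximation.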
Log-probabilities are used instead of probabilities to avoid numerical underflow
Log-probabilities convert multiplications into additions, preventing numerical underflow
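A quick demonstration of the underflow, assuming a hypothetical sequence of 100 events that each have probability 1e-5: the direct product falls below the smallest representable double and becomes 0.0, while the sum of logs stays an ordinary finite number.

```python
import math

probs = [1e-5] * 100  # hypothetical per-event probabilities

# Direct product: the true value 1e-500 is far below the double-precision range.
product = 1.0
for p in probs:
    product *= p

# Sum of log-probabilities: the same quantity on a log scale.
log_sum = sum(math.log(p) for p in probs)

print(product)  # underflows to 0.0
print(log_sum)  # finite, roughly -1151.29
```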
Ill-conditioned matrices cause numerical instability: small input changes → large output changes
Ill-conditioned matrices amplify input perturbations, so small measurement or rounding errors can produce large errors in the computed solution
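As an illustration, the sketch below solves a nearly singular 2×2 system twice using Cramer's rule, changing one right-hand-side entry by 0.0001 between runs. The matrix, the vectors, and the helper `solve_2x2` are made-up examples, not taken from the source.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x2 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return x1, x2

# The two rows are almost identical, so the matrix is nearly singular.
a = [[1.0, 1.0],
     [1.0, 1.0001]]

x_original = solve_2x2(a, [2.0, 2.0001])   # close to (1, 1)
x_perturbed = solve_2x2(a, [2.0, 2.0002])  # close to (0, 2)

print(x_original)
print(x_perturbed)
```

A change of 0.0001 in one input moves each component of the solution by about 1, an amplification of roughly four orders of magnitude.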
Shannon's source coding theorem: you can't losslessly compress below entropy
No lossless code can use fewer bits per symbol, on average, than the entropy of the source
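One way to see the bound in action is to build a Huffman code, an optimal symbol-by-symbol prefix code, and compare its average codeword length with the entropy. This is a minimal sketch: the four-symbol distribution and the helper `huffman_code_lengths` are assumptions for illustration.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return the Huffman codeword length (in bits) for each symbol."""
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    next_id = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # each merge pushes these symbols one level deeper
        heapq.heappush(heap, (p1 + p2, next_id, s1 + s2))
        next_id += 1
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]  # hypothetical source distribution
lengths = huffman_code_lengths(probs)
avg_len = sum(p * n for p, n in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs)

print(f"entropy = {entropy:.3f} bits, average code length = {avg_len:.3f} bits")
```

The average length lands at or above the entropy and within one bit of it, which is the behavior the theorem predicts for symbol codes.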