Elastic net: the penalty λ₁|w| + λ₂w² combines L1-driven sparsity with L2-driven stability
Image: Friedrich Maximilian Hessemer (Q1461066), PDM-owner, via Wikimedia Commons
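As a minimal sketch of the elastic net penalty, the function below sums λ₁|w| + λ₂w² over a list of weights. The names elastic_net_penalty, lam1, and lam2 are illustrative assumptions, not taken from any particular library.

```python
def elastic_net_penalty(weights, lam1, lam2):
    """Return lam1 * sum(|w|) + lam2 * sum(w^2) over all weights."""
    l1_term = sum(abs(w) for w in weights)  # sparsity-inducing part
    l2_term = sum(w * w for w in weights)   # stabilizing (ridge) part
    return lam1 * l1_term + lam2 * l2_term


# Hypothetical weights and penalty strengths, chosen only for illustration.
penalty = elastic_net_penalty([0.5, -2.0, 0.0], lam1=0.1, lam2=0.01)
print(penalty)
```

Setting lam2 to zero recovers a pure L1 (lasso) penalty, and setting lam1 to zero recovers a pure L2 (ridge) penalty.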
Ill-conditioned matrices cause numerical instability: they amplify small input perturbations into large changes in the output
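A small illustration of this amplification, assuming a hand-picked 2x2 system whose rows are nearly parallel. The helper solve_2x2 is a hypothetical name and uses Cramer's rule only to keep the sketch dependency-free.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


# The two rows are almost identical, so the determinant is tiny (about 1e-4).
a = [[1.0, 1.0], [1.0, 1.0001]]

x_original = solve_2x2(a, [2.0, 2.0001])   # solution close to [1, 1]
x_perturbed = solve_2x2(a, [2.0, 2.0002])  # solution close to [0, 2]

print(x_original, x_perturbed)
```

Changing one entry of the right-hand side by 0.0001 moves each component of the solution by about 1, an amplification of roughly four orders of magnitude.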
Regularization (mathematics)
L1 regularization yields sparse solutions: it drives many weights exactly to zero
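One common way to see where the zeros come from is the soft-thresholding operator, the proximal step associated with an L1 penalty. The sketch below is an illustration under that assumption, not a full lasso solver; soft_threshold and the sample weights are hypothetical.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; values within [-lam, lam] become exactly 0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


# Hypothetical weights: two large, two small in magnitude.
weights = [0.8, -0.05, 0.02, -1.5]
shrunk = [soft_threshold(w, 0.1) for w in weights]
print(shrunk)
```

The two small weights are set exactly to zero, while the large ones are only shrunk by the threshold. An L2 penalty, by contrast, rescales weights without zeroing them.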
Non-convex loss landscapes are hard to optimize because they contain many local minima and saddle points, where gradient-based methods can stall
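To make the local-minimum issue concrete, here is a sketch using a one-dimensional toy loss, x⁴ - 3x² + x, picked only because it has two minima of different depth. The function names, learning rate, and step count are assumptions for illustration.

```python
def loss(x):
    """Toy non-convex loss with two local minima."""
    return x**4 - 3 * x**2 + x


def loss_gradient(x):
    """Analytic derivative of the toy loss."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=2000):
    """Plain gradient descent from a given starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * loss_gradient(x)
    return x


x_left = gradient_descent(-2.0)   # settles in the deeper minimum (x < 0)
x_right = gradient_descent(2.0)   # settles in the shallower minimum (x > 0)
print(x_left, loss(x_left))
print(x_right, loss(x_right))
```

Both runs end at a point with near-zero gradient, but only the run started on the left reaches the lower loss. Which minimum is found depends entirely on the initialization.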
Ordinary least squares
OLS minimizes the sum of squared differences between observed and predicted values
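For a single feature, the minimizer has a closed form. The sketch below fits a line y = slope * x + intercept using the standard textbook formulas. The name ols_fit and the sample data are illustrative assumptions.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = ols_fit(xs, ys)
print(slope, intercept)
```

Because the sample points lie exactly on a line, the fitted residuals are all zero. With noisy data the same formulas return the line with the smallest total squared error.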
Vanishing gradient problem
Residual connections help by letting gradients flow directly through the skip connection, bypassing layers whose own gradients shrink
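A simplified scalar model of this effect: by the chain rule, the gradient through a stack of layers is a product of per-layer derivatives. A plain layer contributes f'(x), while a residual layer y = x + f(x) contributes 1 + f'(x). The depth and the per-layer derivative of 0.01 below are assumed values chosen to exaggerate the contrast.

```python
def chain_gradient(local_derivatives, residual):
    """Multiply per-layer derivatives, adding the skip path's 1 if residual."""
    grad = 1.0
    for d in local_derivatives:
        grad *= (1.0 + d) if residual else d
    return grad


# Hypothetical network: 50 layers, each with a small local derivative.
local = [0.01] * 50
plain_grad = chain_gradient(local, residual=False)
residual_grad = chain_gradient(local, residual=True)
print(plain_grad, residual_grad)
```

Without skip connections the product collapses toward zero, while the identity term keeps the residual product on the order of 1. Real networks use Jacobian matrices rather than scalars, so this is only an intuition aid.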
The L1 norm is not differentiable at zero because the absolute value function has a kink there: its slope jumps from -1 to +1
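The kink can be checked numerically with one-sided difference quotients of the absolute value at zero. This is a minimal sketch; one_sided_slope is a hypothetical helper name.

```python
def one_sided_slope(f, x, h):
    """Difference quotient of f at x with signed step h."""
    return (f(x + h) - f(x)) / h


right_slope = one_sided_slope(abs, 0.0, 1e-6)   # approach from the right
left_slope = one_sided_slope(abs, 0.0, -1e-6)   # approach from the left
print(right_slope, left_slope)
```

The two one-sided slopes disagree, so no single derivative exists at zero. Optimizers that handle L1 penalties typically work with a subgradient (any value in [-1, 1] at zero) or with a proximal step instead.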