Label smoothing regularizes models by adjusting target distributions
Label smoothing mixes each one-hot target with a uniform distribution over the classes, which discourages overconfident predictions
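A minimal sketch of one common way to adjust the targets, assuming the usual formulation in which a fraction epsilon of the probability mass is spread uniformly over all classes. The function name `smooth_labels` and the value epsilon = 0.1 are illustrative choices, not taken from the text above.

```python
def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Return (1 - epsilon) * one_hot + epsilon / num_classes as a list."""
    uniform_share = epsilon / num_classes
    smoothed = [uniform_share] * num_classes
    # The true class keeps the remaining mass on top of its uniform share.
    smoothed[target_index] += 1.0 - epsilon
    return smoothed


# Class 2 of 4: the true class gets about 0.925, every other class about 0.025.
smoothed = smooth_labels(target_index=2, num_classes=4, epsilon=0.1)
```

The smoothed vector still sums to 1, so it can be used directly as the target of a cross-entropy loss.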
What calibration means: a model predicting 80% should be correct 80% of the time
Calibration: a model's predicted probabilities match the observed frequencies of the outcomes
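One way to quantify this is the expected calibration error, which bins predictions by confidence and averages the gap between confidence and accuracy in each bin. The sketch below assumes equal-width bins; the function name and the toy inputs are hypothetical.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins."""
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Ten predictions at 80% confidence, eight correct: gap close to 0.
calibrated = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Same confidence, only five correct: overconfident by about 0.3.
overconfident = expected_calibration_error([0.8] * 10, [1] * 5 + [0] * 5)
```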
What CutMix does: replaces a patch of one image with a patch from another
CutMix augments data by pasting a patch from one image onto another and mixing the two labels in proportion to patch area
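A simplified sketch of the patch swap on small nested-list images. It assumes the patch size is drawn uniformly, whereas the published method draws the mixing ratio from a Beta distribution; the label mixing by area, the function name, and the toy images are illustrative assumptions.

```python
import random


def cutmix(image_a, image_b, label_a, label_b, rng=random):
    """Paste a random rectangle of image_b onto a copy of image_a.

    Images are H x W nested lists of equal shape; labels are lists of
    per-class probabilities. Returns (mixed_image, mixed_label, lam).
    """
    height, width = len(image_a), len(image_a[0])
    patch_h = rng.randint(1, height)
    patch_w = rng.randint(1, width)
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)

    mixed = [row[:] for row in image_a]
    for r in range(top, top + patch_h):
        for c in range(left, left + patch_w):
            mixed[r][c] = image_b[r][c]

    # lam is the fraction of pixels that still come from image_a.
    lam = 1.0 - (patch_h * patch_w) / (height * width)
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


image_a = [[0] * 4 for _ in range(4)]
image_b = [[1] * 4 for _ in range(4)]
mixed, mixed_label, lam = cutmix(
    image_a, image_b, [1.0, 0.0], [0.0, 1.0], rng=random.Random(0)
)
```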
A p-value < 0.05 means: if H₀ is true, a result at least this extreme has <5% probability
A p-value < 0.05 indicates a less than 5% chance of observing data at least as extreme as this if the null hypothesis is true; it is not the probability that the null hypothesis is true
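As a worked example, the sketch below computes an exact one-sided p-value for a coin-flip experiment, assuming H₀ is a fair coin. The function name and the 9-heads-in-10-flips scenario are made up for illustration.

```python
from math import comb


def one_sided_binomial_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 9 or more heads in 10 fair flips: (10 + 1) / 1024, about 0.0107.
p_value = one_sided_binomial_p_value(successes=9, trials=10)
```

The value is below 0.05, so a fair coin would produce a result this extreme less than 5% of the time.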
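What denoising score matching does: learns to denoise, which is equivalent to learning the score of the noise-perturbed data
Denoising score matching trains a model to undo added Gaussian noise; the optimal denoising direction equals the score, the gradient of the log-density of the noised data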
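A toy check of the denoising-equals-score idea, under assumptions chosen so the answer is known in closed form: 1-D data from a standard normal, Gaussian noise with standard deviation sigma, and a linear score model s(x) = w * x fit by least squares. The noised data is then N(0, 1 + sigma²), whose true score is -x / (1 + sigma²). All names here are hypothetical.

```python
import random


def fit_linear_score(num_samples=20000, sigma=0.5, seed=0):
    """Least-squares fit of w in s(x) = w * x on the denoising objective."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, 1.0)                 # clean sample
        x_noisy = x + sigma * rng.gauss(0.0, 1.0)
        # Regression target: score of the Gaussian noise kernel, which
        # points from the noisy sample back toward the clean one.
        target = -(x_noisy - x) / sigma**2
        numerator += x_noisy * target
        denominator += x_noisy * x_noisy
    return numerator / denominator


sigma = 0.5
w = fit_linear_score(sigma=sigma)
true_w = -1.0 / (1.0 + sigma**2)  # slope of the true score of the noised data
```

Although the regression only ever sees denoising targets, the fitted slope `w` should land near `true_w`, up to sampling error.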
Why L1 regularization produces sparse solutions: the corners of the diamond-shaped constraint region lie on the axes
L1 regularization promotes sparsity by penalizing the absolute values of coefficients, which drives some of them exactly to zero
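The contrast with L2 shows up in the one-dimensional case, where both penalized problems have closed-form minimizers. A small sketch, with hypothetical function names and an arbitrary penalty strength of 0.5:

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w) (L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # any |z| <= lam is set exactly to zero


def ridge_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2 (L2 penalty)."""
    return z / (1.0 + lam)


unregularized = [3.0, -0.4, 0.2, -1.5]
l1_solution = [soft_threshold(z, 0.5) for z in unregularized]  # [2.5, 0.0, 0.0, -1.0]
l2_solution = [ridge_shrink(z, 0.5) for z in unregularized]    # all shrunk, none zero
```

The L1 penalty zeroes the two small coefficients outright, while the L2 penalty only scales every coefficient toward zero.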
Why temperature T in softmax(x/T) controls entropy: T→0 is argmax, T→∞ is uniform
As T approaches zero, softmax approaches a one-hot argmax, minimizing entropy; as T→∞ it approaches the uniform distribution, maximizing entropy
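The two limits can be checked numerically. A minimal sketch with made-up logits and three illustrative temperatures:

```python
import math


def softmax_with_temperature(logits, temperature):
    """softmax(x / T), with the max subtracted for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability terms."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


logits = [2.0, 1.0, 0.0]
entropies = {
    t: entropy(softmax_with_temperature(logits, t)) for t in (0.05, 1.0, 100.0)
}
```

Entropy rises with T: it is close to 0 at T = 0.05, where nearly all mass sits on the largest logit, and close to ln 3 at T = 100, where the distribution is nearly uniform.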