
Bonferroni correction adjusts significance level by dividing α by the number of tests
Image: John Graner, Neuroimaging Department, National Intrepid Center of Excellence, Walter Reed National Military Medical Cent, Public domain, via Wikimedia Commons
The Bonferroni correction controls the family-wise error rate across m tests by comparing each p-value against α/m instead of α
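A minimal sketch of the division described above, in plain Python. The function name `bonferroni` and the example p-values are made up for illustration; they do not come from any particular library or study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold alpha / m and a reject flag per test.

    Hypothetical helper for illustration only.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    threshold = alpha / m  # divide the significance level by the number of tests
    rejected = [p <= threshold for p in p_values]
    return threshold, rejected


# Four illustrative p-values tested at an overall alpha of 0.05.
threshold, rejected = bonferroni([0.001, 0.012, 0.04, 0.2], alpha=0.05)
```

With four tests the per-test threshold drops to 0.0125, so a p-value of 0.04 that would pass an uncorrected 0.05 cutoff is no longer rejected.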
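Adam bias correction: divide the moment estimates by (1-β^t)
Adam divides its moving averages of the gradient and squared gradient by (1-β^t) to counteract their bias toward zero, which comes from initializing them at zero; the correction matters most in early steps and fades as t grows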
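The sketch below shows one Adam update for a single scalar parameter, so the (1-β^t) division is easy to follow. The function name `adam_step` and the default hyperparameters are assumptions chosen for illustration, not a reference implementation of any framework's optimizer.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction: both averages start at zero, so they are too small early on.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from zero-initialized moments with an illustrative gradient of 2.0.
param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

At t = 1 the raw average `m` is only about 0.2, one tenth of the gradient, while the corrected `m_hat` recovers the full gradient, so the first update has a magnitude of roughly `lr`.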
Race and intelligence
IQ test performance differences between racial groups have decreased over time
Top-k vs top-p sampling: top-k fixes candidate count, top-p fixes cumulative probability mass
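Top-k sampling keeps a fixed number of candidates, the k most probable tokens; top-p (nucleus) sampling keeps the smallest set of tokens whose cumulative probability reaches p, so its candidate count varies with the shape of the distribution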
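To make the contrast concrete, here is a small sketch of both filters over a list of token probabilities. The helper names (`top_k_filter`, `top_p_filter`, `_renormalize`) and the example distributions are hypothetical; real decoders usually work on logits and tensors rather than Python lists.

```python
def _renormalize(probs, keep):
    """Zero out tokens not in `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep exactly the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest high-probability prefix whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


flat = [0.5, 0.3, 0.15, 0.05]      # illustrative, fairly spread out
peaked = [0.97, 0.01, 0.01, 0.01]  # illustrative, one dominant token

flat_k = top_k_filter(flat, k=2)
peaked_k = top_k_filter(peaked, k=2)
flat_p = top_p_filter(flat, p=0.9)
peaked_p = top_p_filter(peaked, p=0.9)
```

With k = 2 both distributions keep two tokens. With p = 0.9 the spread-out distribution keeps three tokens while the peaked one keeps only one.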
Effect size
Cohen's d benchmarks: 0.2 = small, 0.5 = medium, 0.8 = large effect
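As a rough illustration, the sketch below computes the effect size for two independent samples using the pooled standard deviation (one common variant) and maps it onto the benchmarks above. The function names, the "negligible" label for values under 0.2, and the sample data are assumptions added for the example.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def effect_label(d):
    """Map |d| onto the 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label chosen for this sketch, not part of the benchmarks


# Made-up samples whose means differ by 1.
d = cohens_d([2, 4, 6, 8], [1, 3, 5, 7])
label = effect_label(d)
```

The benchmarks are conventions rather than hard cutoffs, so the labels are best read as a rough guide.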
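AdaGrad: divide the learning rate by the square root of the sum of squared gradients
AdaGrad gives each parameter its own effective learning rate by dividing the base rate by the square root of that parameter's running sum of squared gradients, so step sizes shrink as gradients accumulate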
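A minimal per-parameter sketch of this update on plain Python lists. The name `adagrad_step`, the small constant `eps` that guards against division by zero, and the example gradients are assumptions for illustration.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update; accum holds each parameter's sum of squared gradients."""
    new_params, new_accum = [], []
    for p, g, s in zip(params, grads, accum):
        s = s + g * g  # accumulate the squared gradient
        new_params.append(p - lr * g / (math.sqrt(s) + eps))
        new_accum.append(s)
    return new_params, new_accum


# Two parameters with very different gradient scales (illustrative values).
params0 = [1.0, 1.0]
grads = [10.0, 0.1]
params1, accum1 = adagrad_step(params0, grads, accum=[0.0, 0.0])
params2, accum2 = adagrad_step(params1, grads, accum1)
```

On the first step both parameters move by about `lr`, despite a 100x difference in gradient size. On the second step the accumulated sum has doubled, so each move shrinks by a factor of about the square root of 2.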
The determinant tells you about volume scaling under a linear transformation
The absolute value of the determinant of a matrix representing a linear transformation is the factor by which volumes are scaled; a negative determinant means the transformation also reverses orientation
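The 2D case can be checked directly: transform the unit square and compare its new area with the determinant. The helper names and the example matrices below are made up for this sketch.

```python
def det2(matrix):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = matrix
    return a * d - b * c


def apply(matrix, point):
    """Apply a 2x2 matrix to a 2D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


def polygon_area(points):
    """Unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # area 1

shear_scale = [[2, 1], [0, 3]]                  # illustrative matrix
image = [apply(shear_scale, p) for p in unit_square]
scaled_area = polygon_area(image)

swap_axes = [[0, 1], [1, 0]]                    # reflection across y = x
reflected_area = polygon_area([apply(swap_axes, p) for p in unit_square])
```

Here `det2(shear_scale)` and `scaled_area` are both 6. For `swap_axes` the determinant is -1 while the area stays 1, which is why the scaling factor is the absolute value.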