
LoRA: Augments model weights with low-rank matrices A, B, ΔW = BA
LoRA: Augments model weights with low-rank matrices A, B, ΔW = BA
What weight tying does in language models: shares embedding and output projection matrices
Language models use tied weights to share embedding and output projection matrices, enhancing parameter efficiency
What AWQ does differently — activation-aware weight quantization preserves important weights
AWQ quantizes weights while preserving critical activation values for neural network efficiency
What the rank-nullity theorem says: rank(A) + nullity(A) = n for an m×n matrix
Rank-nullity theorem: Rank(A) + Nullity(A) = Number of columns (n) in A
How tiling works in matrix multiplication — loading blocks into shared memory
Tiling in matrix multiplication optimizes cache usage by partitioning matrices into submatrices
What LSM trees optimize: write-heavy workloads by buffering writes in memory
LSM trees optimize write-heavy workloads through in-memory buffering
What mixup does: trains on convex combinations of pairs: x̃=λx_i+(1-λ)x_j
Trains on convex combinations of pairs: x̃=λx_i+(1-λ)x_j represents a weighted average of two points in a convex set
One email a day: 5 concepts + the 5 stories that matter →
Swipe through 100 ML concepts daily
Open TickerNews