Fine-tunes on (instruction, response) pairs
instruction-level parallelism (ILP) achieves: multiple operations per clock cycle
Instruction-level parallelism (ILP) executes multiple operations per clock cycle by overlapping independent instructions
gradient checkpointing trades: recomputes activations to save memory
Gradient checkpointing trades off computation time for memory savings by recomputing activations
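The trade can be sketched in plain Python. This is a toy illustration, not any framework's implementation: the chain of scalar `tanh` layers and the function names `grad_store_all` and `grad_checkpointed` are assumptions made for the example. The first function keeps every activation for the backward pass; the second keeps only every `every`-th one and recomputes the rest segment by segment.

```python
import math

def layer(x):
    # Toy layer: a scalar tanh (an assumption made for illustration).
    return math.tanh(x)

def layer_grad(x):
    # Derivative of tanh evaluated at the layer's input x.
    return 1.0 - math.tanh(x) ** 2

def grad_store_all(x, n_layers):
    # Forward pass keeps the input of every layer: n_layers stored values.
    inputs = [x]
    for _ in range(n_layers - 1):
        inputs.append(layer(inputs[-1]))
    grad = 1.0
    for inp in reversed(inputs):  # chain rule, last layer first
        grad *= layer_grad(inp)
    return grad

def grad_checkpointed(x, n_layers, every):
    # Forward pass keeps only the input of every `every`-th layer.
    checkpoints = {}
    h = x
    for i in range(n_layers):
        if i % every == 0:
            checkpoints[i] = h
        h = layer(h)
    grad = 1.0
    # Backward pass: walk the segments last to first, recomputing the
    # activations inside each segment from its checkpoint.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + every, n_layers)
        segment_inputs = [checkpoints[start]]
        for _ in range(start, end - 1):
            segment_inputs.append(layer(segment_inputs[-1]))
        for inp in reversed(segment_inputs):
            grad *= layer_grad(inp)
    return grad

full = grad_store_all(0.5, 10)
saved = grad_checkpointed(0.5, 10, every=3)
```

Both functions return the same gradient. The checkpointed version holds roughly `n_layers / every + every` values at a time instead of `n_layers`, and pays for it by running most layers forward a second time.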
loop unrolling does: trades code size for reduced loop overhead
Loop unrolling replicates the loop body so each pass handles several iterations, reducing loop overhead (branches and counter updates) at the cost of larger code
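A hand-unrolled loop shows the shape of the transformation. Compilers normally apply it to machine code, so this Python version only illustrates the structure and is not meant as a speed-up; the function names and the unroll factor of 4 are assumptions chosen for the example.

```python
def sum_rolled(xs):
    total = 0
    for i in range(len(xs)):  # one loop test and index update per element
        total += xs[i]
    return total

def sum_unrolled_by_4(xs):
    total = 0
    n = len(xs)
    limit = n - n % 4  # largest multiple of 4 that fits in n
    for i in range(0, limit, 4):  # one loop test and index update per 4 elements
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
    for i in range(limit, n):  # remainder loop for the leftover 0 to 3 elements
        total += xs[i]
    return total

data = list(range(10))
rolled = sum_rolled(data)
unrolled = sum_unrolled_by_4(data)
```

The unrolled body is four times longer, which is the code-size cost, and the remainder loop is needed whenever the length is not a multiple of the unroll factor.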
to standardize: when you need zero mean and unit variance for gradient-based optimization
Standardize when zero mean and unit variance are required for gradient-based optimization
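As a minimal sketch, standardization subtracts the mean and divides by the standard deviation. The helper name `standardize` and the choice of the population standard deviation are assumptions for this example; libraries differ on that choice.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) variance."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / std for v in values]

scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

In a training pipeline the mean and standard deviation would usually be computed on the training split only and then reused for validation and test data.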
classifier-free guidance does: interpolates between conditional and unconditional generation
Classifier-free guidance combines a model's conditional and unconditional predictions, with a guidance scale controlling how strongly generation is pushed toward the condition
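One common way to write the combination is `uncond + scale * (cond - uncond)`, sketched below on plain lists. The function name `cfg_combine` is hypothetical, and some papers parameterize the scale differently, so treat the exact convention as an assumption.

```python
def cfg_combine(pred_uncond, pred_cond, guidance_scale):
    """Blend unconditional and conditional predictions element-wise.

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    guidance_scale > 1 extrapolates past the conditional prediction.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]

uncond = [0.0, 1.0]
cond = [1.0, 3.0]
guided = cfg_combine(uncond, cond, guidance_scale=2.0)
```

In a diffusion sampler the two inputs would come from the same network, evaluated once with the conditioning signal and once with it dropped.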
operator fusion does at the compiler level: merges adjacent ops to reduce memory traffic
Operator fusion merges adjacent operations into one, so intermediate results do not have to be written to memory and read back
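The effect can be imitated by hand in Python, assuming a scale followed by a ReLU as the two adjacent operations. A compiler would perform this rewrite on its own intermediate representation; the function names here are made up for the example.

```python
def scale_then_relu_unfused(xs, a):
    scaled = [a * x for x in xs]          # first op writes an intermediate buffer
    return [max(s, 0.0) for s in scaled]  # second op reads that buffer back

def scale_then_relu_fused(xs, a):
    # Both ops applied per element in a single pass, no intermediate buffer.
    return [max(a * x, 0.0) for x in xs]

inputs = [-1.0, 0.5, 2.0]
unfused_out = scale_then_relu_unfused(inputs, 2.0)
fused_out = scale_then_relu_fused(inputs, 2.0)
```

The results are identical; the fused version touches each element once instead of twice, which is where the reduction in memory traffic comes from.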