gradient checkpointing trades: recomputes activations to save memory

Gradient checkpointing trades off computation time for memory savings by recomputing activations

Image: Aditya Suseno, CC0, via Wikimedia Commons

gradient checkpointing trades: recomputes activations to save memory

Gradient checkpointing trades off computation time for memory savings by recomputing activations

Related concepts

One email a day: 5 concepts + the 5 stories that matter →

Swipe through 100 ML concepts daily

Open TickerNews