
Register pressure: Excessive variables per thread lead to reduced occupancy and potential performance bottlenecks
Image: Da mocavi, CC BY-SA 4.0, via Wikimedia Commons
Von Neumann architecture
CPU must fetch both data and instructions from memory
Memory hierarchy
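Memory hierarchy levels: registers → L1 → L2 → L3 → RAM → SSD → HDD (each level is larger but slower than the one before it; the slowdown per step ranges from a few times to several orders of magnitude)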
Thread block (CUDA programming)
Thread blocks can contain up to 1024 threads as of March 2010
Glossary of poker terms
Context window limit: maximum tokens model processes simultaneously
Occupancy (GPU programming)
Occupancy = Active Warps per SM / Max Warps per SM
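The occupancy ratio can be tied back to register pressure with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the hardware limits (65,536 registers and 64 resident warps per SM) are assumed example values rather than figures from this page, and it ignores register allocation granularity, shared memory, and block-size limits, all of which can lower real occupancy further.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def register_limited_warps(regs_per_thread, regs_per_sm, max_warps):
    """Warps that fit on one SM when registers are the only constraint.

    Simplified model: real hardware allocates registers in chunks and
    also limits warps by shared memory and block count.
    """
    warps_by_regs = regs_per_sm // (regs_per_thread * WARP_SIZE)
    return min(warps_by_regs, max_warps)


def occupancy(active_warps, max_warps):
    """Occupancy = active warps / maximum warps, as a fraction in [0, 1]."""
    return active_warps / max_warps


# Assumed example limits; check the specification of the actual GPU.
REGS_PER_SM = 65536
MAX_WARPS_PER_SM = 64

for regs in (32, 64, 128):
    active = register_limited_warps(regs, REGS_PER_SM, MAX_WARPS_PER_SM)
    print(f"{regs:>3} regs/thread -> {active:>2} warps, "
          f"occupancy {occupancy(active, MAX_WARPS_PER_SM):.2f}")
```

Under these assumed limits, doubling the registers used per thread halves the number of warps that can be resident, which is the effect the register pressure card describes.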
Arithmetic intensity
Arithmetic intensity = FLOPs / Bytes accessed
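To make the ratio concrete, here is a minimal sketch that applies it to two common kernels. The helper names are hypothetical, the element size is assumed to be 4 bytes (float32), and the matrix multiply count assumes ideal caching, where each matrix crosses the memory bus exactly once; real kernels usually move more data than that.

```python
def arithmetic_intensity(flops, bytes_accessed):
    """Arithmetic intensity = FLOPs / bytes accessed (FLOPs per byte)."""
    return flops / bytes_accessed


def saxpy_intensity(n, bytes_per_elem=4):
    """y = a*x + y over n elements: one multiply and one add each."""
    flops = 2 * n
    # Read x, read y, write y.
    bytes_accessed = 3 * n * bytes_per_elem
    return arithmetic_intensity(flops, bytes_accessed)


def matmul_intensity(n, bytes_per_elem=4):
    """C = A @ B for n x n matrices, assuming ideal data reuse."""
    flops = 2 * n ** 3
    # Read A, read B, write C, each exactly once (optimistic assumption).
    bytes_accessed = 3 * n * n * bytes_per_elem
    return arithmetic_intensity(flops, bytes_accessed)


print(f"SAXPY:            {saxpy_intensity(1_000_000):.3f} FLOPs/byte")
print(f"Matmul (n=1024):  {matmul_intensity(1024):.1f} FLOPs/byte")
```

SAXPY stays at 1/6 FLOP per byte regardless of size, so it tends to be limited by memory bandwidth, while the matrix multiply's intensity grows with n under the ideal-reuse assumption.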