SASS: compiled machine code executing on NVIDIA GPU hardware
nvcc: NVIDIA's CUDA compiler, which compiles CUDA code to PTX and SASS
Triton kernel: a GPU kernel written in Triton's Python-based language, which compiles to PTX on NVIDIA GPUs
Tensor cores: specialized GPU hardware units for matrix multiply-accumulate operations
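To make "matrix multiply-accumulate" concrete, here is a minimal pure-Python sketch of the operation D = A x B + C on a small tile. It only illustrates the arithmetic. The function name `mma` and the 4x4 tile size are assumptions chosen for the example, and real tensor cores perform this in hardware, typically with mixed precision.

```python
def mma(a, b, c):
    """Return D = A x B + C for matrices given as lists of lists (illustrative only)."""
    m, k, n = len(a), len(b), len(b[0])
    # Check that the shapes are compatible: A is m x k, B is k x n, C is m x n.
    if any(len(row) != k for row in a):
        raise ValueError("A must have as many columns as B has rows")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("C must have the same shape as A x B")
    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = c[i][j]  # start from the accumulator value
            for p in range(k):
                acc += a[i][p] * b[p][j]  # multiply, then accumulate
            d[i][j] = acc
    return d


# Hypothetical 4x4 tile: multiplying by the identity and adding ones
# should give back the tile with 1 added to every element.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
tile = [[float(4 * i + j) for j in range(4)] for i in range(4)]
result = mma(tile, identity, ones)
```

The inner loop is the fused "multiply, then add to a running sum" step that the hardware executes for a whole tile at once.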
Arm architecture family: a family of RISC instruction set architectures (ISAs), and the most widely used ISA family
Parallel Thread Execution (PTX): an intermediate GPU instruction set used in NVIDIA's CUDA
XLA: a compiler for TensorFlow and JAX that compiles computation graphs for TPU/GPU execution