PCIe 5.0 x16 peaks at ~64 GB/s per direction, a bottleneck for CPU-GPU transfers
Image: АрміяІнформ, CC BY 4.0, via Wikimedia Commons
NVLink provides a high-bandwidth GPU-to-GPU interconnect (900 GB/s per GPU on H100)
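To put the two link speeds side by side, here is a minimal sketch that turns the quoted peak figures into idealized transfer times. It takes the numbers at face value: decimal units, peak rates sustained for the whole transfer, and no latency or protocol overhead. The 10 GB payload and the helper name `transfer_ms` are hypothetical, chosen only for illustration. The 900 GB/s NVLink figure is an aggregate per-GPU number, so real point-to-point transfers may see less.

```python
# Back-of-the-envelope transfer times from the peak figures quoted above.
# Assumptions: 1 GB = 1e9 bytes, peak rates are sustained, no overhead.

PCIE5_X16_GB_PER_S = 64.0     # ~64 GB/s, CPU-GPU over PCIe 5.0 x16
NVLINK_H100_GB_PER_S = 900.0  # 900 GB/s, GPU-GPU over NVLink on H100


def transfer_ms(size_gb: float, link_gb_per_s: float) -> float:
    """Idealized time in milliseconds to move size_gb over a link."""
    return size_gb / link_gb_per_s * 1000.0


payload_gb = 10.0  # hypothetical payload size
pcie_ms = transfer_ms(payload_gb, PCIE5_X16_GB_PER_S)
nvlink_ms = transfer_ms(payload_gb, NVLINK_H100_GB_PER_S)

print(f"PCIe 5.0 x16:  {pcie_ms:.2f} ms")            # 156.25 ms
print(f"NVLink (H100): {nvlink_ms:.2f} ms")          # 11.11 ms
print(f"Ratio:         {pcie_ms / nvlink_ms:.1f}x")  # 14.1x
```

Under these assumptions the same payload takes roughly 14 times longer over PCIe than over NVLink, which is why keeping data on the GPUs and avoiding round trips through host memory tends to pay off.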
NVIDIA A100 features: 80 GB HBM2e, 2 TB/s memory bandwidth, 312 TFLOPS FP16 (dense Tensor Core)
NVIDIA H100 features: 80 GB HBM3, 3.35 TB/s memory bandwidth, 990 TFLOPS FP16 (dense Tensor Core)
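One way to read the A100 and H100 figures together is the ridge point of a simple roofline model: peak compute divided by peak memory bandwidth. This gives the number of FP16 operations a kernel would need to perform per byte read from memory before compute, rather than memory, becomes the limit. The sketch below uses only the numbers quoted above. The roofline framing and the names `GPUS` and `ridge_point` are assumptions added for illustration, not part of the original cards.

```python
# Roofline ridge point: peak FLOP/s divided by peak memory bytes/s.
# Inputs are the peak figures quoted above; real kernels rarely reach peak.

GPUS = {
    "A100": {"fp16_tflops": 312.0, "mem_tb_per_s": 2.0},
    "H100": {"fp16_tflops": 990.0, "mem_tb_per_s": 3.35},
}


def ridge_point(fp16_tflops: float, mem_tb_per_s: float) -> float:
    """FLOPs per byte needed to become compute-bound (the tera prefixes cancel)."""
    return fp16_tflops / mem_tb_per_s


for name, spec in GPUS.items():
    intensity = ridge_point(spec["fp16_tflops"], spec["mem_tb_per_s"])
    print(f"{name}: {intensity:.1f} FLOP/byte")
# A100: 156.0 FLOP/byte
# H100: 295.5 FLOP/byte
```

On these figures the H100 needs nearly twice the arithmetic intensity of the A100 to stay compute-bound, since its peak compute grew faster than its memory bandwidth.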
Von Neumann architecture
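The CPU must fetch both instructions and data from the same memory over a shared path, so memory bandwidth limits throughput (the von Neumann bottleneck)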
2024–present global memory supply shortage
Global DRAM shortage began in 2024
High Bandwidth Memory (HBM) provides stacked DRAM with much higher bandwidth than DDR