NVIDIA A100 features: 80 GB HBM2e, ~2 TB/s memory bandwidth, 312 TFLOPS FP16 (dense Tensor Core)
Image: jnftech, CC BY-SA 4.0, via Wikimedia Commons
NVIDIA H100 features: 80 GB HBM3, 3.35 TB/s memory bandwidth, ~990 TFLOPS FP16 (dense Tensor Core)
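One way to read the A100 and H100 figures together is the ratio of peak compute to memory bandwidth, often called the ridge point in a roofline model: the number of FLOPs a kernel must perform per byte moved before compute, rather than memory, becomes the limit. The sketch below is only illustrative arithmetic on the peak numbers quoted above; the names `GPUS`, `ridge_point`, and `RIDGE` are made up for this example, and real kernels rarely reach peak figures.

```python
# Peak figures as quoted in the text (FP16 TFLOPS, memory bandwidth in TB/s).
# These are vendor peak numbers, not measured throughput.
GPUS = {
    "A100": {"tflops_fp16": 312.0, "bandwidth_tbs": 2.0},
    "H100": {"tflops_fp16": 990.0, "bandwidth_tbs": 3.35},
}


def ridge_point(tflops: float, bandwidth_tbs: float) -> float:
    """Return FLOPs per byte at which peak compute and peak bandwidth balance."""
    flops_per_second = tflops * 1e12
    bytes_per_second = bandwidth_tbs * 1e12
    return flops_per_second / bytes_per_second


# Hypothetical helper table: ridge point per GPU.
RIDGE = {
    name: ridge_point(spec["tflops_fp16"], spec["bandwidth_tbs"])
    for name, spec in GPUS.items()
}

for name, value in RIDGE.items():
    print(f"{name}: ~{value:.0f} FLOPs per byte to become compute-bound")
```

On these numbers the H100 needs roughly twice the arithmetic intensity of the A100 to stay compute-bound, because its peak compute grew faster than its memory bandwidth.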
NVLink provides a high-bandwidth GPU-to-GPU interconnect (900 GB/s total per GPU on H100)
PCIe 5.0 x16 tops out at ~64 GB/s per direction, making it a bottleneck for CPU-GPU transfers
High Bandwidth Memory (HBM) is stacked DRAM with much higher bandwidth than DDR
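To make the gap between these links concrete, the sketch below estimates how long an idealized transfer would take over each one, using only the bandwidth figures quoted above. The 80 GB payload and the names `LINKS_GB_PER_S`, `transfer_seconds`, and `TIMES` are assumptions chosen for illustration; the estimate ignores latency, protocol overhead, and contention, so real transfers will be slower.

```python
# Peak bandwidths in GB/s, taken from the figures quoted in the text.
LINKS_GB_PER_S = {
    "PCIe 5.0 x16": 64.0,
    "NVLink (H100)": 900.0,
    "HBM3 (H100)": 3350.0,
}


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized transfer time: payload size divided by peak bandwidth."""
    return size_gb / bandwidth_gb_per_s


# Assumed payload: 80 GB, the full memory capacity of one GPU.
PAYLOAD_GB = 80.0

TIMES = {
    name: transfer_seconds(PAYLOAD_GB, bandwidth)
    for name, bandwidth in LINKS_GB_PER_S.items()
}

for name, seconds in TIMES.items():
    print(f"{name}: {seconds * 1000:.1f} ms to move {PAYLOAD_GB:.0f} GB")
```

Even in this best case, PCIe is more than an order of magnitude slower than NVLink, which is why keeping data resident on the GPU matters.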
2024–present global memory supply shortage
Global DRAM shortage began in 2024
CPU cache
An L1/L2 cache hierarchy reduces the effective latency of global memory accesses
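The effect of a cache hierarchy can be sketched with the standard average memory access time formula, where each level's miss penalty is the cost of going to the next level. Every number below (hit rates and cycle counts) is a made-up placeholder rather than a measurement of any real GPU, and `average_access_cycles` is a hypothetical helper written for this example.

```python
def average_access_cycles(
    l1_hit_rate: float,
    l1_cycles: float,
    l2_hit_rate: float,
    l2_cycles: float,
    memory_cycles: float,
) -> float:
    """Average access time for a two-level cache in front of global memory.

    An access always pays the L1 lookup; L1 misses pay the L2 lookup;
    L2 misses additionally pay the global memory latency.
    """
    l1_miss_rate = 1.0 - l1_hit_rate
    l2_miss_rate = 1.0 - l2_hit_rate
    return l1_cycles + l1_miss_rate * (l2_cycles + l2_miss_rate * memory_cycles)


# Placeholder values for illustration only (not real hardware figures).
MEMORY_CYCLES = 500.0
with_caches = average_access_cycles(
    l1_hit_rate=0.8,
    l1_cycles=30.0,
    l2_hit_rate=0.5,
    l2_cycles=200.0,
    memory_cycles=MEMORY_CYCLES,
)

print(f"every access to global memory: {MEMORY_CYCLES:.0f} cycles")
print(f"with L1/L2 caches:             {with_caches:.0f} cycles on average")
```

The average drops well below the raw memory latency as long as a reasonable share of accesses hit in L1 or L2; with low hit rates the extra lookups can cost more than they save.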