Containers share the host kernel and are lighter; VMs run a full guest OS for stronger isolation
Image: Gregor Hartl, CC BY-SA 4.0, via Wikimedia Commons
Database sharding splits data across machines by a partition key
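As a minimal sketch of the idea, the snippet below routes each record to one of several in-memory "shards" by hashing its partition key. The names `shard_for`, `put`, and `get`, the use of SHA-256, and the fixed count of four shards are illustrative assumptions; real systems often use range partitioning or consistent hashing instead.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a partition key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Hypothetical cluster: each dict stands in for one machine's storage.
shards = [dict() for _ in range(4)]

def put(key: str, value) -> None:
    """Store the value on the shard that owns this key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    """Look up the value on the one shard that can hold this key."""
    return shards[shard_for(key, len(shards))].get(key)
```

Because the hash is deterministic, a read only has to contact the single shard that owns the key. The trade-off of plain modulo hashing is that changing the shard count remaps most keys.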
Load balancing (computing)
Load balancing distributes tasks across a set of resources so that no single resource is overloaded
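Two common policies can be sketched in a few lines: round-robin, which cycles through backends in order, and least-loaded, which picks the backend with the fewest in-flight tasks. The class names and the string backend labels are hypothetical, and this sketch ignores health checks and weights.

```python
import itertools

class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

class LeastLoadedBalancer:
    """Pick the backend with the fewest tasks currently in flight."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # Ties go to the backend listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1
```

Round-robin needs no state about the backends, while least-loaded adapts when some tasks take much longer than others.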
ONNX format standardizes model representation for cross-framework deployment
Paged attention (vLLM) improves serving throughput by storing the KV cache in non-contiguous fixed-size pages, which reduces memory fragmentation and lets more requests be batched together
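The bookkeeping behind non-contiguous KV-cache pages can be sketched as a block table per sequence, much like virtual-memory paging. This is a toy allocator, not vLLM's implementation: `PagedKVCache`, its methods, and the block sizes are hypothetical, and no actual key/value tensors are stored.

```python
class PagedKVCache:
    """Toy allocator: maps each sequence's tokens to fixed-size physical blocks."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.lengths = {}       # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a slot for one more token; return (block id, offset in block)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length == len(table) * self.block_size:
            # Every allocated block is full, so grab any free block.
            if not self.free_blocks:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1
        return table[length // self.block_size], length % self.block_size

    def free(self, seq_id):
        """Return all of a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

Since blocks are allocated only as tokens arrive, a sequence wastes at most one partially filled block, rather than a contiguous region reserved for its maximum possible length.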
LSM trees optimize write-heavy workloads by buffering writes in memory and flushing them to disk as sorted, immutable files
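A minimal sketch of the write path, assuming an in-memory buffer (the memtable) that is flushed to a sorted, immutable run once it reaches a size limit. `TinyLSM` and `memtable_limit` are hypothetical names; Python lists stand in for on-disk files, and compaction, write-ahead logging, and deletes are left out.

```python
import bisect

class TinyLSM:
    """Toy LSM tree: a memtable plus a list of sorted, immutable runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.sstables = []  # oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch memory, which keeps them cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sequential write of a sorted run replaces many random writes.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest run first so the latest value for a key wins.
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None
```

The cost moves to reads, which may need to check several runs; real LSM stores bound that cost with background compaction and Bloom filters.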
Fused kernels combine multiple operations into one kernel to avoid memory round-trips
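The effect can be illustrated conceptually in plain Python, with lists standing in for arrays in GPU global memory. The functions `unfused` and `fused` and the scale, bias, ReLU pipeline are illustrative assumptions; an actual fused kernel would be written in CUDA or generated by a compiler.

```python
def unfused(x, scale, bias):
    """Three separate passes, each materializing a full intermediate array."""
    scaled = [v * scale for v in x]         # pass 1: write intermediate
    shifted = [v + bias for v in scaled]    # pass 2: read it back, write another
    return [max(v, 0.0) for v in shifted]   # pass 3: read again, write result

def fused(x, scale, bias):
    """One pass: each element is read once and written once."""
    return [max(v * scale + bias, 0.0) for v in x]
```

Both functions compute the same result, but the fused version never stores `scaled` or `shifted`, which on a GPU corresponds to skipping two round-trips to global memory and two kernel launches.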