What a thread block is in CUDA — a group of threads that share shared memory

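A CUDA thread block is a group of threads (at most 1,024 on current GPUs) that run on the same streaming multiprocessor, can synchronize with one another at a barrier, and share a region of fast on-chip shared memory. Global memory, by contrast, is visible to every thread in the grid, not only to the threads of one block.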
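As a minimal sketch of block-level cooperation, the kernel below lets each block sum its own slice of an array through shared memory. The names `blockSum`, `blockSums`, and `kBlockSize` are hypothetical, the block size of 256 is an assumption, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlockSize = 256;  // threads per block (assumed; must be a power of two here)

// Each block sums its own slice of `in` and writes one partial sum to `out`.
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float scratch[kBlockSize];     // visible only to the threads of this block

    int tid = threadIdx.x;                    // index within the block
    int gid = blockIdx.x * blockDim.x + tid;  // index within the whole grid

    scratch[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();                          // barrier: every thread has stored its value

    // Tree reduction in shared memory: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) scratch[tid] += scratch[tid + stride];
        __syncthreads();                      // reached by all threads, active or not
    }

    if (tid == 0) out[blockIdx.x] = scratch[0];
}

// Host helper (hypothetical): returns one partial sum per block.
std::vector<float> blockSums(const std::vector<float>& hostIn) {
    int n = static_cast<int>(hostIn.size());
    int blocks = (n + kBlockSize - 1) / kBlockSize;
    std::vector<float> hostOut(blocks, 0.0f);
    if (n == 0) return hostOut;

    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, hostIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, kBlockSize>>>(dIn, dOut, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(hostOut.data(), dOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return hostOut;
}
```

Threads in different blocks never touch each other's `scratch` array, which is why the host receives one partial sum per block rather than a single total.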

Related concepts

What a CUDA kernel is — a function that runs on thousands of GPU threads in parallel

A CUDA kernel is a function that is launched from the host and executed in parallel by many GPU threads, each running the same code on its own portion of the data.

What cooperative groups enable in CUDA: flexible thread synchronization patterns

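Cooperative groups is a CUDA programming-model extension that lets a kernel define groups of threads explicitly, from sub-warp tiles to a whole thread block or an entire grid, and synchronize or run collective operations within each group.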

What bank conflicts are in shared memory — multiple threads accessing the same bank

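A bank conflict occurs when several threads of a warp access different addresses that map to the same shared-memory bank in a single instruction. The hardware serializes those accesses, which lowers shared-memory throughput. Threads that read the same address are served by a broadcast and do not conflict.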

How lock-free data structures manage concurrent access: atomic operations instead of locks

Lock-free data structures coordinate concurrent access with atomic operations such as compare-and-swap instead of locks, so at least one thread always makes progress even when others are delayed.

Why memory coalescing matters — adjacent threads reading adjacent memory addresses

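When the threads of a warp access adjacent global-memory addresses, the hardware combines those accesses into a small number of wide memory transactions. Scattered accesses need many more transactions and waste memory bandwidth.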

How tiling works in matrix multiplication — loading blocks into shared memory

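Tiling partitions the matrices into small submatrices (tiles). Each thread block loads one tile of each input matrix into shared memory, and its threads reuse those values many times, which cuts the number of global-memory reads.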
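The tiling idea can be sketched as a kernel in which each thread block computes one tile of the output and stages the matching input tiles in shared memory. This is an illustrative sketch, not a tuned implementation: the names `matmulTiled`, `matmul`, and `kTile` are hypothetical, the matrices are assumed square and row-major, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

constexpr int kTile = 16;  // tile edge length (assumed)

// C = A * B for square n x n row-major matrices.
__global__ void matmulTiled(const float* A, const float* B, float* C, int n) {
    __shared__ float tileA[kTile][kTile];
    __shared__ float tileB[kTile][kTile];

    int row = blockIdx.y * kTile + threadIdx.y;  // output row this thread computes
    int col = blockIdx.x * kTile + threadIdx.x;  // output column this thread computes
    float acc = 0.0f;

    int numTiles = (n + kTile - 1) / kTile;
    for (int t = 0; t < numTiles; ++t) {
        int aCol = t * kTile + threadIdx.x;
        int bRow = t * kTile + threadIdx.y;

        // Threads with consecutive threadIdx.x read consecutive addresses,
        // so these global loads can coalesce. Out-of-range entries are padded with 0.
        tileA[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        tileB[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // both tiles are fully loaded

        // Each value loaded above is reused by kTile threads of the block.
        for (int k = 0; k < kTile; ++k)
            acc += tileA[threadIdx.y][k] * tileB[k][threadIdx.x];
        __syncthreads();  // finish reading before the next iteration overwrites the tiles
    }

    if (row < n && col < n) C[row * n + col] = acc;
}

// Host helper (hypothetical): copies inputs to the GPU, runs the kernel, returns C.
std::vector<float> matmul(const std::vector<float>& A, const std::vector<float>& B, int n) {
    std::size_t count = static_cast<std::size_t>(n) * n;
    std::size_t bytes = count * sizeof(float);
    std::vector<float> C(count, 0.0f);
    if (n == 0) return C;

    float* dA = nullptr;
    float* dB = nullptr;
    float* dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(kTile, kTile);
    dim3 grid((n + kTile - 1) / kTile, (n + kTile - 1) / kTile);
    matmulTiled<<<grid, block>>>(dA, dB, dC, n);

    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

With a 16 x 16 tile, each element of `A` and `B` is fetched from global memory once per block and then read 16 times from shared memory, which is where the saving comes from.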
