Across
- 3. This occurs when threads in a warp take different conditional branches, reducing efficiency. (4, 10)
- 9. The primary execution units inside a GPU, containing hundreds of cores. (9, 15)
Down
- 1. A memory space within an SM that acts as a low-latency scratchpad for inter-thread communication. (6, 6)
- 2. The type of parallelism where the same operation is applied to multiple data elements simultaneously. (4, 8)
- 4. The programming model that uses terms like "work-items" and "work-groups" and is vendor-neutral. (6)
- 5. Specialized hardware units designed to accelerate matrix-multiply-and-accumulate operations. (6, 5)
- 6. The high-speed interconnect developed by NVIDIA for multi-GPU systems, faster than PCIe. (6)
- 7. The architecture that presents a single shared virtual memory address space for both CPU and GPU. (7, 6)
- 8. A group of 32 threads that execute the same instruction in the SIMT model. (4)
- 10. The acronym for the execution model where all threads in a warp execute the same instruction. (4)
