Learn a natural way to program multi-GPU systems. The super-GPU programming concept extends the NVIDIA® CUDA® tiling hierarchy to multi-GPU systems, letting you write super-kernels that run on super-GPUs. Tiling simplifies the challenging problem of programming a multi-GPU system, which otherwise requires coordinating multiple kernels running on nodes connected via a heterogeneous network. We'll illustrate the super-GPU programming model on several applications and benchmarks, including SpMV, Integer Sort, Transpose, FFT, GEMM, and RTM. With a super-GPU, SpMV achieves super-linear speedup thanks to better utilization of the L2 caches of several GPUs. For Sort, FFT, and GEMM, the speedup is close to linear. Multi-GPU Transpose attains the limit imposed by the interconnect network.