CUDA offers much finer-grained control over parallel execution than high-level GPU programming models such as OpenMP, and this control helps to optimise performance. This five-day online course by HLRS gives an introduction to CUDA, the programming language used to write fast numerical algorithms for NVIDIA GPUs. Emphasis is placed on basic use of the language and on efficient use of the hardware to maximise performance.

Read more and register here.