GPU Hack-a-thon 2017
Programming GPUs with OpenACC
The introductory talk by Dr Saber Feki can be downloaded here.
The results of the NVIDIA Workshop GPU Hackathon 2017 can be found here.
A few tips...
Connection
How to connect to Buzzard:
ssh buzzard.hpc.kaust.edu.sa
Before you start on Buzzard, load the following module:
module load hackathon/2017
or set up the same environment by hand:
export PATH=$PATH:/usr/local/cuda/bin/
export PATH=/sw/cs/pgi/linux86-64/16.1/mpi/openmpi-1.10.1/bin/:$PATH
export PATH=$PATH:/sw/cs/pgi/linux86-64/16.1/bin/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/sw/cs/pgi/linux86-64/16.1/lib/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/sw/cs/pgi/linux86-64/16.1/mpi/openmpi-1.10.1/lib/
export PATH=$PATH:/opt/allinea/forge/bin/
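To confirm that the environment is in place, a quick check along these lines can help. This is only a sketch: the function name check_tools is hypothetical, and the tool names (pgf90, nvcc, mpirun) are assumed from the directories exported above.

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool names; adjust the list to what you actually need.
check_tools pgf90 nvcc mpirun || echo "Load hackathon/2017 or export the paths by hand."
```

If anything is reported as missing, reload the module or re-run the exports in the same shell session.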
How to connect to the IT P100 machine:
ssh noor-gpu
or
ssh 10.68.209.37
Export the following variables on the P100 machine:
export PATH=$PATH:/usr/local/cuda-8.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/lib/
Compilation
Example:
pgf90 -O2 -ta=tesla -acc -Minfo=accel ...
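For a concrete target to try the compile line on, the following sketch writes a small OpenACC Fortran program and compiles it if pgf90 is available. The file name saxpy.f90 and the program itself are illustrative and are not part of the hackathon material.

```shell
# Write a minimal OpenACC test program (illustrative, not from the hackathon).
cat > saxpy.f90 <<'EOF'
program saxpy
  implicit none
  integer, parameter :: n = 100000
  real :: x(n), y(n), a
  integer :: i
  a = 2.0
  x = 1.0
  y = 3.0
  ! Offload the loop: x is only read, y is read and written back.
  !$acc parallel loop copyin(x) copy(y)
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do
  !$acc end parallel loop
  print *, 'y(1) =', y(1)
end program saxpy
EOF

# Compile and run only if the PGI compiler is on PATH.
if command -v pgf90 >/dev/null 2>&1; then
    pgf90 -O2 -ta=tesla -acc -Minfo=accel -o saxpy saxpy.f90 && ./saxpy
else
    echo "pgf90 not found: load hackathon/2017 first"
fi
```

With -Minfo=accel the compiler should report how it mapped the loop to the GPU, and every element of y should come out as 2.0 * 1.0 + 3.0 = 5.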
Options: -ta=tesla:lineinfo -Minfo=all,intensity
lineinfo
Adds source line information to the GPU code, so that memory problems can be traced back to the lines of your code where they occur.
Minfo=all
Prints all compiler messages (including those about acceleration), for example:
Loop not vectorized: loop count too small
Loop unrolled 6 times (completely unrolled)
Minfo=intensity
Reports the computational intensity of every loop, defined as (compute operations / memory operations). As a rule of thumb, a loop with an intensity greater than or equal to 1.0 is a good candidate for the GPU; a loop with a lower intensity is not.
For example:
210, Possible copy in and copy out of q in call to coef_df4_premiere
Intensity = 0.50
442, Intensity = 1.67
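On a large code the intensity report can be long, so it may help to filter it for loops that meet the 1.0 rule of thumb. The sketch below assumes the message layout shown in the sample above (a source line number followed by a comma, with the intensity either on the same line or on a following line); the function name gpu_candidates is hypothetical.

```shell
# Hypothetical helper: read -Minfo=intensity messages on stdin and print
# "<source line> <intensity>" for every loop with intensity >= 1.0.
gpu_candidates() {
    awk '
        # Remember the most recent source line number ("210," -> 210).
        /^[ \t]*[0-9]+,/ { line = $1; sub(/,/, "", line) }
        # Pick the number that follows "Intensity =".
        /Intensity = / {
            for (i = 1; i <= NF; i++)
                if ($i == "Intensity") value = $(i + 2)
            if (value + 0 >= 1.0) print line, value
        }
    '
}

# Demonstration on the sample messages from this page.
gpu_candidates <<'EOF'
210, Possible copy in and copy out of q in call to coef_df4_premiere
Intensity = 0.50
442, Intensity = 1.67
EOF
# prints: 442 1.67
```

To apply it to a real build, pipe the compiler messages into the function, for example by appending `2>&1 | gpu_candidates` to the compile command; this assumes the messages are written to standard error.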
Profiling
Compile your code for the CPU.
Execute:
nvprof --cpu-profiling on ./executable
Profiling instructions
Use the nvvp tool for a graphical interface.
Profiling with Allinea
Allinea profiling tool: /opt/allinea/forge/
Execution
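Select which GPU(s) your program may use, where X is a device index or a comma-separated list of indices:
export CUDA_VISIBLE_DEVICES=X
Check which GPUs are in use: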
nvidia-smi
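On a shared node, the two commands above can be combined to pick the GPU that currently has the least memory in use. This is a sketch under that assumption only: the function name least_used_gpu is hypothetical, and low memory use is just a rough proxy for an idle device.

```shell
# Hypothetical helper: read "index, memory.used" CSV lines on stdin and
# print the index of the GPU with the smallest memory.used value.
least_used_gpu() {
    awk -F', *' '
        NR == 1 || $2 + 0 < min { min = $2 + 0; idx = $1 }
        END { if (NR > 0) print idx }
    '
}

# Query the GPUs only if nvidia-smi is available, and export the choice
# only if a device index was actually returned.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu=$(nvidia-smi --query-gpu=index,memory.used \
                     --format=csv,noheader,nounits 2>/dev/null | least_used_gpu) || gpu=""
    if [ -n "$gpu" ]; then
        export CUDA_VISIBLE_DEVICES="$gpu"
        echo "Using GPU $CUDA_VISIBLE_DEVICES"
    fi
fi
```

Note that once CUDA_VISIBLE_DEVICES is set, the selected device appears to your program as device 0.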
Material
OpenACC | OpenACC web page, OpenACC reference guide, OpenACC programming guide |
PGI | Compiler guide |
CUDA | CUDA with C/C++, CUDA with Fortran |
GPU Libraries | GPU libraries |
Other | MATLAB and GPU |
AMGX and MiniFE
AMGX and MiniFE installation instructions (only if needed): http://hpc.kaust.edu.sa/AMGX_MINIFE_GPU
Deep Neural Networks - cuDNN
Available here: /sw/cs/cudnn/cuda