GPU Hack-a-thon 2017

Programming GPU with OpenACC

The introductory talk by Dr Saber Feki can be downloaded here.

The results of the NVIDIA Workshop GPU Hackathon 2017 can be found here.

A few tips...

Connection

How to connect to Buzzard:

    ssh buzzard.hpc.kaust.edu.sa

Before you start on Buzzard, load the following module:

    module load hackathon/2017

or set up the environment by hand:

    export PATH=$PATH:/usr/local/cuda/bin/
    export PATH=/sw/cs/pgi/linux86-64/16.1/mpi/openmpi-1.10.1/bin/:$PATH
    export PATH=$PATH:/sw/cs/pgi/linux86-64/16.1/bin/
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/sw/cs/pgi/linux86-64/16.1/lib/
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/sw/cs/pgi/linux86-64/16.1/mpi/openmpi-1.10.1/lib/
    export PATH=$PATH:/opt/allinea/forge/bin/
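
Once the environment is set, it can help to confirm that the tools are actually reachable. The following is a minimal sketch, not part of the official setup: the helper name check_tools is hypothetical, and the tool names (pgf90, nvcc, mpirun) are assumed from the directories added to PATH above.

```shell
# Report whether each named tool can be found through PATH.
# Returns 0 if all tools are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool names: PGI Fortran compiler, CUDA compiler, Open MPI launcher.
check_tools pgf90 nvcc mpirun || echo "Environment incomplete: load the module or export the paths."
```

If any tool is reported as missing, load the module again or recheck the exported paths.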

How to connect to the IT P100 machine:

    ssh noor-gpu

or

    ssh 10.68.209.37

Then export the following variables on the P100 machine:

    export PATH=$PATH:/usr/local/cuda-8.0/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/lib/
 

Compilation

Example:

    pgf90  -O2 -ta=tesla -acc -Minfo=accel ...

Further useful options: -ta=tesla:lineinfo -Minfo=all,intensity

lineinfo

Adds source line information to the GPU code, so that tools can point to the lines of your code where memory problems occur.

Minfo=all

Prints all the compiler feedback messages (including those about acceleration), for example:

    Loop not vectorized: loop count too small
    Loop unrolled 6 times (completely unrolled)

Minfo=intensity

Reports the computational intensity of every loop, defined as the ratio of compute operations to memory operations. If the intensity is 1.0 or higher, the loop is worth moving to the GPU; otherwise it is not.

For example:

    210, Possible copy in and copy out of q in call to coef_df4_premiere
         Intensity = 0.50
    442, Intensity = 1.67
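
For a large code the feedback can be long, so filtering it for loops that reach the 1.0 threshold may save time. The sketch below is only an illustration: the function name offload_candidates is hypothetical, and it assumes the feedback has the same layout as the sample above (a source line number followed by a comma, with the intensity as the last field of a line).

```shell
# List the loops whose reported intensity is at least 1.0.
# Reads compiler feedback (as printed with -Minfo=intensity) on standard input.
offload_candidates() {
    awk '
        # A line starting with "<number>," opens the feedback for that source line.
        /^[ \t]*[0-9]+,/ { line = $1; sub(/,$/, "", line) }
        # The intensity value is the last field of the line that reports it.
        /Intensity = / {
            value = $NF + 0
            if (value >= 1.0) printf "line %s: intensity %.2f\n", line, value
        }
    '
}

# Demonstration on the sample feedback shown above.
offload_candidates <<'EOF'
   210, Possible copy in and copy out of q in call to coef_df4_premiere
         Intensity = 0.50
    442, Intensity = 1.67
EOF
```

On the sample, only line 442 is listed. To filter real compiler output, pipe the feedback into the function, assuming the compiler writes it to standard error (redirect with 2>&1 in that case).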

Profiling

Compile your code for the CPU, then execute:

    nvprof --cpu-profiling on ./executable
Profiling instructions

For a graphical interface, use the nvvp tool.

Profiling with Allinea

The Allinea profiling tools are installed in /opt/allinea/forge/

Execution

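Select the GPU(s) your program may use, where X is a device index or a comma-separated list of indices:

    export CUDA_VISIBLE_DEVICES=X

Check which GPUs are in use:

    nvidia-smi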
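
When several people share a node, it can be convenient to restrict the GPU selection to a single command instead of exporting it for the whole session. This is a minimal sketch; the wrapper name run_on_gpu is hypothetical, and the echo command stands in for your own executable.

```shell
# Run a command with only the given GPU(s) visible to it.
# Usage: run_on_gpu <device list> <command> [arguments...]
run_on_gpu() {
    devices=$1
    shift
    # The variable is set only in the environment of the command being run.
    CUDA_VISIBLE_DEVICES="$devices" "$@"
}

# Stand-in command that just prints what the program would see.
run_on_gpu 0 sh -c 'echo "visible devices: $CUDA_VISIBLE_DEVICES"'
```

With a real program the call would look like run_on_gpu 1 ./executable, which leaves the rest of the shell session untouched.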

Material

OpenACC: OpenACC web page, OpenACC reference guide, OpenACC programming guide, PGI Compiler guide
CUDA: CUDA with C/C++, CUDA with Fortran
GPU libraries: GPU libraries
Other: MATLAB and GPU

 

AMGX and MiniFE

AMGX and MiniFE installation instructions (only if needed): http://hpc.kaust.edu.sa/AMGX_MINIFE_GPU

Deep Neural Networks - cuDNN

Available here: /sw/cs/cudnn/cuda