GPU Nodes

The IBEX cluster contains GPUs of several architectures: Turing, Pascal and Volta#. Use the following login nodes to compile source code and submit jobs for these GPUs:

  1. V100 and RTX 2080 Ti GPU nodes: vlogin.ibex.kaust.edu.sa

  2. All other GPU nodes: glogin.ibex.kaust.edu.sa

The IBEX cluster has 63 GPU compute nodes (396 GPU cards), summarized in Table 1. The SLURM scheduler allocates these GPUs through the constraint "--gres=gpu:<$$$>:<#>", where <$$$> is the GPU name and <#> is the number of GPUs.

For example, "--gres=gpu:gtx1080ti:4" requests 4 GTX 1080 Ti GPUs.
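
The request above can be placed in a batch script. The following is a minimal sketch, not an official IBEX template: the job name, the time limit and the helper function `gres_option` are assumptions made for illustration, and only the `--gres` value comes from this page.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test        # hypothetical job name
#SBATCH --gres=gpu:gtx1080ti:4     # 4 GTX 1080 Ti GPUs on one node (see Table 1)
#SBATCH --time=00:10:00            # assumed wall-time limit; adjust as needed

# Build the --gres value from a GPU name and a GPU count (illustrative helper).
gres_option() {
    local gpu_name="$1" gpu_count="$2"
    printf -- '--gres=gpu:%s:%s\n' "$gpu_name" "$gpu_count"
}

# Show what was requested and, if the NVIDIA tools are present, what is visible.
echo "Requested: $(gres_option gtx1080ti 4)"
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not query the GPUs"
fi
```

Saved under a hypothetical name such as `gpu_test.sh`, the script would be submitted with `sbatch gpu_test.sh` from the matching login node.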

 

Table 1. List of GPU architectures in IBEX Cluster

| Sl. No | GPU Architecture | Available GPU cards per node | Available number of nodes | GPU Memory (per card) | Usable Node Memory^ | Constraint for (SLURM) scheduling |
|---|---|---|---|---|---|---|
| 1. | Turing: rtx2080ti# | 8 | 4 | 12GB | 366GB | "--gres=gpu:rtx2080ti:8" |
| 2. | Pascal: gtx1080ti | 4 or 8 | 12 (4*8 and 8*4) | 12GB | 246GB or 366GB | "--gres=gpu:gtx1080ti:4" and "--gres=gpu:gtx1080ti:8" |
| 3. | Pascal: p100 | 4 | 6 | 16GB | 246GB | "--gres=gpu:p100:4" |
| 4. | Pascal: p6000 | 2 | 2 | 22GB | 246GB | "--gres=gpu:p6000:2" |
| 5. | Volta: v100$,# | 4 or 8 | 38 (8*4 and 30*8) | 32GB | 366GB and 745GB | "--gres=gpu:v100:4" and "--gres=gpu:v100:8" |

Note: CPU (host) memory is requested with the `--mem=###G` option in SLURM job scripts. The amount needed depends on the characteristics of the job. A good starting point is at least as much host memory as the total GPU memory the job will use. For example, a job using 2 x v100 GPUs (32GB each) should request at least `--mem=64G`.
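
The rule of thumb in this note is simple arithmetic: number of GPUs multiplied by the GPU memory per card from Table 1. A small sketch, where the function name `mem_option` is hypothetical and the result is only a starting value, not a guaranteed requirement:

```shell
#!/bin/bash
# Estimate a starting --mem value as (number of GPUs) x (GPU memory per card, in GB).
mem_option() {
    local gpu_count="$1" gpu_mem_gb="$2"
    echo "--mem=$(( gpu_count * gpu_mem_gb ))G"
}

mem_option 2 32   # 2 x v100 (32GB per card)      -> --mem=64G
mem_option 4 12   # 4 x gtx1080ti (12GB per card) -> --mem=48G
```

The value should stay below the usable node memory listed in Table 1 for the chosen node type.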

 

# Newly added to the IBEX cluster.

$ The vlogin node has a single v100 GPU for compiling source code. The 4 x v100 GPU cards in the compute nodes are accessed through SLURM scheduling.

^ The usable node memory represents the available memory for job execution.

 

The CUDA libraries differ between the GPU nodes. Table 2 lists the recommended login node, CUDA version and GNU compiler for compiling GPU source code on each GPU architecture.

 

Table 2. CUDA libraries for GPU architecture

| Sl. No | GPU architecture | Login node | CUDA supported version | GNU compiler |
|---|---|---|---|---|
| 1. | Turing | vlogin | CUDA 10.1.105 | GCC 6.4.0 |
| 2. | Volta | vlogin | CUDA 10.1.105 | GCC 6.4.0 |
| 3. | Pascal | glogin | CUDA 10.1.105 | GCC 4.8.5 |
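
When compiling on the login node listed in Table 2, the target architecture can be passed to `nvcc` explicitly. The sketch below maps the GPU names from Table 1 to compute-capability flags. The mapping is taken from NVIDIA's published compute capabilities rather than from this page, and the source file `saxpy.cu` and the function names are hypothetical.

```shell
#!/bin/bash
# Map a GPU name from Table 1 to its compute-capability flag for nvcc.
nvcc_arch() {
    case "$1" in
        rtx2080ti)        echo "sm_75" ;;  # Turing
        v100)             echo "sm_70" ;;  # Volta
        gtx1080ti|p6000)  echo "sm_61" ;;  # Pascal (GTX 1080 Ti, Quadro P6000)
        p100)             echo "sm_60" ;;  # Pascal (Tesla P100)
        *)                echo "unknown GPU name: $1" >&2; return 1 ;;
    esac
}

# Print the compile command for a hypothetical source file saxpy.cu.
compile_command() {
    local arch
    arch="$(nvcc_arch "$1")" || return 1
    echo "nvcc -arch=${arch} -o saxpy saxpy.cu"
}

compile_command v100
```

The script only prints the command, so it can be checked before running the actual compilation on vlogin or glogin.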

Some standard compilers, numerical libraries and GPU-enabled applications are available as software modules. Table 3 lists the module locations for each GPU architecture.

 

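Table 3. Software modules for different GPU architecture

| Sl. No | GPU architecture | Login node | Compilers | Standard libraries | GPU applications |
|---|---|---|---|---|---|
| 1. | Turing | vlogin | /sw/csgv/modulefiles/compilers | /sw/csgv/modulefiles/libs | /sw/csgv/modulefiles/applications |
| 2. | Volta | vlogin | /sw/csgv/modulefiles/compilers | /sw/csgv/modulefiles/libs | /sw/csgv/modulefiles/applications |
| 3. | Pascal | glogin | /sw/csg/modulefiles/compilers | /sw/csg/modulefiles/libs | /sw/csg/modulefiles/applications |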
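
The module directories in Table 3 can be browsed with the standard `module use` and `module avail` commands. The sketch below assumes the directories are not already on the module search path (they may well be preconfigured on the login nodes); the function name `module_root` is hypothetical.

```shell
#!/bin/bash
# Pick the module tree from Table 3 for a given login node.
module_root() {
    case "$1" in
        vlogin) echo "/sw/csgv/modulefiles" ;;  # Turing and Volta
        glogin) echo "/sw/csg/modulefiles" ;;   # Pascal
        *)      echo "unknown login node: $1" >&2; return 1 ;;
    esac
}

root="$(module_root vlogin)"
# Only call `module` where the Environment Modules command is defined.
if type module >/dev/null 2>&1; then
    for category in compilers libs applications; do
        module use "${root}/${category}"   # add the directory to the module search path
    done
    module avail                            # list the modules now visible
else
    echo "module command not found; would add ${root}/{compilers,libs,applications}"
fi
```

A specific package is then loaded with `module load` followed by one of the names that `module avail` reports.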

 
 
For further information, send us a query using the Contact Us page.

Alternatively, send an email:

  1. Application installation/failure/support: 
  2. System issues/failure/support: