KAUST Supercomputing Laboratory Newsletter 20th January 2016

Annual Power Maintenance

Due to the annual power maintenance in the data centre, all of the systems will be unavailable from 08:00 on Thursday 11th February until approximately 10:00 on Monday 15th February.

KSL Workshop Series: Optimizing I/O on Shaheen II

Thursday, February 4, 2016

The KAUST Supercomputing Core Laboratory (KSL) invites you to the second workshop in our seminar series on Shaheen II. This workshop will focus on making efficient use of the parallel file system. It will provide an overview of parallel I/O, explore various profiling tools for validating I/O performance, and cover best practices for efficient I/O operations. The scientific focus will be on codes for climate, seismic, and biology applications.

This seminar is of particular interest to Shaheen II users dealing with large files or a large number of files.

Seats are limited. Please register your interest at: https://www.surveymonkey.com/r/H6FNM7C
 
Venue/Time: Computer Lab Room, 3rd floor of the Library, 9:30am to 11:00am
 
Agenda:
        09:30am - Optimizing I/O on Shaheen II
        10:00am - Interactive Exercises on Shaheen II
        10:30am - Q&A with KSL team (bring all your HPC questions)

Shaheen I/Neser Data

We will continue to have the Shaheen I/Neser 'home' and 'project' filesystems available until at least 31st July 2016. However, please note that the 'scratch' filesystem will be taken offline and deleted on 1st February 2016.

For data that is needed for projects on Shaheen II, please contact us rather than copying it yourself: we have dedicated systems with direct access to both storage subsystems and can assist in moving the data.

Tip of the Week: Job Arrays

SLURM allows you to submit and manage multiple similar jobs quickly and easily using job arrays. A job array can be specified in two ways (a sketch of a script that could be submitted this way follows this list):

  • in a batch directive:
    • #SBATCH --array=1-10
  • on the command line:
    • cdl:~> sbatch --array=1-10 my_slurm_script.sh
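
For illustration, here is a minimal sketch of what my_slurm_script.sh might contain; the job name, resources, executable (my_program), and input file naming are hypothetical:

#!/bin/bash
#SBATCH --job-name=jarray
#SBATCH --array=1-10
#SBATCH --nodes=8
#SBATCH --time=01:00:00

# Each task in the array runs this same script;
# SLURM_ARRAY_TASK_ID (1 to 10 here) identifies the task.
srun ./my_program input_${SLURM_ARRAY_TASK_ID}.dat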

Both forms generate a job array containing 10 tasks. If the sbatch command responds "Submitted batch job 100", then the environment variables of the individual tasks will be set as follows:

SLURM_JOBID=100
SLURM_ARRAY_JOB_ID=100
SLURM_ARRAY_TASK_ID=1

SLURM_JOBID=101
SLURM_ARRAY_JOB_ID=100
SLURM_ARRAY_TASK_ID=2
...
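
A common pattern is to use SLURM_ARRAY_TASK_ID within the job script to select each task's work, for example by reading the corresponding line of a list of input files (input_list.txt and my_program are hypothetical names):

# Pick the N-th line of the input list, where N is the task ID.
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" input_list.txt)
srun ./my_program "${INPUT}"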

It is advisable to give each task its own stdout and stderr filenames, as follows:

#SBATCH --output=slurm-%A_%a.out
#SBATCH --error=slurm-%A_%a.err

where %A will be replaced by the value of SLURM_ARRAY_JOB_ID and %a by the value of SLURM_ARRAY_TASK_ID.
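
For the array above (job ID 100), these patterns produce one pair of files per task:

slurm-100_1.out   slurm-100_1.err
slurm-100_2.out   slurm-100_2.err
...
slurm-100_10.out  slurm-100_10.err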

To check the status of the job array, you can use the squeue -u <username> command; note, however, that the pending tasks of the array appear collapsed on a single line. For better formatting, and to check the status of both running and pending tasks, add the option -r (in the output below, state R means running and PD pending):

100_1  user1   kx        jarray   R None      2016-01-11T00:16:36   16:19:53    7:40:07     8
100_2  user1   kx        jarray   R None      2016-01-11T00:16:36   16:19:53    7:40:07     8
100_3  user1   kx        jarray   R None      2016-01-11T00:16:36   16:19:53    7:40:07     8
100_4  user1   kx        jarray   R None      2016-01-11T00:16:36   16:19:53    7:40:07     8
100_5  user1   kx        jarray   R None      2016-01-11T00:16:36   16:19:53    7:40:07     8
100_6  user1   kx        jarray  PD JobArrayT N/A                       0:00      10:00     8
100_7  user1   kx        jarray  PD JobArrayT N/A                       0:00      10:00     8
100_8  user1   kx        jarray  PD JobArrayT N/A                       0:00      10:00     8
100_9  user1   kx        jarray  PD JobArrayT N/A                       0:00      10:00     8
100_10 user1   kx        jarray  PD JobArrayT N/A                       0:00      10:00     8

To limit the number of simultaneously running tasks to 2, for example, append %2 to the range, as in "--array=1-10%2". Tasks held back by such a limit are reported by squeue with the pending reason JobArrayTaskLimit (truncated to JobArrayT in the output above).
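
Individual tasks of an array can also be managed by referring to them as <jobid>_<taskid>; a few examples with scancel, assuming job ID 100 from above (the scontrol throttle update depends on your SLURM version):

# Cancel a single task of the array.
scancel 100_3

# Cancel a contiguous range of tasks (quoted so the shell
# does not treat the brackets as a glob pattern).
scancel "100_[7-10]"

# Cancel the entire array.
scancel 100

# In recent SLURM versions, the running-task limit can be
# changed after submission (here, raised to 4).
scontrol update JobId=100 ArrayTaskThrottle=4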

 

Previous Announcements

Previous Tips