KAUST Supercomputing Laboratory Newsletter 19th May

In this newsletter:

  • Data Centre Downtime
  • Shaheen Maintenance: 23rd May 2022 at 5pm until 24th at 5pm.
  • RCAC meeting
  • Tip of the week: Error linked to Read-only file system
  • Follow us on Twitter and YouTube
  • Previous Announcements
  • Previous Tips

 

Data Centre Downtime

As previously announced, in preparation for the arrival of Shaheen III, which is scheduled for 2023, the four power substations that supply the Building 1 Data Centre will be replaced, a process that is scheduled to take six weeks. During that time, the power available in the data centre will be substantially reduced. This outage will require KSL to shut down a significant number of HPC services: Shaheen compute capability will not be available, and Ibex and Neser will operate at approximately half of their compute capacity. KSL will try to maintain the following Shaheen services:

  • Shaheen login nodes
  • Shaheen filesystems (project, scratch and home)

After consulting with the PIs, the Research Computing Allocation Committee (RCAC) has recommended a single continuous downtime of six weeks, from 17th June to 28th July, pending confirmation from the IT Data Center of the substations' shipment.

 

Shaheen Maintenance: 23rd May 2022 at 5pm until 24th at 5pm

The Shaheen maintenance session will start on Monday at 5pm, and we hope to restore the service by 5pm on Tuesday. We are planning to install the XC40 patches and upgrade to the latest Slurm version. At the same time, we will be working on the software and firmware controlling the scratch filesystem, so scratch will not be available throughout the outage.

 

RCAC meeting

The project submission deadline for the next RCAC meeting is 31st May 2022. Please note that RCAC meetings are held once per month. Projects received on or before the submission deadline will be included in the agenda for the subsequent RCAC meeting. The detailed procedures, updated templates and forms are available here: https://www.hpc.kaust.edu.sa/account-applications.

 

Tip of the week: Error linked to Read-only file system

As the new project filesystem (/lustre2) is mounted read-only, you may encounter an error during your run with the following message:

sys-30 : UNRECOVERABLE error on system request Read-only file system

This is a known issue with the Cray Fortran compiler, which enforces the Fortran standard: it will try to open the file read-write regardless of the filesystem being mounted read-only. Code compiled and linked with the Intel and GNU compilers does not encounter this issue.

You can fix the issue by specifying read-only access in your code, e.g.:

open (2, file = '/lustre2/project/kxxx/myusername/mydata.dat', action = 'read', status = 'old')
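For context, here is a minimal sketch of how that open statement might sit in a complete program. The path is the placeholder from the example above (kxxx and myusername stand for your own project and username). The program name, the variable names, and the use of newunit, iostat and iomsg for error reporting are illustrative assumptions, not part of the original tip.

```fortran
program read_only_open
  implicit none
  integer :: unit_no, ios
  character(len=256) :: msg

  ! Placeholder path from the tip above; replace kxxx and myusername.
  ! action='read' requests read-only access, so the open does not
  ! need write permission on the read-only filesystem.
  open (newunit=unit_no, &
        file='/lustre2/project/kxxx/myusername/mydata.dat', &
        action='read', status='old', iostat=ios, iomsg=msg)

  ! Report the failure ourselves instead of letting the runtime abort.
  if (ios /= 0) then
     write (*, '(2a)') 'open failed: ', trim(msg)
     stop 1
  end if

  ! ... read from unit_no here ...

  close (unit_no)
end program read_only_open
```

Checking iostat is optional, but it makes a missing file or a permission problem easier to tell apart from the read-only error shown above.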

 

 

Follow us on Twitter and YouTube

Follow all the latest news on HPC within the Supercomputing Lab and at KAUST, on Twitter @KAUST_HPC.

Our KSL training recordings are now available for you to browse on demand on our KSL YouTube channel. Subscribe and hit the notification button to keep up to date with our latest material.

Previous Announcements

http://www.hpc.kaust.edu.sa/announcements/

Previous Tips

http://www.hpc.kaust.edu.sa/tip/