Announcements

KAUST Supercomputing Laboratory Newsletter 16th November 2016

System Maintenance

There will be an extended down time of all systems from 15:00 on 30th November until 17:00 on 6th December. This is for an upgrade to the power and cooling to allow Shaheen to be run at full capacity without power capping. During this period we will not be able to read or respond to any emails sent to help@hpc.kaust.edu.sa.

KAUST Supercomputing Laboratory Newsletter 2nd November 2016

System Maintenance

There will be an extended down time of all systems from 15:00 on 30th November until 17:00 on 6th December. This is for an upgrade to the power and cooling to allow Shaheen to be run at full capacity without power capping. During this period we will not be able to read or respond to any emails sent to help@hpc.kaust.edu.sa.

KAUST Supercomputing Laboratory Newsletter 26th October 2016

Data Centre Firewall Upgrade

On 27th October between 17:00 and 21:00, KAUST IT will be upgrading the SCC firewall. As Shaheen is behind this firewall, there will be intermittent access during the upgrade period.

KAUST Supercomputing Laboratory Newsletter 19th October 2016

Data Centre Firewall Upgrade

On 27th October between 17:00 and 21:00, KAUST IT will be upgrading the SCC firewall. As Shaheen is behind this firewall, there will be intermittent access during the upgrade period.

System Maintenance *Updated*

The next scheduled maintenance session on Shaheen requires an extended outage from 12:00 on Monday 24th October until 17:00 on Tuesday 25th October. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 13th October

Data Centre Firewall Upgrade

On 27th October between 17:00 and 21:00, KAUST IT will be upgrading the SCC firewall. As Shaheen is behind this firewall, there will be intermittent access during the upgrade period.

System Maintenance *Updated*

The next scheduled maintenance session on Shaheen requires a 24-hour outage from 17:00 on Monday 24th October until 17:00 on Tuesday 25th October. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 5th October

System Maintenance

The next scheduled maintenance session on Shaheen is Tuesday 25th October from 08:00 until 17:00. There will be no access to the system during this period.

There will be an extended down time of all systems from 15:00 on 30th November until 17:00 on 6th December. This is for an upgrade to the power and cooling to allow Shaheen to be run at full capacity without power capping. During this period we will not be able to read or respond to any emails sent to help@hpc.kaust.edu.sa.

KAUST Supercomputing Laboratory Newsletter 28th September

Two Factor Authentication

Two Factor Authentication is now in operation for the Shaheen login nodes. If you have not already set up 2FA, please follow the instructions at https://www.hpc.kaust.edu.sa/content/two-factor-authentication-shaheen to be able to login.

KAUST Supercomputing Laboratory Newsletter 21st September

Two Factor Authentication

Two Factor Authentication is now in operation for the Shaheen login nodes. If you have not already set up 2FA, please follow the instructions at https://www.hpc.kaust.edu.sa/content/two-factor-authentication-shaheen to be able to login.

KAUST Supercomputing Laboratory Newsletter 7th September

SCC UPS Battery and Capacitor Replacement

Shaheen will be running at a slightly reduced capacity between 10th and 21st September whilst the Data Centre team works on UPS improvements.

Maintenance Session Tuesday 20th September

The next scheduled maintenance on Shaheen will be Tuesday 20th September from 08:00 until 17:00. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 31st August

Maintenance Session Tuesday 20th September

The next scheduled maintenance on Shaheen will be Tuesday 20th September from 08:00 until 17:00. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 24th August

Maintenance Session Tuesday 20th September

The next scheduled maintenance on Shaheen will be Tuesday 20th September from 08:00 until 17:00. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 10th August

Maintenance Session Tuesday 20th September

The next scheduled maintenance on Shaheen will be Tuesday 20th September from 08:00 until 17:00. There will be no access to the system during this period.

KAUST Supercomputing Laboratory Newsletter 3rd August

Neser Last Day of Operation

Please note this system will be decommissioned on 30th November 2016.

After this date all data in /project and /home will be deleted. Please ensure that you have transferred any data you wish to retain.

Shaheen unavailable Sunday 7th August

Shaheen will be unavailable from 08:30 on 7th August for approximately 4 hours whilst it is being rebooted.

KAUST Supercomputing Laboratory Newsletter 20th July

Maintenance Session Tuesday 2nd August

The next maintenance session will be on Tuesday 2nd August from 09:00 until 17:00. There will be no access to the system during this period.

Running watch on squeue

In the KAUST Supercomputing Laboratory Newsletter of 18th May, we requested all Shaheen users to refrain from running watch on squeue. Unfortunately, this instruction went unobserved by some users, placing an unacceptable load on the SLURM scheduler. We have therefore taken measures to prevent usage of the watch command.

KAUST Supercomputing Laboratory Newsletter 13th July

Maintenance Session Tuesday 2nd August

The next maintenance session will be on Tuesday 2nd August from 09:00 until 17:00. There will be no access to the system during this period.

KSL Workshop Series: Introduction to Parallel Computing on Shaheen II

Thursday, 14th July from 10:00 am - 12:30 pm

SeaView room, Level 3, University Library

The aim of this course is to give new users of Shaheen II an introductory overview of the system and its usage, and to help them make efficient use of their allocated resources.

KAUST Supercomputing Laboratory Newsletter 23rd June

Maintenance Session Tuesday 2nd August

The next maintenance session will be on Tuesday 2nd August from 09:00 until 17:00. There will be no access to the system during this period.

RCAC Meeting

The project submission deadline for the next RCAC meeting is 30th June 2016. Please note that the RCAC meetings are held once per month. Projects received on or before this deadline will be included in the agenda for the next RCAC meeting, scheduled to be held in July 2016. The detailed procedure and the forms are available here:

KAUST Supercomputing Laboratory Newsletter 1st June

Power Capping

As you are aware, the system was previously running with a power cap of 2200 kW, with occasional sessions without power capping but with a reduced number of nodes.

With the approval of the RCAC, it has been decided that we will trial an extended run with power capping off. This should mean that code may run faster, and it will also allow us to prepare a number of projects' codes to run at scale without capping.

KAUST Supercomputing Laboratory Newsletter 25th May

Maintenance Session Tuesday 31st May

The next maintenance session will be on Tuesday 31st May from 09:00 until 17:00. There will be no access to the system during this period.

RCAC Meeting

The project submission deadline for the next RCAC meeting is 31st May 2016. Please note that the RCAC meetings are held once per month. Projects received on or before this deadline will be included in the agenda for the next RCAC meeting, scheduled to be held in June 2016. The detailed procedure and the forms are available on the following webpage.

KAUST Supercomputing Laboratory Newsletter 18th May 2016

Maintenance Session Tuesday 31st May

The next maintenance session will be on Tuesday 31st May from 09:00 until 17:00. There will be no access to the system during this period.

Running watch on squeue

Some users are running a watch on squeue to monitor their jobs.

Please refrain from doing this as it places an increased load on the job scheduler.

If you need notice when a job is starting you can ask Slurm to do this within your job submission file, as follows:
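The snippet from the original newsletter is not reproduced in this archive. As a minimal sketch, assuming the standard sbatch options `--mail-type` and `--mail-user` and assuming mail notifications are enabled on the system, a job submission file could request a start notification like this. The job name, time limit and email address are placeholders, not values from the newsletter:

```shell
#!/bin/bash
# Hypothetical job name and wall-time limit, for illustration only.
#SBATCH --job-name=notify_demo
#SBATCH --time=00:10:00
# Ask Slurm to send an email when the job begins running.
#SBATCH --mail-type=BEGIN
# Placeholder address; replace it with your own.
#SBATCH --mail-user=username@example.com

# Placeholder workload; replace with your real launch line.
echo "Job ${SLURM_JOB_ID:-unknown} has started"
```

With directives like these, the scheduler sends the notification itself when the job starts, so there is no need to poll squeue repeatedly.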

KAUST Supercomputing Laboratory Newsletter 11th May 2016

Shaheen II power uncapped

Shaheen II power will be uncapped this weekend (Thursday 5:00 PM until Sunday 8:00 AM). Around 4,000 nodes will be available. The remaining 2,000 nodes will be drained and not available. This is to allow performance tuning and code execution without power restrictions on compute nodes. This session is open to all Shaheen II users.

RCAC Meeting

KAUST Supercomputing Laboratory Newsletter 4th May 2016

XSEDE HPC Workshop: OpenMP

The registration page for the XSEDE HPC Monthly Workshop Series - 10th May - OpenMP session is up. The portal registration page can be found here:
https://portal.xsede.org/course-calendar/-/training-user/class/488/sessi...

If there is enough interest, KSL will investigate the possibility of streaming this evening course live at the Library computer room starting at 6pm.

KAUST Supercomputing Laboratory Newsletter 27th April 2016

RCAC Meeting:

The project submission deadline for the next RCAC meeting is 30th April 2016. Please note that the RCAC meetings are held once per month. Projects received on or before this deadline will be included in the agenda for the next RCAC meeting, scheduled to be held on 26th May 2016.

KAUST Supercomputing Laboratory Newsletter 20th April 2016

RCAC Meeting:

The project submission deadline for the next RCAC meeting is 30th April 2016. Please note that the RCAC meetings are held once per month. Projects received on or before this deadline will be included in the agenda for the next RCAC meeting, scheduled to be held on 26th May 2016.

Tip of the week: Performance analysis in 10 steps

1) Connect to Shaheen II with -X: ssh -X username@shaheen.hpc.kaust.edu.sa

2) Load/unload the following modules:

KAUST Supercomputing Laboratory Newsletter 13th April 2016

KSL Workshop Series: Introduction to performance tools on the Cray XC40 supercomputer Shaheen II

Sunday, April 17
9:30 – 11:00 a.m.
Sea view room, Level 3, University Library

KAUST Supercomputing Laboratory Newsletter 23rd March 2016

Maintenance Session Sunday 27th to Tuesday 29th March

The next maintenance session is scheduled to take place from 08:00 on Sunday 27th March until 17:00 on Tuesday 29th March. Please note that this is an extended outage to enable us to test some large scale code that was deferred to avoid disruption to other users and also to install the latest patches and compiler environment.

RCAC Meeting

The project submission deadline for the next RCAC meeting is 31st March 2016. Please note that the RCAC meetings are held once per month.

KAUST Supercomputing Laboratory Newsletter 24th February 2016

XSEDE HPC Workshop: OpenACC

The registration page for the XSEDE HPC Monthly Workshop Series - March 8th - OpenACC session is up.
The portal registration page can be found here:

https://portal.xsede.org/course-calendar/-/training-user/class/461/sessi...

If there is enough interest, KSL will investigate the possibility of streaming this course live in the Library computer room on the 8th of March.

KAUST Supercomputing Laboratory Newsletter 17th February 2016

The Third Annual Workshop on "Accelerating Scientific Applications Using GPUs"

The KAUST Supercomputing Laboratory is co-organizing with NVIDIA, a leader in accelerated computing, a one-day workshop on accelerating scientific applications using GPUs on Tuesday February 23rd, 2016 in the auditorium between buildings 2 and 3. To register for the event, please click here

KAUST Supercomputing Laboratory Newsletter 10th February 2016

Annual Power Maintenance

Due to the annual power maintenance in the data centre, all of the systems will be unavailable from 08:00 on Thursday 11th February until approximately 10:00 on Monday 15th February.

Please note that the system will not be available to run jobs from 15:00 today as it has been reserved to run large scale jobs that were deferred to avoid disruption to other users.

Shaheen Storage

A reminder of the policies in place for Shaheen storage:

KAUST Supercomputing Laboratory Newsletter 3rd February 2016

XSEDE HPC Workshop: MPI

The registration page for the XSEDE HPC Monthly Workshop Series - February 9-10 - MPI session is up.

The portal registration page can be found here: https://portal.xsede.org/course-calendar/-/training-user/class/456/sessi...

If there is enough interest, KSL will investigate the possibility of streaming this course live in the Library computer room next Tuesday and Wednesday.

KAUST Supercomputing Laboratory Newsletter 27th January 2016

RCAC Meeting

The project submission deadline for the next RCAC meeting is 31st January 2016. Please note that the RCAC meetings are held once per month.

Annual Power Maintenance

Due to the annual power maintenance in the data centre, all of the systems will be unavailable from 08:00 on Thursday 11th February until approximately 10:00 on Monday 15th February.

KAUST Supercomputing Laboratory Newsletter 20th January 2016

Annual Power Maintenance

Due to the annual power maintenance in the data centre, all of the systems will be unavailable from 08:00 on Thursday 11th February until approximately 10:00 on Monday 15th February.

KAUST Supercomputing Laboratory Newsletter 13th January 2016

XSEDE HPC Workshop: OpenMP

The registration page for the XSEDE HPC Monthly Workshop Series - January 20th - OpenMP session is up.
The portal registration page can be found here:
https://portal.xsede.org/course-calendar/-/training-user/class/454/sessi...
If there is enough interest, KSL will investigate the possibility of streaming this course live in the Library computer room on the 20th of January.

KAUST Supercomputing Laboratory Newsletter 6th January 2016

XSEDE HPC Workshop: OpenMP

The registration page for the XSEDE HPC Monthly Workshop Series - January 20th - OpenMP session is up.
The portal registration page can be found here:

https://portal.xsede.org/course-calendar/-/training-user/class/454/sessi...

If there is enough interest, KSL will investigate the possibility of streaming this course live in the Library computer room on the 20th of January.

KAUST Supercomputing Laboratory Newsletter 29th December 2015

Firewall Upgrade

Please note the following alert from KAUST IT services:

Firewall upgrade has been scheduled on Thursday 31st December 2015 from 17:00 to 21:00 AST.

All hosted services will face intermittent downtime during the upgrade. These services include the KSL website, and the upgrade may affect your login to Shaheen II and Neser.

RCAC Meeting

The deadline for project submission for the next RCAC meeting is Thursday 31st December 2015. Please note that RCAC meetings are held every month.

KAUST Supercomputing Laboratory Newsletter 10th December 2015

RCAC Meeting

The next scheduled RCAC (Research Computing Allocation Committee) meeting is Thursday 17th December.

Retirement of Neser and Associated Scratch Storage

As previously announced, Neser will be decommissioned in January and the last date that jobs will be scheduled to run is 31st December 2015.

We will continue to have the Shaheen I/Neser ‘home’ and ‘project’ filesystems available until at least 31st July 2016. However, please note that the ‘scratch’ filesystem will be taken off-line and deleted on the 1st February 2016.

KAUST Supercomputing Laboratory Newsletter 25th November 2015

Shaheen II Filesystem

We are pleased to confirm that the problem affecting the filesystem availability has been resolved and the system is fully available for use.

Maintenance Session 1st December 2015

The next maintenance session on Shaheen II will be on Tuesday 1st December from 12:00 until 17:00.

XSEDE HPC Workshop: OpenACC

The registration page for the XSEDE HPC Monthly Workshop Series - December 3 - OpenACC session is up.

The portal registration page can be found here:

KAUST Supercomputing Laboratory Newsletter 17th November 2015

SLURM workq_high and workq_low

Please note that workq_high and workq_low will be removed from the system on the 1st December. If you have either of these partitions specified in your job control file, they should be removed. As workq is the default partition, this does not need to be specified.
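One way to check for leftover references before the partitions are removed is to search your job control files for them. The sketch below is an illustration, not an official KSL tool: the function name `find_retired_partitions` and the assumption that job scripts end in `.sh` are hypothetical, and the match is deliberately loose (any `#SBATCH` line mentioning either partition name).

```shell
#!/bin/bash
# List job scripts under a directory that still mention workq_high or
# workq_low on an #SBATCH line. The function name and the "*.sh" file
# pattern are assumptions made for this example.
find_retired_partitions() {
    local dir="${1:-.}"
    find "$dir" -type f -name '*.sh' \
        -exec grep -lE '^#SBATCH.*workq_(high|low)' {} +
}

# Example: scan the current directory tree and report the result.
matches=$(find_retired_partitions .)
if [ -n "$matches" ]; then
    printf '%s\n' "$matches"
else
    echo "No job scripts reference workq_high or workq_low."
fi
```

Any file listed can then be edited to delete the partition line. Since workq is the default partition, no replacement line is needed.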

Neser Last Day of Service

Please note this system will be decommissioned in January. The last day that any job will be able to run is 31st December 2015.

Shaheen II Availability

We are pleased to confirm that Shaheen was brought back online at 08:00 this morning and is fully available for use.

Shaheen II Emergency Shutdown

Dear Users

We have encountered a major issue affecting the availability of the Lustre filesystem.

Cray have recommended that we perform an immediate shutdown of the system to prevent data loss.

We are working on identifying the reason for the failure and will update you when we have more information.

Shaheen II hardware is complete!

Dear Shaheen II Users, 

Over the last couple of weeks you have experienced considerable disruption in using Shaheen II due to a combination of scheduled maintenance sessions and unforeseen failures in hardware and software. We sincerely apologize for the inconvenience these might have caused. Our team works hard to minimize these downtimes, keeping as our most important goal to ensure you are highly productive on our systems.

Shaheen II Status

Good Morning

We are pleased to confirm that the issues we encountered with the system following last week’s maintenance session have now been resolved.

The system is now fully available to run jobs.

We would also like to remind you that the next maintenance session on Shaheen II will be from 08:00 on the 8th November for 3 days (until 08:00 on the 11th November).

Announcements, 27th October 2015

Extended maintenance sessions in October.

Maintenance work on Shaheen II is taking longer than originally envisioned. Service will not be returned by tomorrow but we are hopeful that Shaheen II will be operational by the end of the week. There will be occasional disruption to the CDLs, but at least one CDL will be available for users to login to during this period.

Announcements, 14th October 2015

Extended maintenance sessions in October.

We would like to remind our users of the extended outage on Shaheen II in October for necessary maintenance:

25th October for 3 days

There will be no access to the system during these periods.

Tip of the week: Queues on Shaheen II Cray XC40

Two different queues are available on Shaheen II:

Announcements, 8th September 2015

KSL Workshop Towards High Efficiency Computing with Allinea

KAUST Supercomputing Laboratory presents the Allinea Software workshop on HPC profiling and debugging: "Towards High Efficiency Computing with Allinea" on October 4th, starting at 9am.

Workshop topics include:

Announcements, 11th August 2015

Maintenance Session Tuesday 18th August

The next maintenance session will be on Tuesday 18th August from 12:00 until 17:00. There will be no access to the system during this period. This will affect Shaheen I, Neser and Shaheen II. Important security updates, custom patches and several bug fixes will be applied to the XC40, which will require the whole system to be rebooted.

Tip of the Week: What command did I type before?

History command

Announcements, 26th May 2015

Shaheen II Cray XC40 Workshop Announcement

Date: 7th June to 11th June 2015

Where: KAUST: Auditorium Al-Haytham (down the steps between Bldg2 and Bldg3)

KAUST Supercomputing Lab and Cray are offering a series of three courses:

*Sunday 7th June to Tuesday 9th, 2015 Introduction to the new Shaheen II Cray XC40

*Wednesday 10th June 2015 Efficient Parallel I/O

*Thursday 11th June 2015 Port and optimize your own code on the Cray XC40

Announcements, 28th April 2015

Shaheen-I job size limitation

We only have a limited number of spare parts for Shaheen, and yesterday we exhausted our stock of node cards.

We have had another node card failure this morning, which means that we are now in the situation where we are ‘cannibalising’ the system to supply parts.

With immediate effect we have taken two node cards offline in rack 00.

This means that we can no longer run 16 rack jobs and the maximum size job that can be run on Shaheen is now 12288 nodes (12 racks).

Announcements, 7th April 2015

Power Outage Thursday 9th April to Monday 13th April

In preparation for the introduction of the new Cray supercomputer, there will be a site-wide power outage to the Data Centre currently housing Shaheen1 and Neser. All services, including Shaheen and Neser, will be shut down from 16:00 on Thursday 9th April until approximately 11:00 on Monday 13th April.

We apologise for the late notice and for any inconvenience that this may cause.
