KAUST Supercomputing Laboratory Newsletter 15th April
In this newsletter:
- Data Centre and Shaheen Maintenance, 22-29 April 2021
- RCAC meeting
- Tip of the week: Fix memory problems by increasing the number of nodes
- Follow us on Twitter
- Previous Announcements
- Previous Tips
Data Centre and Shaheen Maintenance, 22-29 April 2021
Our next maintenance session on Shaheen will take place from 15:30 on 22nd April until 17:00 on 29th April. The data centre team will be performing their annual planned preventive maintenance (PPM) on the power supply equipment. At the same time, we will upgrade the software and firmware versions on Shaheen and Neser's existing Lustre (project and scratch) filesystem. This is an essential step before bringing our newly acquired filesystem online and providing more project storage space. Because the Lustre filesystem itself is being upgraded, there will be no access to data during this period.
Please contact us at help@hpc.kaust.edu.sa should you have any concerns or questions.
RCAC meeting
The project submission deadline for the next RCAC meeting is 30th April 2021. Please note that RCAC meetings are held once per month; projects received on or before the submission deadline will be included in the agenda for the subsequent RCAC meeting. The detailed procedures, updated templates and forms are available here: https://www.hpc.kaust.edu.sa/account-applications
Tip of the week: Fix memory problems by increasing the number of nodes
On Shaheen, each compute node has 128 GB of physical memory and 32 CPU cores. When we place 32 MPI tasks on each node, the memory available to each MPI task is 128 GB / 32 = 4 GB. If the memory requirement per MPI task exceeds 4 GB and cannot be reduced simply by increasing the total number of MPI tasks, jobs will fail with out-of-memory errors. To fix this, increase the number of nodes (--nodes) while keeping the number of MPI tasks (--ntasks) unchanged. This reduces the number of MPI tasks per node, so each task has more memory available (note: by default, MPI tasks are distributed evenly over the allocated nodes). A minimal example script is sketched below.
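As an illustration, here is a minimal sbatch script sketch; the job name, node and task counts, and the executable ./my_app are hypothetical placeholders, not a specific recommended configuration. Running 64 MPI tasks on 4 nodes instead of 2 puts 16 tasks on each node, so each task sees roughly 128 GB / 16 = 8 GB instead of 4 GB:

  #!/bin/bash
  #SBATCH --job-name=oom_fix     # hypothetical job name
  #SBATCH --nodes=4              # doubled from 2 nodes to relieve memory pressure
  #SBATCH --ntasks=64            # total number of MPI tasks stays the same
  #SBATCH --time=01:00:00

  # With 64 tasks spread evenly over 4 nodes, each node hosts 16 tasks,
  # giving each MPI task roughly 128 GB / 16 = 8 GB instead of 4 GB.
  srun --ntasks=64 ./my_app      # ./my_app is a placeholder for your executable

If you prefer to control the distribution explicitly rather than relying on the default even spread, the --ntasks-per-node option can be set alongside --nodes and --ntasks.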
Follow us on Twitter
Follow all the latest news on HPC within the Supercomputing Lab and at KAUST, on Twitter @KAUST_HPC.
Previous Announcements
http://www.hpc.kaust.edu.sa/announcements/