Welcome to KAUST Supercomputing Laboratory


KAUST Supercomputing Lab (KSL)'s mission is to inspire and enable scientific, economic and social advances through the development and application of HPC solutions, through collaboration with KAUST researchers and partners, and through the provision of world-class computational systems and services.

  • Offering world-class HPC and data resources in a fashion that stimulates research and development.
  • Assisting KAUST researchers and partners to exploit the HPC resources at KAUST with a combination of training, consultation and collaboration.
  • Collaborating with KAUST researchers in the joint development of HPC solutions that advance scientific knowledge in the disciplines strategic to KAUST's mission.
  • Growing HPC capability at KSL over time to meet the future needs of the KAUST community.

 


 

  • In this newsletter:

    • RCAC meeting
    • KAUST supercomputer Shaheen II joins the fight against COVID-19
    • Tip of the Week: Detecting memory leaks and errors with the Valgrind4hpc tool
    • Follow us on Twitter
    • Previous Announcements
    • Previous Tips

     

    RCAC meeting

    The project submission deadline for the next RCAC meeting is 31st July 2020. Please note that the RCAC meetings are held once per month. Projects received on or before the submission deadline will be included in the agenda for the subsequent RCAC meeting. The detailed procedures, updated templates and forms are available here: https://www.hpc.kaust.edu.sa/account-applications

     

    KAUST supercomputer Shaheen II joins the fight against COVID-19

    King Abdullah University of Science and Technology (KAUST) invites researchers from across the Kingdom to submit proposals for COVID-19-related research. Recognizing the urgency of addressing global challenges related to the COVID-19 pandemic through scientific discovery and innovation, the University’s Supercomputing Core Laboratory (KSL) is making its computing resources, including the flagship Shaheen II supercomputer, and its expert scientists available to support research projects.

    Topics may include but are not limited to: understanding the virus on a molecular level; understanding its fluid-dynamical transport; evaluating the repurposing of existing drugs; forecasting how the disease spreads; and finding ways to stop or slow down the pandemic.

    Accepted proposals gain access to the following resources: (1) Shaheen II, a Cray XC40 supercomputer based on Intel Haswell processors, with nearly 200,000 compute cores tightly connected by the Aries high-speed interconnect; (2) the Ibex cluster, a high-throughput computing system with about 500 compute nodes using Intel Skylake and Cascade Lake CPUs and Nvidia V100 GPUs; and (3) KSL staff scientists, who will provide support, training and consultancy to maximize impact. Through 30 June 2020, up to 15% of these resources will be reserved for fast-tracking competitive COVID-19 proposals through the KAUST Research Computing Allocation Committee. Thereafter, such proposals remain welcome and will be considered in the standard process.

    Applicants can apply for computing allocations using the COVID-19 Project Proposal form. Please submit the form to projects@hpc.kaust.edu.sa. Submitted proposals will be fast-tracked for processing.

    Please contact help@hpc.kaust.edu.sa with any inquiries.

     

    Tip of the Week: Detecting memory leaks and errors with the Valgrind4hpc tool

    The Valgrind4hpc debugging tool helps detect memory leaks and memory errors in parallel applications. It is similar to Valgrind, which is designed for serial applications.

    Compile and link your program with the -g option, then request an allocation and follow the steps shown below. This example uses one node with 2 tasks.

    salloc -N 1
    module unload darshan xalt
    module load valgrind4hpc
    export CTI_WLM_IMPL=slurm
    export CTI_LAUNCHER_NAME=srun
    valgrind4hpc -n2 --launcher-args="--hint=nomultithread --ntasks=2" --valgrind-args="--track-origins=yes --leak-check=full" ./my_exe

    The output of a clean run is shown below. If leaks or errors are reported instead, follow the instructions in the output to locate them:

    RANKS: <0,1>
    HEAP SUMMARY:
      in use at exit: 0 bytes in 0 blocks
    All heap blocks were freed -- no leaks are possible
    ERROR SUMMARY: 0 errors from 0 contexts (suppressed 19)

    To debug your program across multiple nodes, allocate the desired number of nodes and update the parameters in --launcher-args accordingly, just as you would set the options in an srun/sbatch script.
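    As a rough sketch of a multi-node run, the snippet below derives the total task count from a node count and a tasks-per-node value, then prints the matching valgrind4hpc command so it can be checked before use. The values (2 nodes, 2 tasks per node) and the variable names are assumptions chosen for illustration; the srun options --nodes, --ntasks and --ntasks-per-node are standard Slurm options, but the right combination depends on your job.

```shell
#!/bin/sh
# Illustrative values (assumptions, not from the newsletter): 2 nodes, 2 tasks per node.
NODES=2
TASKS_PER_NODE=2
NTASKS=$((NODES * TASKS_PER_NODE))

# Step 1 (interactive, on the login node):  salloc -N 2
# Step 2 (inside the allocation): load the module and set the CTI variables
#         as in the single-node example, then run the command printed below.
LAUNCHER_ARGS="--hint=nomultithread --nodes=$NODES --ntasks=$NTASKS --ntasks-per-node=$TASKS_PER_NODE"
VALGRIND_ARGS="--track-origins=yes --leak-check=full"

# Print the command rather than running it, so the values can be reviewed first.
echo "valgrind4hpc -n$NTASKS --launcher-args=\"$LAUNCHER_ARGS\" --valgrind-args=\"$VALGRIND_ARGS\" ./my_exe"
```

    The number given to -n should match the --ntasks value passed to the launcher, as it does in the single-node example.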

    Note that the valgrind4hpc arguments and the target program's arguments must be separated by two dashes (--).
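    A minimal sketch of the separator in use follows. The program arguments (--input data.in --steps 10) are hypothetical, and placing the two dashes directly after the executable name is an assumption here; please confirm the exact form in the valgrind4hpc man page. The snippet only prints the command for inspection.

```shell
#!/bin/sh
# Hypothetical arguments for ./my_exe, used for illustration only.
APP_ARGS="--input data.in --steps 10"

# Assumed layout: options before "--" are read by valgrind4hpc,
# everything after "--" is handed to the target program.
CMD="valgrind4hpc -n2 --launcher-args=\"--hint=nomultithread --ntasks=2\" --valgrind-args=\"--track-origins=yes --leak-check=full\" ./my_exe -- $APP_ARGS"

# Print the command; copy it into the allocation once it looks right.
echo "$CMD"
```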

    More information is available in the man pages of valgrind and valgrind4hpc.

     

    Follow us on Twitter

    Follow the latest HPC news from the Supercomputing Lab and KAUST on Twitter: @KAUST_HPC.

    Previous Announcements

    http://www.hpc.kaust.edu.sa/announcements/

    Previous Tips

    http://www.hpc.kaust.edu.sa/tip/