Using a common directory for complex workflows

For complex workflows, the Shaheen II environment may not provide all the tools and functionality needed to complete a task. Users produce large amounts of data on Shaheen II and then may want to perform further operations on that data, for example: running GPU-based code on Shaheen II data, running long data-analytics or deep-learning jobs on live data emanating from a Shaheen II HPC workflow, or performing live queries on data using a database or tool that is not available on Shaheen II. To avoid copying data between the different clusters operated by KSL, a common directory is available to Shaheen II users. Users can run jobs from Ibex (GPU or CPU nodes), Neser, or Shaheen II and use this common directory without moving any data.

Here is how you can access it:

    Log in to Shaheen II (ssh -Y user_name@shaheen.hpc.kaust.edu.sa) and run: cd /scratch/$USER
    Log in to Neser (ssh -Y user_name@neser.hpc.kaust.edu.sa) and run: cd /scratch/$USER
    Log in to the Intel nodes of Ibex (ssh -Y user_name@ilogin.ibex.kaust.edu.sa) and run: cd /var/remote/lustre/scratch/$USER
    Log in to the GPU nodes of Ibex (ssh -Y user_name@glogin.ibex.kaust.edu.sa) and run: cd /var/remote/lustre/scratch/$USER
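As an illustration, here is a minimal sketch of a Slurm batch script that could be submitted on the Ibex GPU nodes to analyse data already produced on Shaheen II, reading it directly from the common directory. The job name, resource requests, and the analysis script analyse.py are hypothetical placeholders, not part of the actual setup:

    #!/bin/bash
    #SBATCH --job-name=shared-scratch-demo
    #SBATCH --gres=gpu:1
    #SBATCH --time=02:00:00

    # Work directly on the data Shaheen II wrote to the common directory;
    # no copy between clusters is needed.
    cd /var/remote/lustre/scratch/$USER

    # analyse.py is a hypothetical GPU-based analysis script.
    python analyse.py --input raw_data/ --output results/

The same pattern works in reverse: a Shaheen II job can pick up, under /scratch/$USER, the results written by an Ibex or Neser job.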

It is important to view workflows as data-centric tasks rather than compute-centric tasks. Instead of copying your data to different computers, clusters, or supercomputers, operate on the data in place from the various systems.

Please note that data on /scratch is deleted regularly, so treat anything stored there as temporary. Ideally, use /scratch to create raw data, operate on that data to extract the “useful” information, and then copy the results to your /project or /home directory, as sketched below.
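A minimal sketch of this pattern; the results/ directory and the project path /project/k0000/$USER are hypothetical, so substitute your own:

    # Generate and post-process raw data on the temporary file system.
    cd /scratch/$USER

    # Copy only the extracted results to permanent storage; the raw
    # data left on /scratch can then safely expire.
    rsync -av /scratch/$USER/results/ /project/k0000/$USER/results/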

In the coming weeks we will demonstrate this functionality with the help of use cases.