Frequently Asked Questions

  1. I am unable to access the clusters (Account Problems)
    1. I have used Dragon (now called Ibex) (or Shaheen, Noor, SMC,...) in the past, OR I am a permanent (non-visiting) faculty member, staff member or student
      • To rule out a problem with your KAUST portal username/password, try logging into another KAUST system, for example KAUST Webmail or the KAUST Portal. If you can log into those systems, contact the Ibex sysadmins for help. If you cannot log into any KAUST system, wait 15 minutes and try again (your account may have been temporarily locked after failed password attempts) or contact the IT Helpdesk for assistance.
    2. I am an external collaborator or visitor to KAUST.
      • To create an account for an external user, the PI or user sponsor must log in to the KAUST Portal (portal.kaust.edu.sa) and select 'Self-Services'. Under 'Self-Services', find the tile named 'VPN access for External Users'.
      • Fill in all required fields and add the following message in the 'Host/IP/Services' field: 'Please add the user to "ibex-login" VPN group'.
      • Then submit the request.
      • If you have any issues with VPN access or configuration, contact the KAUST IT department at ithelpdesk@kaust.edu.sa (VPN is outside Ibex's jurisdiction).
  2. Constraints and Features
    • Ibex makes heavy use of node features and the --constraint flag to direct jobs onto the appropriate resources. Combined with GRES, this is a powerful and flexible way to provide defaults that do the right thing for people who just want to run basic tasks and don't care about architecture, extra memory, accelerators, etc.
    • Below are some examples of how to request different resource configurations. The nodes are weighted so that the least valuable or least rare node that can satisfy the request is used. Be specific if you need a particular resource shape.
    1. To see the full list of node features:
      [hanksj@dbn-503-5-r:~]$ sinfo --partition=batch --format="%n %f"
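
The sinfo output above is one line per node: the hostname, then a comma-separated feature list. As a rough sketch, it could be filtered down to the nodes carrying one feature. The function name nodes_with_feature is made up for this example and is not an Ibex or Slurm tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Slurm or Ibex): print the hostnames whose
# feature list contains the feature given as $1.
# Input on stdin: "hostname feat1,feat2,..." lines, the shape produced by
#   sinfo --partition=batch --format="%n %f"
nodes_with_feature() {
    awk -v want="$1" '{
        n = split($2, feats, ",")          # break the feature list apart
        for (i = 1; i <= n; i++) {
            if (feats[i] == want) { print $1; break }
        }
    }'
}

# Usage on the cluster (not run here):
#   sinfo --partition=batch --format="%n %f" | nodes_with_feature intel
```

Matching whole features after splitting on commas avoids false hits when one feature name is a substring of another.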
      
    2. Specific CPU architecture:

      The Intel nodes perform much better for floating point operations, while the AMD nodes are more efficient at integer operations. A common approach to optimizing your workload is to send integer or floating point work to the matching architecture. Each node has a feature, either intel or amd, for its architecture. To select one:

      # Intel
      [hanksj@dm511-17:~]$ srun --pty --time=1:00 --constraint=intel bash -l
      [hanksj@dbn711-08-l:~]$ grep vendor /proc/cpuinfo | head -1
      vendor_id	: GenuineIntel
      [hanksj@dbn711-08-l:~]$ 
      
      # AMD
      [hanksj@dm511-17:~]$ srun --pty --time=1:00 --constraint=amd bash -l
      [hanksj@db809-12-5:~]$ grep vendor /proc/cpuinfo | head -1
      vendor_id	: AuthenticAMD
      [hanksj@db809-12-5:~]$ 
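
The same constraint can also go into a batch script instead of an interactive srun. The following is only a sketch: the cpu_vendor function is a hypothetical helper written for this example, and the script would be submitted with sbatch under whatever file name you save it as.

```shell
#!/bin/bash
# One minute, the same limit as --time=1:00 in the srun examples.
#SBATCH --time=00:01:00
# Use amd here instead to land on an AMD node.
#SBATCH --constraint=intel

# Hypothetical helper: print the CPU vendor string from a cpuinfo-style file.
# The optional argument only exists so the function can be tried on any file;
# by default it reads /proc/cpuinfo on the node running the job.
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    echo "CPU vendor on ${HOSTNAME:-this node}: $(cpu_vendor)"
fi
```

Printing the vendor at the top of a job is a cheap way to confirm in the job log that the constraint was honored.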
      
    3. Specific GPU or specific architecture:
      • There are two basic ways to ask for GPUs, plus a fallback when no nodes are available.
        1. You want a specific count of a specific model of GPU
          • # Request 2 P100 GPUs.
            [hanksj@dm511-17:~]$ srun --pty --time=1:00 --gres=gpu:p100:2 bash -l
            [hanksj@dgpu703-01:~]$ nvidia-smi
        2. You want a specific count of any type of GPU
          • # Request 1 GPU of any kind
            [hanksj@dm511-17:~]$ srun --pty --time=1:00 --gres=gpu:1 bash -l
            [hanksj@dgpu502-01-r:~]$ nvidia-smi 
        3. If no nodes are available, raise a ticket with the systems team to request a reservation on a specific node, explaining the reasons and scope of work.
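
The two --gres forms used above differ only in whether a model name sits between gpu and the count. A small sketch that builds either form from its parts; gres_spec is a hypothetical name chosen for this example, not a Slurm command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the value for --gres from a GPU count and an
# optional model name, matching the two forms shown in the examples:
#   gpu:<model>:<count>   a specific model
#   gpu:<count>           any model
gres_spec() {
    local count="$1" model="${2:-}"
    if [ -n "$model" ]; then
        printf 'gpu:%s:%s\n' "$model" "$count"
    else
        printf 'gpu:%s\n' "$count"
    fi
}

# Usage on the cluster (not run here):
#   srun --pty --time=1:00 --gres="$(gres_spec 2 p100)" bash -l
#   srun --pty --time=1:00 --gres="$(gres_spec 1)" bash -l
```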
  3. How many types of nodes are available on the GPU cluster?
    1. V100
    2. P100
    3. P6000
    4. GTX 1080 Ti
    5. RTX 2080 Ti
  4. Why should I set --time= in all jobs?
    Setting --time to the best possible estimate for your job accomplishes several important functions:
    • Using the shortest time possible makes the job better suited to running as backfill, making it run sooner for you and increasing overall utilization of the resources.
    • When a future reservation is blocking nodes for maintenance or other purposes, specifying the shortest time possible can allow more jobs to run to completion before the reservation becomes active.
    • Forcing the inclusion of --time in all jobs reduces confusion resulting from job behavior under non-optimal default time limit settings.
    • Learning to estimate how long your applications will run makes you a better and more well-rounded person.
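
One way to arrive at an estimate is to time a representative run and add a safety margin. The sketch below turns a runtime in seconds into a days-hours:minutes:seconds string, which is one of the formats --time accepts. The slurm_time name and the 20% default margin are assumptions made for this example, not Ibex recommendations.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a measured runtime in seconds into a --time
# value with a safety margin added on top.
#   $1  measured runtime in seconds
#   $2  margin in percent (default 20, an arbitrary choice for this sketch)
slurm_time() {
    local seconds="$1" margin_pct="${2:-20}"
    # Add the margin, rounding the extra seconds up.
    local total=$(( seconds + (seconds * margin_pct + 99) / 100 ))
    local d=$(( total / 86400 ))
    local h=$(( (total % 86400) / 3600 ))
    local m=$(( (total % 3600) / 60 ))
    local s=$(( total % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$d" "$h" "$m" "$s"
}

# Usage on the cluster (job.sh is a placeholder name, not run here):
#   sbatch --time="$(slurm_time 3000)" job.sh
```

A margin that is too generous gives away the backfill advantage described above, so it is worth tightening as you learn how your application behaves.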
  5. Why do I get the following locale error?
    Setting locale failed.
    Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = (unset)
    are supported and installed on your system.

    This is just a warning indicating that your locale settings are not defined, so the system is falling back to the standard locale. To avoid these messages, you have two options:

    If you are working on a Mac, change your terminal preferences: Terminal -> Preferences, then select the Advanced tab. At the bottom you will see a check box labeled "Set locale environment variables on startup"; make sure it is unchecked.

    If you are working on a Linux box, add the following lines to your .bashrc file (it should be in your Ibex home directory, ~/.bashrc):

    export LANGUAGE=en_US.UTF-8
    export LC_ALL=en_US.UTF-8
    export LC_CTYPE=en_US.UTF-8
    export LANG=en_US.UTF-8

    Now you can source your .bashrc file (type source ~/.bashrc), start a new shell (just type bash), or log out and log back in for the change to take effect.
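
If you would rather script the .bashrc change than edit the file by hand, something like the following could work. It is a sketch only: ensure_export is a hypothetical helper written for this example, and it appends a line only when no export of that variable is already present, so running it twice does no harm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "export NAME=value" to a startup file unless the
# file already contains a line starting with "export NAME=".
#   $1  file to edit (for example ~/.bashrc)
#   $2  variable name
#   $3  value
ensure_export() {
    local file="$1" name="$2" value="$3"
    touch "$file"                                   # create it if missing
    if ! grep -q "^export ${name}=" "$file"; then
        printf 'export %s=%s\n' "$name" "$value" >> "$file"
    fi
}

# Usage (not run here):
#   for v in LANGUAGE LC_ALL LC_CTYPE LANG; do
#       ensure_export ~/.bashrc "$v" en_US.UTF-8
#   done
```

An existing export with a different value is left untouched, so check the file by hand if the warning persists.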