Once you submit a job, check its status before closing your session. To see more detail, including the reason a job is not running, type squeue --job <jobid> -l:
squeue --job 12376532 -l
Thu Jun 4 18:24:21 2020
JOBID     PARTITION  NAME   USER   STATE    TIME  TIME_LIMI   NODES  NODELIST(REASON)
12376532  workq      myjob  user1  PENDING  0:00  1-00:00:00  32     (AssocMaxJobsLimit)
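When only the state and the waiting reason are of interest, a custom output format can shorten the listing. The following is a minimal sketch, assuming the standard squeue options --noheader and --format with the %T (job state) and %r (reason) fields; the function name job_reason is hypothetical and not part of Slurm.

```shell
# Hypothetical helper: print only the state and the reason for one job.
# Assumes squeue is available on PATH.
#   %T = job state (for example PENDING), %r = reason the job is in that state
job_reason() {
    squeue --noheader --jobs="$1" --format="%T %r"
}
```

Calling job_reason 12376532 for the job shown above should print a single line along the lines of PENDING AssocMaxJobsLimit.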
A job may be waiting for more than one reason, in which case only one of those reasons is displayed. Here are the most common codes that identify the reason that a job is waiting for execution:
- AssocMaxJobsLimit The account associated with the job does not have enough core hours.
- AssociationJobLimit The job's association has reached its maximum job count.
- Dependency This job is waiting for a dependent job to complete.
- InvalidQOS The job's QOS is invalid.
- PartitionNodeLimit The number of nodes required by this job is outside its partition's current limits. Can also indicate that required nodes are DOWN or DRAINED.
- PartitionTimeLimit The job's time limit exceeds its partition's current time limit.
- QOSJobLimit The job's QOS has reached its maximum job count.
- ReqNodeNotAvail Some node specifically required by the job is not currently available. The node may currently be in use, reserved for another job, or in an advanced reservation.
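The codes listed above can be turned into a small lookup for use in scripts. This is only a sketch: the function name explain_reason is hypothetical, the explanations are shortened from the list above, and any code not in that list falls through to a pointer to the man page.

```shell
# Hypothetical helper: map a reason code from squeue to a short explanation.
# The wording is condensed from the list above; unknown codes get a generic hint.
explain_reason() {
    case "$1" in
        AssocMaxJobsLimit)   echo "account does not have enough core hours" ;;
        AssociationJobLimit) echo "association has reached its maximum job count" ;;
        Dependency)          echo "waiting for a dependent job to complete" ;;
        InvalidQOS)          echo "the job's QOS is invalid" ;;
        PartitionNodeLimit)  echo "node count is outside the partition's current limits" ;;
        PartitionTimeLimit)  echo "time limit exceeds the partition's current time limit" ;;
        QOSJobLimit)         echo "QOS has reached its maximum job count" ;;
        # Assumption: the reason may carry extra text after the code, so match a prefix.
        ReqNodeNotAvail*)    echo "a specifically required node is not currently available" ;;
        *)                   echo "see 'man squeue' for this reason code" ;;
    esac
}
```

For example, explain_reason Dependency prints the short description for a job that is held back by a dependency.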
More information is available in the squeue man page, or contact us at help@hpc.kaust.edu.sa.