Why is my job not running

From UFRC

Latest revision as of 19:38, 23 June 2022

According to the SLURM documentation, when a job cannot be started, the scheduler immediately records the reason in the job's "reason" field, which is shown in the squeue output, and then moves on to the next job to consider.
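
To check the "reason" field for your own pending jobs, something along the following lines may help. This is a minimal sketch, not an official UFRC tool: the function names `parse_pending` and `pending_reasons` are hypothetical, and it assumes squeue is on your PATH and is asked to print the job ID and reason separated by a `|` character (squeue format fields `%i` and `%r`).

```python
import subprocess


def parse_pending(output):
    """Turn 'jobid|reason' lines into a {job_id: reason} dictionary."""
    reasons = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first '|' only; everything after it is the reason.
        job_id, _, reason = line.partition("|")
        reasons[job_id] = reason
    return reasons


def pending_reasons(user):
    """Ask squeue for a user's pending jobs (needs a SLURM cluster to run)."""
    cmd = [
        "squeue",
        "--noheader",            # no column header line
        "--user", user,          # only this user's jobs
        "--states", "PENDING",   # only jobs that have not started
        "--format", "%i|%r",     # job ID and reason, separated by '|'
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_pending(result.stdout)
```

On a login node, `pending_reasons("your_username")` would return a dictionary mapping each pending job ID to its reason string. `parse_pending` can also be tried on any saved text in the same `jobid|reason` layout.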


Related article: Account and QOS limits under SLURM (https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM)

Common reasons why jobs are pending

Priority
Resources are being reserved for a higher-priority job. This is particularly common for Burst QOS jobs. Refer to the Choosing QOS for a Job page (https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM#Choosing_QOS_for_a_Job) for details.
Resources
Required resources are in use
Dependency
Job dependencies not yet satisfied
Reservation
Waiting for advanced reservation
AssociationJobLimit
User or account job limit reached
AssociationResourceLimit
User or account resource limit reached
AssociationTimeLimit
User or account time limit reached
QOSJobLimit
Quality Of Service (QOS) job limit reached
QOSResourceLimit
Quality Of Service (QOS) resource limit reached
QOSTimeLimit
Quality Of Service (QOS) time limit reached
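
The list above can be kept as a small lookup table for scripts that report on pending jobs. The sketch below is illustrative only: the names `PENDING_REASONS` and `explain` are hypothetical, the table covers just the reasons listed on this page, and SLURM can report other reasons that are not included here.

```python
# Reason strings and explanations copied from the list on this page.
PENDING_REASONS = {
    "Priority": "Resources are being reserved for a higher-priority job",
    "Resources": "Required resources are in use",
    "Dependency": "Job dependencies not yet satisfied",
    "Reservation": "Waiting for advanced reservation",
    "AssociationJobLimit": "User or account job limit reached",
    "AssociationResourceLimit": "User or account resource limit reached",
    "AssociationTimeLimit": "User or account time limit reached",
    "QOSJobLimit": "Quality Of Service (QOS) job limit reached",
    "QOSResourceLimit": "Quality Of Service (QOS) resource limit reached",
    "QOSTimeLimit": "Quality Of Service (QOS) time limit reached",
}


def explain(reason):
    """Return the explanation for a reason, or a fallback if it is not listed."""
    return PENDING_REASONS.get(
        reason.strip(),
        "Reason not listed on this page; check the SLURM documentation",
    )
```

For example, `explain("QOSJobLimit")` returns the QOS job limit text, while a reason missing from the table returns the fallback message instead of raising an error.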