Resource limits

Limit / Queue                                                           GHPC_v1         GHPC_v2         GHPC_v3         nav
Max number of CPU cores that can be requested by a job                  24              32              32              32
Default number of CPU cores assigned to a job if not specified by user  2               2               2               2
Max amount of memory that can be requested by a job                     250 GiB         385 GiB         740 GiB         385 GiB
Default amount of memory assigned to a job if not specified by user     10 GiB/core     11.75 GiB/core  12 GiB/core     11.75 GiB/core
Max amount of memory that can be requested per core by a job            40 GiB/core     40 GiB/core     80 GiB/core     40 GiB/core
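
The squeue-style output and QOS reason codes further down indicate a Slurm scheduler, so a job request that stays within these per-job limits can be expressed with standard sbatch options. The script below is a minimal sketch: the partition name ghpc_v1 is taken from the example output below, and the executable is a placeholder.

#!/bin/bash
# Minimal sketch of a job request that stays within the GHPC_v1 per-job limits above.
#SBATCH --job-name=example
#SBATCH --partition=ghpc_v1        # queue name as shown in the example output below
#SBATCH --cpus-per-task=8          # per-job maximum on GHPC_v1 is 24 cores
#SBATCH --mem=80G                  # total memory; per-job maximum on GHPC_v1 is 250 GiB
##SBATCH --mem-per-cpu=10G         # alternative to --mem (mutually exclusive); max 40 GiB/core
#SBATCH --time=12:00:00

srun ./my_program                  # placeholder executable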

Fair usage limits:

As resourceful as the cluster is, it is unfair for a single user to overwhelm the resource pool at the cost of other users' requests, so fair usage limits are in place. The following limits apply to all users by default. If you reach one of these limits, your further jobs will wait in the queue until your prior jobs complete and release their resources back to the pool. A sketch for checking your own usage against these limits follows the list below.

Maximum # of CPU cores a user can utilise as part of their running jobs = 72

Maximum amount of memory a user can reserve at any point in time = 768 GiB

Maximum number of jobs a user can have (running + pending) in the system at a time = 144
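
To see how close you are to these limits, you can count your own jobs and cores with standard squeue options. The commands below are a sketch; the %C format code prints the number of CPUs allocated to each job.

# Count your running and pending jobs (combined limit: 144)
squeue -u $USER -h -t RUNNING | wc -l
squeue -u $USER -h -t PENDING | wc -l

# Sum the CPU cores of your running jobs (limit: 72)
squeue -u $USER -h -t RUNNING -o "%C" | awk '{cores += $1} END {print cores}'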

What if a user hits one of the limits above?

Their jobs will be queued and will get a chance to run only after their currently running jobs relinquish resources, so that the limits are still satisfied.

For example, if a job is made to wait because of the user's memory limit, it shows up like this:

asampath@c07b12:[~] > myst
             JOBID PARTITION     NAME     USER ST     TIME_LIMIT       TIME  NODES  CPUS MIN_MEMORY NODELIST(REASON)
              3945   ghpc_v1     bash asampath PD       12:00:00       0:00      1     1       220G (QOSMaxMemoryPerUser)

How do I know if I hit any of the limits?

The myst and squeue commands clearly state why your jobs are pending and which limits they are waiting on.
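
For example, the following squeue sketch lists only your pending jobs, with the wait reason in the last column (the same QOS reasons that myst shows); the format codes are standard squeue placeholders.

# Show your pending jobs and the reason they are waiting (last column)
squeue -u $USER -t PENDING -o "%.10i %.12P %.10j %.2t %.5C %.10m %R"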

What if I need an exception?

Write an email to your sysadmin and give a convincing reason why you need extra resources.