Resource limits

Limit / Queue	GHPC	ZEN4	nav_zen4	nav
Max number of CPU cores that can be requested by a job	32	128	128	32
Default number of CPU cores assigned to a job if not specified by user	2	2	2	2
Max amount of memory that can be requested by a job	740 GiB	1.5 TiB	1.5 TiB	385 GiB
Default amount of memory assigned to a job if not specified by user	11.7 GiB/core	11.75 GiB/core	11.7 GiB/core	11.75 GiB/core
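
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration, not how the scheduler is configured: the names QUEUE_LIMITS and effective_request are hypothetical, the figures are copied from the table above, 1.5 TiB is taken as 1536 GiB, and it is assumed that the default memory scales with the number of cores requested, as the "GiB/core" unit suggests.

```python
# Per-queue limits copied from the table above (memory in GiB).
# 1.5 TiB is taken as 1.5 * 1024 GiB.
QUEUE_LIMITS = {
    "GHPC": {"max_cores": 32, "max_mem_gib": 740.0, "mem_per_core_gib": 11.7},
    "ZEN4": {"max_cores": 128, "max_mem_gib": 1536.0, "mem_per_core_gib": 11.75},
    "nav_zen4": {"max_cores": 128, "max_mem_gib": 1536.0, "mem_per_core_gib": 11.7},
    "nav": {"max_cores": 32, "max_mem_gib": 385.0, "mem_per_core_gib": 11.75},
}
DEFAULT_CORES = 2  # same default on every queue, per the table


def effective_request(queue, cores=None, mem_gib=None):
    """Return the (cores, mem_gib) a job would get, filling in defaults.

    Assumption: when memory is not specified, the default is
    cores * per-core default. Raises ValueError if the request
    exceeds the per-job maximum for the queue.
    """
    limits = QUEUE_LIMITS[queue]
    if cores is None:
        cores = DEFAULT_CORES
    if mem_gib is None:
        mem_gib = cores * limits["mem_per_core_gib"]
    if cores > limits["max_cores"]:
        raise ValueError(f"{cores} cores exceeds the {queue} per-job maximum")
    if mem_gib > limits["max_mem_gib"]:
        raise ValueError(f"{mem_gib} GiB exceeds the {queue} per-job maximum")
    return cores, mem_gib
```

Under these assumptions, a job submitted to ZEN4 with no explicit request would get 2 cores and 2 x 11.75 = 23.5 GiB, while a 64-core request on GHPC would be refused because it is above the 32-core per-job maximum.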

Fair usage limits:

As resourceful as the cluster is, it would be unfair for a single user to overwhelm the resource pool at the cost of other users' requests, so fair usage limits are in place. The following limits apply to all users by default. If you reach one of these limits, your further jobs wait in the queue until your earlier jobs complete and release their resources back to the pool.

Maximum number of CPU cores a user can utilise across their running jobs = 72

Maximum amount of memory a user can reserve at any point in time = 768 GiB

Maximum number of jobs a user can have (running + pending) in the system at a time = 144
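
To estimate in advance whether a new job would run into one of these limits, the three checks can be written out as below. This is a rough sketch, not the scheduler's actual accounting: the Usage class and exceeded_limits function are hypothetical names, and it is assumed that the limits are compared against the simple sum of what your current jobs hold plus the new request.

```python
from dataclasses import dataclass

# Per-user fair usage limits stated above.
MAX_CORES_PER_USER = 72
MAX_MEM_GIB_PER_USER = 768
MAX_JOBS_PER_USER = 144


@dataclass
class Usage:
    """What a user currently holds (hypothetical bookkeeping)."""
    running_cores: int      # cores used by running jobs
    running_mem_gib: float  # memory reserved by running jobs, in GiB
    total_jobs: int         # running + pending jobs


def exceeded_limits(usage, cores, mem_gib):
    """Return the names of the limits a new job would exceed."""
    exceeded = []
    if usage.running_cores + cores > MAX_CORES_PER_USER:
        exceeded.append("cores")
    if usage.running_mem_gib + mem_gib > MAX_MEM_GIB_PER_USER:
        exceeded.append("memory")
    if usage.total_jobs + 1 > MAX_JOBS_PER_USER:
        exceeded.append("jobs")
    return exceeded


# Illustrative numbers only: 600 GiB already reserved, 220 GiB requested.
print(exceeded_limits(Usage(10, 600, 5), cores=1, mem_gib=220))
```

With the illustrative figures in the last line, 600 + 220 = 820 GiB is above the 768 GiB limit, so only the memory limit is reported.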

What if a user hits one of the limits above?

Their jobs are queued and get a chance to run only after their currently running jobs release enough resources for the limits to be satisfied again.

For example, if a job is made to wait because of the user's memory limit, it shows up as below.

asampath@c07b12:[~] > myst
             JOBID PARTITION     NAME     USER ST     TIME_LIMIT       TIME  NODES  CPUS MIN_MEMORY NODELIST(REASON)
              3945   ghpc        bash asampath PD       12:00:00       0:00      1     1       220G (QOSMaxMemoryPerUser)

How do I know if I hit any of the limits?

The myst and squeue commands state why your jobs are pending and which limit they are waiting on, in the NODELIST(REASON) column.
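
If you have many jobs, the pending reasons can also be pulled out of such a listing with a short script. The sketch below is a minimal example under stated assumptions: pending_reasons is a hypothetical helper, the listing is assumed to have exactly the column layout of the example output on this page (with ST as the state column and NODELIST(REASON) as the last column), and job names are assumed to contain no spaces.

```python
# Listing copied from the example on this page.
SAMPLE = """\
             JOBID PARTITION     NAME     USER ST     TIME_LIMIT       TIME  NODES  CPUS MIN_MEMORY NODELIST(REASON)
              3945   ghpc        bash asampath PD       12:00:00       0:00      1     1       220G (QOSMaxMemoryPerUser)
"""


def pending_reasons(listing):
    """Map job ID -> pending reason for every job in state PD.

    Assumes whitespace-separated columns with a header row, and that
    the reason is the last column, wrapped in parentheses.
    """
    lines = [ln.strip() for ln in listing.splitlines() if ln.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    state_col = header.index("ST")
    reasons = {}
    for line in lines[1:]:
        # Limit the number of splits so a reason containing spaces
        # stays together in the last field.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # malformed or truncated row
        if fields[state_col] == "PD":
            reasons[fields[0]] = fields[-1].strip("()")
    return reasons


print(pending_reasons(SAMPLE))
```

For the example listing this reports job 3945 as waiting on QOSMaxMemoryPerUser, the per-user memory limit.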

What if I need an exception?

Write an email to your sysadmin and give a convincing reason why you need extra resources.
