PSTAT(l)          Livermore Computing Local UNIX Manual          PSTAT(l)

NAME
       pstat - List attributes of jobs under control of the DPCS

SYNOPSIS
       pstat [-b bank] [-u user] [-m host | -c constraint] [-M] [-H] [-T]
             [-v | -f | -o outspec] [-s sortkey]

       pstat [-n] jidlist [-M] [-H] [-T] [-v | -f | -o outspec]
             [-s sortkey]

       pstat -A [-M] [-H] [-T] [-v | -f | -o outspec] [-s sortkey]

       pstat -h

DESCRIPTION
       The pstat command displays the attributes of selected jobs under
       control of the DPCS. The command line options delimit or qualify
       the set of jobs to be listed, except for the -M, -f, -H, -v and -o
       options, which control the amount of information shown. All
       display data is written to standard output. All error data is
       written to standard error. If there are no jobs to show, pstat
       terminates with no data written to standard output.

       When pstat is invoked with no arguments, the output is restricted
       to jobs that are under the control of the calling user.

       Normally only jobs that are running or are waiting to run are
       displayed. With the -T option, you may also see data for jobs that
       have run and have subsequently terminated. (Jobs that were removed
       before they ever ran cannot be displayed.) Terminated jobs have
       one of two statuses. If the status is REMOVED, the job was
       forcefully removed via the prm utility, or by DPCS itself because
       the job reached a resource limit. Otherwise, the status is
       COMPLETE.

       Default output has a heading line. Then, on one line for each job,
       pstat displays the DPCS jobid (JID), the job or request name
       (NAME), owner name (USER), account name (ACCOUNT), bank name
       (BANK), status (STATUS), executing or candidate host (EXEHOST),
       job class (CL), and expedite count (XCT).

OPTIONS
       Command line options that govern output detail:

       -H     Turns off the header display.

       -h     Causes a usage message to be printed to standard error.

       -f     Displays all known information about the job.
              The output from this option spans multiple lines. This
              option may not be used in conjunction with the -o or -v
              options.

       -o outspec
              List only the fields specified by outspec. This option may
              not be used in conjunction with the -f or -v options.
              outspec may specify multiple fields, comma separated. Valid
              field specifications include: account, aging_time, bank,
              batchid, cl, constraint, cpn, depend, ecomptime,
              earliest_start, exehost, highwater, jid, maxcputime,
              maxmem, maxnodes, maxphyss, maxrss, maxruntime, memint,
              memsize, name, nsrs, preempted_by, priority, runtime, sid,
              status, stoptime, submitted, tasks, timecharged, timeleft,
              used, user, vmemint, and xct.

              If no output format specification is supplied (no -f, -o,
              or -v options), the environment variable PSTAT_CONFIG may
              specify the output format (e.g., "setenv PSTAT_CONFIG
              name,user"). One additional possible value for outspec is
              "default", which means to use the pstat default output
              format. Field specifications may be abbreviated. Fields
              specified by the -o option or via the PSTAT_CONFIG
              environment variable are listed in the order specified.

       -s sortkey
              This option determines the order in which job statuses are
              displayed. The default order is jid order. The sortkey
              names a column of data on which to sort. The column
              selected for sorting the jobs need not itself be displayed.
              The columns that can be specified are:

              STRING:    account, bank, batchid, exehost, name, sid, user

              NUMERICAL: aging_time, depend, earliest_start, highwater,
                         jid, maxcputime, maxnodes, maxphyss, maxrss,
                         memsize, stoptime, timecharged, timeleft, used,
                         xct

              PRIORITY:  priority

              If a STRING column is specified, jobs are displayed in
              lexical order, from lowest to highest, of the values in the
              specified column.
              If a NUMERICAL column is specified, jobs are displayed in
              numerical order, from lowest to highest, of the values in
              the specified column. If the PRIORITY column is specified,
              jobs are displayed in numerical order, from highest to
              lowest, of the values in the priority column.

              The order of the displayed values can be reversed by
              prepending a '-' character to the sortkey. For instance,
              "-s priority" displays jobs in priority order (highest
              priority job first, lowest priority job last), while
              "-s -priority" displays jobs in reverse priority order
              (lowest priority job first, highest priority job last).

       -v     In addition to the default fields, aging time (AGING_TIME),
              maximum cpu time (MAXCPUTIME), the sum of the last measured
              sizes of all processes in the job (MEMSIZE), dependency
              (DEPEND), time used per task (USED) and the earliest start
              time (EARLIEST_START) are displayed. Data is written on a
              single line for each job. This option may not be used in
              conjunction with the -f or -o options.

       Command line options that limit jobs selected:

       -A     List all jobs in the DPCS system. This option is ignored
              when used with the -b, -m, -n or -u options.

       -b bank
              List only jobs that are drawing from bank. This option may
              not be used in conjunction with the -n option.

       -c constraints
              List only jobs that are running, or have the potential to
              run, on hosts that match the given constraint list. (For
              more information, please see A WORD ABOUT CONSTRAINTS in
              the psub manpage.)

       -m hostname
              List only jobs that are running, or have the potential to
              run, on hostname.

       -n jidlist
              List only the jobs with DPCS jobids that correspond to one
              of the items in jidlist. jidlist must be a comma separated
              list of DPCS jobids. This option may not be used in
              conjunction with either the -b or -u options.

       -T     Display the status of completed and removed jobs that were
              under control of the DPCS, as well as currently running or
              waiting jobs.

       -u uname
              List only jobs that are owned by uname. This option may not
              be used in conjunction with the -n option.

FIELD DESCRIPTIONS
       ACCOUNT
              The account name associated with the job.

       AGING_TIME
              If the job is not running but is eligible to be scheduled,
              the date-time at which it became eligible to be scheduled.
              If the job is running, the date-time at which the job was
              scheduled to run. If the job is terminated, the date-time
              at which it was terminated or removed. Otherwise, this
              field has no meaning.

       BANK   The bank from which resources used by the job are drawn.

       BATCHID
              The identifier assigned to the job by the native batch
              system that scheduled it.

       CL     The job class. 'N' means normal, 'p' means short
              production, 'S' means standby and 'X' means expedited.

       CONSTRAINT
              The constraints placed on the residence of job execution
              (the contents of the -c option to psub) as well as the
              geometry specified for the job, if applicable.

       CPN    The number of cpus per node requested for the job. If none
              has been specified for the job, a value of 1 is assumed.

       DEPEND The jobid of the job which must complete (or be removed)
              before this job can be scheduled to run. (If '0', this job
              is not dependent on any other job.)

       EARLIEST_START
              The earliest date-time at which the job is permitted to run
              (assigned via the -A option to psub or palter).

       ECOMPTIME
              If the job is running, the estimated completion time of the
              job.

       EXEHOST
              If the job is running, the name of the host at which the
              job is running.

       HIGHWATER
              The largest measured size of all processes in the job over
              the life of the job.

       JID    The DPCS assigned job identifier.

       MAXCPUTIME
              The maximum cpu time that the job is permitted to consume.

       MAXMEM The largest size permitted for any process in the job, in
              kilobytes.

       MAXNODES
              The number of nodes that the job requested and is permitted
              to use.

       MAXPHYSS
              The maximum across all the nodes of a job of the largest
              sum of physical sizes through time summed over all
              processes of the job.

       MAXRSS The maximum across all the nodes of a job of the largest
              sum of resident set sizes through time summed over all
              processes of the job.

       MAXRUNTIME
              The maximum elapsed time that the job is permitted to run.

       MEMINT The resident set memory integral used by the job.

       MEMSIZE
              If the job is running or has ever run, the sum of the last
              measured sizes of all processes that are part of the job.
              Otherwise, the predicted size of the job when it does start
              running.

       NAME   The name assigned to the job (e.g., via the -r option to
              psub).

       NSRS   The nonshared resource requirements of the job (the
              contents of the -ns option to psub).

       PRIORITY
              The job's relative scheduling priority. Unlike UNIX "nice"
              values, jobs with higher priorities are "more important" to
              be scheduled than jobs with lower priorities. The priority
              of a job is the weighted sum of three values. The first
              value is the job's "fair share" priority, which is based
              upon allotted shares and recent resource use by the owner
              of the job and the bank tree from which it draws resources.
              The second value is the job's "aging" priority, which is
              based on how long the job has been eligible to run. The
              third value is the job's "technical" priority, which is
              based on a measure of how well the job can utilize
              available resources.

       RUNTIME
              If the job has started running, the elapsed run time since
              the job began running. If the job has completed or been
              removed after it began running, the total elapsed run time
              for the job.

       SID    The kernel assigned session id of the job.

       STATUS The status of the job. See NOTES below.

       STOPTIME
              The date-time at which the job must be removed.
              This value is set only when a prm is issued to remove a
              running job with a grace time.

       SUBMITTED
              The date-time at which the job was submitted.

       TASKS  The number of tasks in the job.

       TIMECHARGED
              The total amount of cpu time consumed by the job so far. If
              the job has terminated or been removed, the total cpu time
              consumed by the job. (See notes about charge rates.)

       TIMELEFT
              The estimated amount of wall-clock time left to the job.

       USED   The average amount of cpu time consumed by each task in the
              job so far. If the job has terminated or been removed, the
              average amount of cpu time consumed by each task in the
              job.

       USER   The job owner.

       VMEMINT
              The physical memory integral used by the job.

       XCT    The number of times the job has been expedited.

EXAMPLES
       The following command lists all jobs that are drawing from bank A.

       > pstat -b A
       JID   NAME   USER   ACCOUNT  BANK  STATUS    EXEHOST  CL  XCT
       874   name1  user1  590001   A     RUN       host1    N   0
       877   name2  user2  590001   A     *WAIT              N   0

       The following command lists all jobs that were drawing from bank A
       but have completed or have been removed, as well as jobs that are
       running or are not yet scheduled to run.

       > pstat -Tb A
       JID   NAME   USER   ACCOUNT  BANK  STATUS    EXEHOST  CL  XCT
       872   name3  user1  590001   A     COMPLETE  host1    N   0
       873   name4  user2  590001   A     REMOVED            N   0
       874   name1  user1  590001   A     RUN       host1    N   0
       877   name2  user2  590001   A     *WAIT              N   0

       The following command lists all jobs, running or waiting to run,
       on hosts that match the constraint list.

       > pstat -c 2000MB
       JID   NAME   USER   ACCOUNT  BANK  STATUS    EXEHOST  CL  XCT
       872   name3  user1  590001   A     RUN       host1    N   0
       873   name4  user2  590001   A     *WMEML             N   0
       874   name1  user1  590001   A     *QTOTLIM  host2    N   0
       877   name2  user2  590001   A     *WAIT              N   0

       The following command lists the statuses of 2 jobs at each host at
       which they may run.

       > pstat -M -n 900,901
       JID   NAME   USER   ACCOUNT  BANK  STATUS    EXEHOST  CL  XCT
       900   name1  user2  590001   A     *QTOTLIM  host1    N   0
       900   name1  user2  590001   A     *QTOTLIMU host2    N   0
       900   name1  user2  590001   A     *WMEML    host3    N   0
       901   name2  user2  590001   A     RUN       host3    N   0

       The following command lists the statuses of the same 2 jobs
       without the -M option.

       > pstat -n 900,901
       JID   NAME   USER   ACCOUNT  BANK  STATUS    EXEHOST  CL  XCT
       900   name1  user2  590001   A     *MULTIPLE          N   0
       901   name2  user2  590001   A     RUN       host3    N   0

       The following command lists only the jid, name, and user fields.

       > pstat -o jid,name,user
       JID   NAME   USER
       874   name1  user1
       877   name2  user2

       The following command lists all of a user's jobs, displaying only
       the job id, the executing host name, the status, the batchid and
       the session id.

       > pstat -u user1 -o jid,exehost,status,batchid,sid
       JID   EXEHOST  STATUS  BATCHID    SID
       874   host1    RUN     176.host1  18885
       875            *WAIT

       The following example is equivalent to the previous example (and
       assumes csh).

       > setenv PSTAT_CONFIG jid,exehost,status,batchid,sid
       > pstat -u user1
       JID   EXEHOST  STATUS  BATCHID    SID
       874   host1    RUN     176.host1  18885
       875            *WAIT

       The following command lists all jobs in verbose mode. (Because of
       the line length of the output, it will typically display with
       wrap around.)

       > pstat -v
       JID   NAME  USER   ACCOUNT  BANK  STATUS  EXEHOST  CL  XCT  AGING TIME         MAXCPUTIME  MEMSIZE  DEPEND  USED  EARLIEST_START
       1587  job1  user1  123456   A     RUN     host1    N   0    10/02/95 17:00:13  100:00      5Mb      0       0:38
       1578  job2  user2  987654   B     RUN     host2    N   0    10/02/95 10:35:00  10:00       100Mb    0       6:06
       1592  job3  user3  123123   C     HELDu   host3    N   0    10/02/95 16:11:54  16:00       74432Kb  0       0:53
       1591  job4  user4  321321   D     *WAIT            N   0    10/02/95 17:40:26  0:30        230Mb    0       0:00  10/03/95 00:00:00

       The following command lists jobs in full mode. Because of the
       amount of data, each entry spans multiple lines of output.

       > pstat -f
       -------------------------------------------------------------------
       DPCS BATCH JOB ID 16614                              user: eckert
       -------------------------------------------------------------------
       DPCS job name:             batch
       bank:                      sa
       batch identifier:          blue199.3265.0
       account:                   000000
       session identifier:        48.13172
       executing host:            blue
       job status:                RUN
       priority:                  0.568
       preempted by:              N/A
       submitted at:              09/06/99 08:56:04
       earliest start time:       N/A
       must stop at:              N/A
       estimated completion:      09/06/99 12:56:54
       dependency:                0
       expedited:                 no
       short production:          no
       times expedited:           0
       resident memory integral:  583Mbh
       physical memory integral:  614Mbh
       largest process size:      19Mb
       process size limit:        unlimited
       max resident set size:     813Mb
       max physical size:         302Mb
       job size:                  624Mb
       nodes requested:           8
       constraint:                N/A
       geometry:                  32
       nsrs:                      N/A
       elapsed run time limit:    2:00
       elapsed run time:          1:03
       time limit per task:       2:00
       time charged:              32:50
       time used per task:        1:00
       time used:                 32:11
       tasks:                     32

NOTES
       If a job is permitted to run on more than one host, it has a
       somewhat independent status for each of the hosts until it is
       actually scheduled to run on one of them. When a job is not
       running and the pstat output is not qualified by the -m option,
       the status shown for the job may apply to only one of the hosts at
       which it is permitted to run. The host to which the status applies
       is not shown.

       For example, a job may request a maximum time greater than is
       currently permitted (by an administrator) at one host and at the
       same time may be prevented from running on another host because
       that host's memory is currently overloaded. The status for the job
       could be shown either as TOOLONG or WMEM. To see a job's status at
       any particular host, use the -m option.

       If an asterisk (*) precedes the status field, it indicates that
       the job has not yet run.

       Different charge rates for different classes may be applied to
       jobs.
       The defined classes are: interactive, short production, expedited,
       normal and standby.

       Possible values for "STATUS" are:

       Value      Meaning

       BAT_WAIT   On starting a job (or restarting a checkpointed job),
                  the native batch system has determined that necessary
                  resources are not available to the job and has delayed
                  its execution until those resources are available. (For
                  instance, in the case of a restarted job, either the
                  job's session id or one of its process ids may be
                  claimed by another session. Another example would be
                  that a job requests more nodes on a multi-node machine
                  than are currently available.) The user must either
                  wait until the claimed resource has been released or
                  terminate and resubmit the job.

       COMPLETE   The job has completed execution and is no longer under
                  the control of DPCS.

       CPUS&TIME  The number of nodes (-ln) and time limit (-tM)
                  evaluated to a number that exceeds the currently
                  configured maximum permitted node count * time limit
                  per job. The job will be re-evaluated when the
                  configured maximum amount permitted is increased by an
                  administrator.

       CPUS>MAX   The number of requested cpus (-lc) or nodes (-ln)
                  evaluated to a number that exceeds the currently
                  configured maximum permitted cpus per job. The job will
                  be re-evaluated when the configured maximum amount
                  permitted is increased by an administrator.

       DEPEND     The job is specifically awaiting the completion of
                  another job. It will be re-evaluated when the job it
                  depends on terminates.

       DCE_DEF    There has been an unexpected DCE error on the
                  production host while submitting the user's job to the
                  native batch system. The job will be held for a
                  specific period of time and then re-evaluated. The
                  default waiting period is 10 minutes.

       DEFERRED   The production host returned a bad status for this job.
                  The job will be held for a specific period of time and
                  then re-evaluated. The default waiting period is 10
                  minutes.

       ELIG       The job is eligible to run.

       HELD[c][s][u]
                  The job has an explicit user level, coordinator level
                  and/or system level hold applied to it. If the
                  character 'u' appears after the HELD status, then a
                  user level hold has been applied. If the character 'c'
                  appears, then a coordinator level hold has been
                  applied. If the character 's' appears, then a system
                  level hold has been applied. User level holds can be
                  set or released by a job owner, coordinator or DPCS
                  manager. Coordinator level holds can be set or cleared
                  by a coordinator or DPCS manager. System level holds
                  can be set or cleared only by a DPCS manager. The job
                  will be re-evaluated after all hold levels are removed
                  (via prel).

       HLD_IDLE   A user level hold has been applied to the job by the
                  DPCS system because its use of CPU time has fallen
                  below a minimum threshold. The user level hold (and
                  thus this status) can be removed by using prel.

       HOLDING    The job is in the process of being checkpointed. After
                  the job is checkpointed, its status is re-evaluated and
                  set depending on the reason for the checkpoint.

       JRESLIM    Scheduling the job would cause a limit on the number of
                  jobs that can run concurrently on the machines in a
                  resource partition to be exceeded. The applicable limit
                  is the least number of jobs permitted to the user/bank
                  or any of its parent banks in the resource partition.

       MOVING     The job is in the process of being moved to the host
                  selected for execution.

       MULTIPLE   The job has different statuses at two or more of the
                  hosts at which it may run, or multiple statuses on a
                  single host. Use the -M option to see the job's status
                  at each of its candidate hosts.

       NOACCT     The account name to be associated with the resources
                  this job will use has been removed. The job will be
                  re-evaluated when a valid account is assigned to the
                  job (via palter).

       NOBANK     The bank from which the job was to draw its resources
                  no longer exists.
                  The job will be re-evaluated when a valid bank is
                  assigned to the job (via palter).

       NOCONF     A host permitted to run the job does not currently have
                  a valid configuration parameter set. The job will be
                  re-evaluated when an administrator assigns a valid
                  configuration parameter set.

       NONEW      An administrator has disallowed new jobs from running
                  while this job was not running. The job will be
                  re-evaluated when new jobs are again permitted to run
                  by an administrator.

       NOPRISRV   A host permitted to run this job is operating at a
                  priority service level greater than normal, and the
                  job's bank is not within the priority service bank
                  tree. The job will be re-evaluated when the host is
                  returned to normal service by an administrator.

       NOTIME     The job owner has zero effective shares to the bank
                  from which the job is drawing resources. The job will
                  be re-evaluated when bank shares are reassigned by a
                  coordinator or system administrator. (Use pshare with
                  the "-m" option to determine the shares you have
                  allocated to bank(s) on the target machine. Use pshare
                  with the "-m" and "-r" options to determine the shares
                  allocated to the bank's parent tree. If any bank in the
                  tree has zero allocated shares, then all subbanks, as
                  well as your permission to the specified bank, have
                  zero effective shares at the specified machine.)

       NRESLIM    Scheduling the job would cause a limit on the number of
                  nodes that can be committed to a user or bank in a
                  resource partition to be exceeded. The applicable limit
                  is the least number of nodes permitted to the user/bank
                  or any of its parent banks in the resource partition.

       NTRESLIM   Scheduling the job would cause a limit on the amount of
                  node-time that can be committed to a user or bank in a
                  resource partition to be exceeded. The applicable limit
                  is the least amount of node-time permitted to the
                  user/bank or any of its parent banks in the resource
                  partition.

       PREEMPTD   On machines that support preemption, this status
                  indicates that the job has been preempted in order to
                  allow a higher priority job to run using the preempted
                  job's set of resources. The output from pstat using the
                  -f option will show the job id of the job that has
                  preempted this job. A preempted job is stopped from
                  running and will resume execution once the higher
                  priority job has terminated.

       PTOOBIG    The maximum, or estimated maximum, process size of the
                  job exceeds the maximum permitted size of processes at
                  the executing host. The job will be re-evaluated when
                  the configured maximum permitted size of processes is
                  increased by an administrator.

       QCKPLIM    A host permitted to run this job has reached its
                  checkpoint file space limit. This status is only
                  possible for jobs that may run on hosts that support
                  checkpointing. The job will be re-evaluated when
                  sufficient space is made available by terminating jobs.

       QTOTLIM    The number of running jobs plus checkpointed jobs on a
                  host permitted to run this job is greater than or equal
                  to the maximum number of such jobs permitted by an
                  administrator. The job will be re-evaluated when any of
                  the active jobs on the host terminates or when
                  configuration limits are increased.

       QTOTLIMU   The number of running jobs plus checkpointed jobs that
                  are owned by the job's owner on a host permitted to run
                  this job is greater than or equal to the maximum number
                  of such jobs owned by a single user permitted by an
                  administrator. The job will be re-evaluated when any of
                  the active jobs owned by the user on the host
                  terminates or when configuration limits are increased.

       REMOVED    The job has been removed by the 'prm(1)' utility.

       RM_STBY    The job is being removed to start a higher priority
                  job. (Only standby jobs will ever have this status.)

       RES_WAIT   At least one of the nonshared resources required by the
                  job is not currently available on the host.
                  The job will be re-evaluated when any active jobs on
                  the host terminate or when the availability of the
                  nonshared resource changes on the host.

       RM_INIT    The job is in the process of being removed from the
                  system.

       RUN        The job has been given to the underlying batch system
                  and the batch system has started the job. Even if a job
                  spends the majority of its time in a sleep state, the
                  job is still considered running.

       STAGING    The job has been submitted by DPCS to the native batch
                  system for it to run and the native batch system has
                  not confirmed yet that the job is actually running.

       STOPPEND   The job is running, but has been removed (prm) with a
                  grace time before it is forcefully removed.

       TIMEOUT    A request has been made to move the job from one host
                  to another but the receiving host did not report the
                  receipt of the job before a time out. The move request
                  is reissued until the move is successful.

       TOOLONG    The job requests more time than the currently
                  configured maximum permitted time. The job will be
                  re-evaluated if its maximum permitted time is decreased
                  (via palter) or if the configured maximum permitted
                  time is increased by an administrator.

       TQUOTA     The job owner's allocation or bank has reached its
                  resource usage quota. The job will be re-evaluated when
                  the user allocation or bank's quota is automatically
                  refreshed or is refreshed by an administrator.

       USED>MAX   The time used by the job is greater than its maximum
                  requested time. This status is only possible for jobs
                  that run on hosts that support checkpointing. On other
                  machines, a job will be terminated if it uses all the
                  time requested for it. The job will be re-evaluated if
                  its maximum permitted time is increased (via palter).

       WAIT       The job has not yet reached its do-not-run-before time.
                  When this time is reached, the job will be
                  re-evaluated.

       WCPU       There are not enough cpus/nodes available for this job
                  to run.

       WHOST      A host permitted to run this job is not currently being
                  managed by DPCS. When communications with the host have
                  been re-established, the job will be re-evaluated.

       WMEML      The memory of a host permitted to run this job would be
                  overcommitted based on the limits set by the system
                  administrators. The job will be re-evaluated when
                  enough memory becomes available via termination of
                  processes and/or jobs.

       WMEMT      The memory of a host permitted to run this job would
                  exceed the target range that has been set by the system
                  administrator. The job will be re-evaluated when enough
                  memory becomes available via termination of processes
                  and/or jobs.

       WPRIO      While there are enough nodes to run this job, doing so
                  would cause the start of a higher priority job to be
                  delayed.

       WSTBY      This job is waiting for one or more standby jobs to be
                  removed so that it can start.

       Possible values for "CL" (class) are:

       N          The job is a normal job.

       S          The job has been specified as a standby job.

       s          The job has been specified as a short production job.

       X          The job has been expedited.

AUTHOR
       Robert R Wood, Lawrence Livermore National Laboratory,
       bwood@llnl.gov

SEE ALSO
       palter, pexp, plim, phold, prel, prm

DOE/LLNL/LCC                  September 4, 2002
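
The STATUS conventions (a leading asterisk for jobs that have not yet run, and the HELD[c][s][u] hold letters) and the -s ordering rules described in this page can be sketched in Python for scripts that post-process pstat output. This is only an illustrative sketch and not part of DPCS: the names Status, HOLD_LEVELS, decode_status and sort_jobs are hypothetical, and jobs are assumed to have been parsed already into dictionaries keyed by full column name, with numeric columns converted to numbers.

```python
from typing import NamedTuple

# Hypothetical helpers for illustration; none of these names come from DPCS.
HOLD_LEVELS = {"u": "user", "c": "coordinator", "s": "system"}


class Status(NamedTuple):
    name: str         # status without the leading '*' or hold letters
    has_run: bool     # False when the status was prefixed with '*'
    holds: frozenset  # hold levels decoded from a HELD[c][s][u] status


def decode_status(field: str) -> Status:
    """Split a STATUS field into its name, run flag, and hold levels."""
    has_run = not field.startswith("*")
    name = field if has_run else field[1:]
    holds = frozenset()
    if name.startswith("HELD"):
        # Letters after HELD name the hold levels that are applied.
        holds = frozenset(HOLD_LEVELS[c] for c in name[4:] if c in HOLD_LEVELS)
        name = "HELD"
    return Status(name, has_run, holds)


def sort_jobs(jobs, sortkey):
    """Order job dicts following the rules given for the -s option."""
    reverse = sortkey.startswith("-")
    column = sortkey[1:] if reverse else sortkey
    if column == "priority":
        # PRIORITY sorts highest first by default; '-' flips it.
        reverse = not reverse
    return sorted(jobs, key=lambda job: job[column], reverse=reverse)
```

Under these assumptions, decode_status("*WAIT") reports a job named WAIT that has not yet run, and sort_jobs(jobs, "-priority") returns the lowest priority job first. Abbreviated field names, which pstat accepts, are deliberately not handled here.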