sacct - displays accounting data for all jobs and job steps in the
Slurm job accounting log or Slurm database
Accounting information for jobs invoked with Slurm is either
logged in the job accounting log file or saved to the Slurm database, as
configured with the AccountingStorageType parameter.
The sacct command displays job accounting data stored in
the job accounting log file or Slurm database in a variety of forms for your
analysis. The sacct command displays information on jobs, job steps,
status, and exitcodes by default. You can tailor the output with the use of
the --format= option to specify the fields to be shown.
Job records consist of a primary entry for the job as a whole as
well as entries for job steps. The Job Launch page has a more detailed
description of each type of job step.
<https://slurm.schedmd.com/job_launch.html#job_record>
For the root user, the sacct command displays job
accounting data for all users, although there are options to filter the
output to report only the jobs from a specified user or group.
For the non-root user, the sacct command limits the display
of job accounting data to jobs that were launched with their own user
identifier (UID) by default. Data for other users can be displayed with the
--allusers, --user, or --uid options.
Elapsed time fields are presented as
[days-]hours:minutes:seconds[.microseconds]. Only 'CPU' fields will ever
have microseconds.
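As an illustration of the time layout described above, the following Python sketch converts such a field to seconds. It assumes the value follows the documented [days-]hours:minutes:seconds[.microseconds] form exactly; the function name elapsed_to_seconds is hypothetical and not part of Slurm, and output that omits leading fields would need a looser pattern.

```python
import re

# Pattern for [days-]hours:minutes:seconds[.microseconds] as documented above.
# Assumption: all three of hours, minutes and seconds are present.
_ELAPSED_RE = re.compile(
    r"^(?:(?P<days>\d+)-)?"
    r"(?P<hours>\d+):(?P<minutes>\d{2}):(?P<seconds>\d{2})"
    r"(?:\.(?P<frac>\d+))?$"
)


def elapsed_to_seconds(value: str) -> float:
    """Convert a time field such as '1-02:03:04' to a number of seconds."""
    match = _ELAPSED_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised time field: {value!r}")
    days = int(match.group("days") or 0)
    hours = int(match.group("hours"))
    minutes = int(match.group("minutes"))
    seconds = int(match.group("seconds"))
    frac = match.group("frac")
    # The optional fractional part only appears on 'CPU' fields.
    fraction = float("0." + frac) if frac else 0.0
    return days * 86400 + hours * 3600 + minutes * 60 + seconds + fraction


if __name__ == "__main__":
    print(elapsed_to_seconds("1-02:03:04"))
```

When exact seconds are needed, raw fields such as ElapsedRaw or CPUTimeRAW avoid this parsing step altogether.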
The default input file is the file named in the
AccountingStorageLoc parameter in slurm.conf.
NOTE: If designated, the slurmdbd.conf option PrivateData
may further restrict the accounting data visible to users which are not
SlurmUser, root, or a user with AdminLevel=Admin. See the slurmdbd.conf man
page for additional details on restricting access to accounting data.
NOTE: The contents of Slurm's database are maintained in
lower case. This may result in some sacct output differing from that
of other Slurm commands.
NOTE: Much of the data reported by sacct has been
generated by the wait3() and getrusage() system calls. Some
systems gather and report incomplete information for these calls;
sacct reports values of 0 for this missing data. See your system's
getrusage(3) man page for information about which data are actually
available on your system.
- -A,
--accounts=<account_list>
- Displays jobs belonging to the accounts in the given comma-separated
list.
-
- -L,
--allclusters
- Display jobs that ran on all clusters. By default, only jobs that ran on
the cluster from which sacct is called are displayed.
-
- -X,
--allocations
- Only show statistics relevant to the job allocation itself, not taking
steps into consideration.
NOTE: Without including steps, utilization statistics
for job allocation(s) will be reported as zero.
-
- -a,
--allusers
- Displays all users' jobs when run by user root or when PrivateData is
not configured to include jobs. Otherwise, displays only the current user's
jobs.
-
- -x,
--associations=<assoc_list>
- Displays the statistics only for the jobs running under the association
ids specified by the assoc_list operand, which is a comma-separated
list of association ids. Space characters are not allowed in the
assoc_list. Default is all associations.
-
- -B,
--batch-script
- Prints the batch script of the job if the job used one. If the
job did not have a script, 'NONE' is output.
NOTE: AccountingStoreFlags=job_script is required for this.
NOTE: Requesting specific job(s) with '-j' is required for this.
-
- -b, --brief
- Displays a brief listing consisting of JobID, State, and ExitCode.
-
- -M,
--clusters=<cluster_list>
- Displays the statistics only for the jobs started on the clusters
specified by the cluster_list operand, which is a comma-separated
list of clusters. Space characters are not allowed in the
cluster_list. A value of 'all' queries all
clusters. The default is the cluster on which you are executing the
sacct command, or all clusters in the federation when executed on
a federated cluster. This option implicitly sets the --local
option.
-
- -c,
--completion
- Use job completion data instead of job accounting. The JobCompType
parameter in the slurm.conf file must be defined to a non-none option.
Does not support federated cluster information (local data only).
-
- -C,
--constraints=<constraint_list>
- Comma-separated list used to filter jobs by the constraints/features the
job requested. Multiple constraints are combined with 'and', not 'or', so a
job must have requested all of the listed constraints to be returned.
-
- --delimiter=<characters>
- ASCII characters used to separate the fields when specifying the -p
or -P options. The default delimiter is a '|'. This option is
ignored if -p or -P options are not specified.
-
- -D,
--duplicates
- If Slurm job ids are reset, some job numbers will probably appear more
than once in the accounting log file but refer to different jobs. Such
jobs can be distinguished by the "submit" time stamp in the data
records.
-
When data for specific jobs are requested with the --jobs
option, sacct returns the most recent job with that number. This
behavior can be overridden by specifying --duplicates, in which case all
records that match the selection criteria will be returned.
-
NOTE: Revoked federated sibling jobs are hidden unless
the --duplicates option is specified.
-
- -E,
--endtime=<end_time>
- Select jobs in any state before the specified time. If states are given
with the -s option, return jobs in those states before this time. See the
DEFAULT TIME WINDOW for more details.
Valid time formats are:
HH:MM[:SS][AM|PM]
MMDD[YY][-HH:MM[:SS]]
MM.DD[.YY][-HH:MM[:SS]]
MM/DD[/YY][-HH:MM[:SS]]
YYYY-MM-DD[THH:MM[:SS]]
today, midnight, noon, fika (3 PM), teatime (4 PM)
now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]
-
- --env-vars
- This option will print the running environment of a batch job, otherwise
'NONE' is output.
NOTE: AccountingStoreFlags=job_env is required for this.
NOTE: Requesting specific job(s) with '-j' is required for this.
-
- --federation
- Show jobs from the federation if a member of one.
-
- -f,
--file=<file>
- Causes the sacct command to read job accounting data from the named
file instead of the current Slurm job accounting log file. Only
applicable when running the jobcomp/filetxt plugin. Setting this flag
implicitly enables the -c flag.
-
- -F,
--flags=<flag_list>
- Comma-separated list used to filter jobs by how they were
handled. Current flags are SchedSubmit, SchedMain, SchedBackfill.
These particular flags identify the scheduler that started the job.
-
- -o, --format
- Comma separated list of fields. (use "--helpformat" for a list
of available fields).
NOTE: When listing fields with the format option, you can
append %NUMBER to a field to specify how many characters
should be printed.
For example, format=name%30 prints 30 characters of the field name,
right justified. %-30 prints 30 characters left justified.
When set, the SACCT_FORMAT environment variable will override
the default format. For example:
SACCT_FORMAT="jobid,user,account,cluster"
-
- -g, --gid=,
--group=<gid_or_group_list>
- Displays the statistics only for the jobs started with the GID or the
GROUP specified by the gid_list or the group_list operand,
which is a comma-separated list. Space characters are not allowed. Default
is no restrictions.
-
- -h, --help
- Displays a general help message.
-
- -e,
--helpformat
- Print a list of fields that can be specified with the --format
option.
-
Fields available:
Account AdminComment AllocCPUS AllocNodes
AllocTRES AssocID AveCPU AveCPUFreq
AveDiskRead AveDiskWrite AvePages AveRSS
AveVMSize BlockID Cluster Comment
Constraints ConsumedEnergy ConsumedEnergyRaw Container
CPUTime CPUTimeRAW DBIndex DerivedExitCode
Elapsed ElapsedRaw Eligible End
ExitCode Flags GID Group
JobID JobIDRaw JobName Layout
MaxDiskRead MaxDiskReadNode MaxDiskReadTask MaxDiskWrite
MaxDiskWriteNode MaxDiskWriteTask MaxPages MaxPagesNode
MaxPagesTask MaxRSS MaxRSSNode MaxRSSTask
MaxVMSize MaxVMSizeNode MaxVMSizeTask McsLabel
MinCPU MinCPUNode MinCPUTask NCPUS
NNodes NodeList NTasks Partition
Priority QOS QOSRAW Reason
ReqCPUFreq ReqCPUFreqGov ReqCPUFreqMax ReqCPUFreqMin
ReqCPUS ReqMem ReqNodes ReqTRES
Reservation ReservationId Reserved ResvCPU
ResvCPURAW Start State Submit
SubmitLine Suspended SystemComment SystemCPU
Timelimit TimelimitRaw TotalCPU TRESUsageInAve
TRESUsageInMax TRESUsageInMaxNode TRESUsageInMaxTask TRESUsageInMin
TRESUsageInMinNode TRESUsageInMinTask TRESUsageInTot TRESUsageOutAve
TRESUsageOutMax TRESUsageOutMaxNode TRESUsageOutMaxTask TRESUsageOutMin
TRESUsageOutMinNode TRESUsageOutMinTask TRESUsageOutTot UID
User UserCPU WCKey WCKeyID
WorkDir
NOTE: Ave[RSS|VMSize] and their corresponding values in
TRESUsageIn[Ave|Tot] represent the average/total of the highest
watermarks over all ranks in the step. When using sstat, they represent the
average/total at the moment the command was run.
NOTE: TRESUsage*Min* values represent the lowest highwater
mark in the step.
The section titled "Job Accounting Fields" describes
these fields.
-
- -j,
--jobs=<job[.step]>
- Displays information about the specified job[.step] or list
of job[.step]s.
The job[.step] parameter is a comma-separated
list of jobs. Space characters are not permitted in this list.
NOTE: A step id of 'batch' will display the information about the
batch step.
By default sacct shows only jobs with an Eligible time, but with this option
non-eligible jobs are also shown.
NOTE: If --state is also specified, non-eligible jobs are not
displayed, because they are not in the PD (PENDING) state. See the DEFAULT
TIME WINDOW for details about how this option changes the default -S and
-E options.
-
- --json
- Dump job information as JSON. All other formatting arguments will be
ignored.
-
- --local
- Show only jobs local to this cluster. Ignore other clusters in this
federation (if any). Overrides --federation.
-
- -l, --long
- Equivalent to specifying:
--format=jobid,jobidraw,jobname,partition,maxvmsize,maxvmsizenode,
maxvmsizetask,avevmsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,
maxpagesnode,maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,
alloccpus,elapsed,state,exitcode,avecpufreq,reqcpufreqmin,reqcpufreqmax,
reqcpufreqgov,reqmem,consumedenergy,maxdiskread,maxdiskreadnode,maxdiskreadtask,
avediskread,maxdiskwrite,maxdiskwritenode,maxdiskwritetask,avediskwrite,
reqtres,alloctres,tresusageinave,tresusageinmax,
tresusageinmaxn,tresusageinmaxt,tresusageinmin,tresusageinminn,tresusageinmint,
tresusageintot,tresusageoutmax,tresusageoutmaxn,
tresusageoutmaxt,tresusageoutave,tresusageouttot
-
- --name=<jobname_list>
- Display jobs that have any of these name(s).
-
- -i,
--nnodes=<min[-max]>
- Return jobs that ran on the specified number of nodes.
-
- -I,
--ncpus=<min[-max]>
- Return jobs that ran on the specified number of cpus.
-
- --noconvert
- Don't convert units from their original type (e.g. 2048M won't be
converted to 2G).
-
- -N,
--nodelist=<node_list>
- Display jobs that ran on any of these node(s). node_list can be a
ranged string.
-
- -n,
--noheader
- No heading will be added to the output. The default action is to display a
header.
-
- -p,
--parsable
- Output will be '|' delimited with a '|' at the end. See also the
--delimiter option.
-
- -P,
--parsable2
- Output will be '|' delimited without a '|' at the end. See also the
--delimiter option.
-
- -r,
--partition
- Comma separated list of partitions to select jobs and job steps from. The
default is all partitions.
-
- -q, --qos
- Only send data about jobs using these qos. Default is all.
-
- -R,
--reason=<reason_list>
- Comma-separated list used to filter jobs by the reason the job wasn't
scheduled, other than resources/priority.
-
- -S,
--starttime
- Select jobs in any state after the specified time. Default is 00:00:00 of
the current day, unless the '-s' or '-j' options are used. If the '-s'
option is used, then the default is 'now'. If states are given with the
'-s' option then only jobs in this state at this time will be returned. If
the '-j' option is used, then the default time is Unix Epoch 0. See the
DEFAULT TIME WINDOW for more details.
Valid time formats are:
HH:MM[:SS][AM|PM]
MMDD[YY][-HH:MM[:SS]]
MM.DD[.YY][-HH:MM[:SS]]
MM/DD[/YY][-HH:MM[:SS]]
YYYY-MM-DD[THH:MM[:SS]]
today, midnight, noon, fika (3 PM), teatime (4 PM)
now[{+|-}count[seconds(default)|minutes|hours|days|weeks]]
-
- -s,
--state=<state_list>
- Selects jobs based on their state during the time period given. Unless
otherwise specified, the start and end time will be the current time when
the --state option is specified and only currently running jobs can
be displayed. A start and/or end time must be specified to view
information about jobs not currently running. See the JOB STATE
CODES section below for a list of state designators. Multiple state
names may be specified using comma separators. Either the short or long
form of the state name may be used (e.g. CA or CANCELLED)
and the name is case insensitive (i.e. ca and CA both work).
NOTE: For a job to be selected in the PENDING
state it must have "EligibleTime" in the requested time
interval or different from "Unknown". The
"EligibleTime" is displayed by the "scontrol show
job" command. For example jobs submitted with the
"--hold" option will have "EligibleTime=Unknown" as
they are pending indefinitely.
NOTE: When specifying states and no start time is given
the default start time is 'now'. This is only when -j is not used. If -j
is used the start time will default to 'Epoch'. In both cases if no end
time is given it will default to 'now'. See the DEFAULT TIME
WINDOW for more details.
-
- -K,
--timelimit-max
- Ignored by itself, but if --timelimit-min is set this will be the maximum
timelimit of the range. Default is no restriction.
-
- -k,
--timelimit-min
- Only send data about jobs with this timelimit. If used with
--timelimit-max this will be the minimum timelimit of the range. Default is
no restriction.
-
- -T,
--truncate
- Truncate time. So if a job started before --starttime the start time would
be truncated to --starttime. The same for end time and --endtime.
-
- -u, --uid=,
--user=<uid_or_user_list>
- Use this comma separated list of UIDs or user names to select jobs to
display. By default, the running user's UID is used.
-
- --units=[KMGTP]
- Display values in specified unit type. Takes precedence over
--noconvert option.
-
- --usage
- Display a command usage summary.
-
- --use-local-uid
- When displaying UID, sacct uses the UID stored in Slurm's accounting
database by default. Use this option to make Slurm use a system call to
get the UID from the username. This option may be useful in an environment
with multiple clusters and one database where the UIDs aren't the same on
all clusters.
-
- -v, --verbose
- Primarily for debugging purposes, report the state of various variables
during processing.
-
- -V,
--version
- Print version.
-
- -W,
--wckeys=<wckey_list>
- Displays the statistics only for the jobs started on the wckeys specified
by the wckey_list operand, which is a comma-separated list of wckey
names. Space characters are not allowed in the wckey_list. Default
is all wckeys.
-
- --whole-hetjob[=yes|no]
- When querying and filtering heterogeneous jobs with --jobs, Slurm
will default to retrieving information about all the components of the job
if the het_job_id (leader id) is selected. If a non-leader heterogeneous
job component id is selected then only that component is retrieved by
default. This behavior can be changed by using this option. If set to
'yes' (or no argument), then information about all the components will be
retrieved no matter which component is selected in the job filter. If set
to 'no' then only the selected heterogeneous job component(s) will be
retrieved, even when selecting the leader.
-
- --yaml
- Dump job information as YAML. All other formatting arguments will be
ignored.
-
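The -P/--parsable2 output described above is convenient for scripts. The following Python sketch splits such output into one dictionary per row. It assumes the first line is the header (that is, --noheader was not used) and that no field value contains the delimiter; the function name and the sample text are hypothetical.

```python
from typing import Dict, List


def parse_parsable2(text: str, delimiter: str = "|") -> List[Dict[str, str]]:
    """Split --parsable2 style output into one dict per record.

    Assumes the first non-empty line is the header and that no field
    contains the delimiter (see --delimiter to choose another one).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split(delimiter)
    return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]


# Hypothetical sample resembling 'sacct --brief --parsable2' output.
SAMPLE = "JobID|State|ExitCode\n2|RUNNING|0:0\n4.0|COMPLETED|0:0\n"

if __name__ == "__main__":
    for record in parse_parsable2(SAMPLE):
        print(record["JobID"], record["State"])
```

With -p/--parsable every line also ends with the delimiter, so each split would yield one extra empty field; --parsable2 avoids that.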
Descriptions of each job accounting field can be found below. Note
that the Ave*, Max* and Min* accounting fields look at the values for all
the tasks of each step in a job and return the average, maximum or minimum
values for the job step.
- ALL
- Print all fields listed below.
-
- Account
- Account the job ran under.
-
- AdminComment
- A comment string on a job that must be set by an administrator, the
SlurmUser or root.
-
- AllocCPUs
- Count of allocated CPUs. Equivalent to NCPUS.
-
- AllocNodes
- Number of nodes allocated to the job/step. 0 if the job is pending.
-
- AllocTres
- Trackable resources. These are the resources allocated to the job/step
after the job started running. For pending jobs this should be blank. For
more details see AccountingStorageTRES in slurm.conf.
NOTE: When a generic resource is configured with the
no_consume flag, the allocation will be printed with a zero.
-
- AssocID
- Reference to the association of user, account and cluster.
-
- AveCPU
- Average (system + user) CPU time of all tasks in job.
-
- AveCPUFreq
- Average weighted CPU frequency of all tasks in job, in kHz.
-
- AveDiskRead
- Average number of bytes read by all tasks in job.
-
- AveDiskWrite
- Average number of bytes written by all tasks in job.
-
- AvePages
- Average number of page faults of all tasks in job.
-
- AveRSS
- Average resident set size of all tasks in job.
-
- AveVMSize
- Average Virtual Memory size of all tasks in job.
-
- BlockID
- The name of the block to be used (used with Blue Gene systems).
-
- Cluster
- Cluster name.
-
- Comment
- The job's comment string when the AccountingStoreFlags parameter in the
slurm.conf file contains 'job_comment'. The Comment string can be modified
by invoking sacctmgr modify job or the specialized
sjobexitmod command.
-
- Constraints
- Feature(s) the job requested as a constraint.
-
- ConsumedEnergy
- Total energy consumed by all tasks in a job, in joules. Value may include
a unit prefix (K,M,G,T,P). Note: Only in the case of an exclusive job
allocation does this value reflect the job's real energy consumption.
-
- ConsumedEnergyRaw
- Total energy consumed by all tasks in a job, in joules. Note: Only in the
case of an exclusive job allocation does this value reflect the job's real
energy consumption.
-
- Container
- Path to OCI Container Bundle requested.
-
- CPUTime
- Time used (Elapsed time * CPU count) by a job or step in HH:MM:SS
format.
-
- CPUTimeRAW
- Time used (Elapsed time * CPU count) by a job or step in cpu-seconds.
-
- DBIndex
- Unique database index for entries in the job table.
-
- DerivedExitCode
- The highest exit code returned by the job's job steps (srun invocations).
Following the colon is the signal that caused the process to terminate if
it was terminated by a signal. The DerivedExitCode can be modified by
invoking sacctmgr modify job or the specialized sjobexitmod
command.
-
- Elapsed
- The job's elapsed time.
The format of this field's output is
[days-]hours:minutes:seconds, as used for all elapsed time fields.
-
- ElapsedRaw
- The job's elapsed time in seconds.
-
- Eligible
- When the job became eligible to run. In the same format as
End.
-
- End
- Termination time of the job. The output is of the format
YYYY-MM-DDTHH:MM:SS, unless changed through the SLURM_TIME_FORMAT
environment variable.
-
- ExitCode
- The exit code returned by the job script or salloc, typically as set by
the exit() function. Following the colon is the signal that caused the
process to terminate if it was terminated by a signal.
-
- Flags
- Job flags. Current flags are SchedSubmit, SchedMain, SchedBackfill.
-
- GID
- The group identifier of the user who ran the job.
-
- Group
- The group name of the user who ran the job.
-
- JobID
- The identification number of the job or job step.
-
Regular jobs are in the form:
-
JobID[.JobStep]
Array jobs are in the form:
-
ArrayJobID_ArrayTaskID
Heterogeneous jobs are in the form:
-
HetJobID+HetJobOffset
When printing job arrays, performance of the command can be
measurably improved for systems with large numbers of jobs when a single
job ID is specified. By default, this field size will be limited to 64
bytes. Use the environment variable SLURM_BITSTR_LEN to specify larger
field sizes.
-
- JobIDRaw
- The identification number of the job or job step. Prints the JobID in the
form JobID[.JobStep] for regular, heterogeneous and array
jobs.
-
- JobName
- The name of the job or job step. The slurm_accounting.log file is a
space delimited file, so any space in a jobname is replaced with an
underscore before the record is written to the accounting file. As a
result, sacct displays such a jobname with underscores in place of the
spaces.
-
- Layout
- What the layout of a step was when it was running. This can be used to
give you an idea of which node ran which rank in your job.
-
- MaxDiskRead
- Maximum number of bytes read by all tasks in job.
-
- MaxDiskReadNode
- The node on which the maxdiskread occurred.
-
- MaxDiskReadTask
- The task ID where the maxdiskread occurred.
-
- MaxDiskWrite
- Maximum number of bytes written by all tasks in job.
-
- MaxDiskWriteNode
- The node on which the maxdiskwrite occurred.
-
- MaxDiskWriteTask
- The task ID where the maxdiskwrite occurred.
-
- MaxPages
- Maximum number of page faults of all tasks in job.
-
- MaxPagesNode
- The node on which the maxpages occurred.
-
- MaxPagesTask
- The task ID where the maxpages occurred.
-
- MaxRSS
- Maximum resident set size of all tasks in job.
-
- MaxRSSNode
- The node on which the maxrss occurred.
-
- MaxRSSTask
- The task ID where the maxrss occurred.
-
- MaxVMSize
- Maximum Virtual Memory size of all tasks in job.
-
- MaxVMSizeNode
- The node on which the maxvmsize occurred.
-
- MaxVMSizeTask
- The task ID where the maxvmsize occurred.
-
- MCSLabel
- Multi-Category Security (MCS) label associated with the job. Added to a
job when the MCSPlugin is enabled in the slurm.conf.
-
- MinCPU
- Minimum (system + user) CPU time of all tasks in job.
-
- MinCPUNode
- The node on which the mincpu occurred.
-
- MinCPUTask
- The task ID where the mincpu occurred.
-
- NCPUS
- Total number of CPUs allocated to the job. Equivalent to
AllocCPUS.
-
- NNodes
- Number of nodes in a job or step. If the job is running, or ran, this
count will be the number allocated, else the number will be the number
requested.
-
- NodeList
- List of nodes in job/step.
-
- NTasks
- Total number of tasks in a job or step.
-
- Partition
- Identifies the partition on which the job ran.
-
- Priority
- Slurm priority.
-
- QOS
- Name of Quality of Service.
-
- QOSRAW
- Numeric id of Quality of Service.
-
- Reason
- The last reason a job was blocked from running for something other than
Priority or Resources. This will be saved in the database even if the job
ran to completion.
-
- ReqCPUFreq
- Requested CPU frequency for the step, in kHz. Note: This value applies
only to a job step. No value is reported for the job.
-
- ReqCPUFreqGov
- Requested CPU frequency governor for the step. Note: This value
applies only to a job step. No value is reported for the job.
-
- ReqCPUFreqMax
- Maximum requested CPU frequency for the step, in kHz. Note: This value
applies only to a job step. No value is reported for the job.
-
- ReqCPUFreqMin
- Minimum requested CPU frequency for the step, in kHz. Note: This value
applies only to a job step. No value is reported for the job.
-
- ReqCPUS
- Number of requested CPUs.
-
- ReqMem
- Minimum required memory for the job. It may have a letter appended to it
indicating units (M for megabytes, G for gigabytes, etc.). Note: This
value is only from the job allocation, not the step.
-
- ReqNodes
- Requested minimum Node count for the job/step.
-
- ReqTres
- Trackable resources. These are the minimum resource counts requested by
the job/step at submission time. For more details see
AccountingStorageTRES in slurm.conf.
-
- Reservation
- Reservation Name.
-
- ReservationId
- Reservation Id.
-
- Reserved
- How much wall clock time was used as reserved time for this job. This is
derived from how long a job was waiting from eligible time to when it
actually started. Format is the same as Elapsed.
-
- ResvCPU
- How many CPU seconds were used as reserved time for this job. Format is
the same as Elapsed.
-
- ResvCPURAW
- How many CPU seconds were used as reserved time for this job. Format is in
processor seconds.
-
- Start
- Initiation time of the job. In the same format as End.
-
- State
- Displays the job status, or state. See the JOB STATE CODES section
below for a list of possible states.
If more information is available on the job state than will
fit into the current field width (for example, the UID that CANCELLED a
job) the state will be followed by a "+". You can increase the
size of the displayed state using the "%NUMBER" format
modifier described earlier.
NOTE: The RUNNING state will return suspended jobs as well. To
print only suspended jobs, request SUSPENDED in a separate call from
RUNNING.
NOTE: The RUNNING state will return any jobs completed
(cancelled or otherwise) in the time period requested as the job was
also RUNNING during that time. If you are only looking for jobs that
finished, please choose the appropriate state(s) without the RUNNING
state.
-
- Submit
- The time the job was submitted. In the same format as End.
NOTE: If a job is requeued, the submit time is reset. To
obtain the original submit time it is necessary to use the -D or
--duplicates option to display all duplicate entries for a job.
-
- SubmitLine
- The full command issued to submit the job.
-
- Suspended
- The amount of time a job or job step was suspended. Format is the same as
Elapsed.
-
- SystemComment
- The job's comment string that is typically set by a plugin. Can only be
modified by a Slurm administrator.
-
- SystemCPU
- The amount of system CPU time used by the job or job step. Format is the
same as Elapsed.
NOTE: SystemCPU provides a measure of the task's parent
process and does not include CPU time of child processes.
-
- Timelimit
- What the timelimit was/is for the job. Format is the same as
Elapsed.
-
- TimelimitRaw
- What the timelimit was/is for the job. Format is in number of
minutes.
-
- TotalCPU
- The sum of the SystemCPU and UserCPU time used by the job or job step. The
total CPU time of the job may exceed the job's elapsed time for jobs that
include multiple job steps. Format is the same as Elapsed.
NOTE: TotalCPU provides a measure of the task's parent process
and does not include CPU time of child processes.
-
- TresUsageInAve
- Tres average usage in by all tasks in job. NOTE: If corresponding
TresUsageInMaxTask is -1 the metric is node centric instead of task.
-
- TresUsageInMax
- Tres maximum usage in by all tasks in job. NOTE: If corresponding
TresUsageInMaxTask is -1 the metric is node centric instead of task.
-
- TresUsageInMaxNode
- Node for which each maximum TRES usage in occurred.
-
- TresUsageInMaxTask
- Task for which each maximum TRES usage in occurred.
-
- TresUsageInMin
- Tres minimum usage in by all tasks in job. NOTE: If corresponding
TresUsageInMinTask is -1 the metric is node centric instead of task.
-
- TresUsageInMinNode
- Node for which each minimum TRES usage in occurred.
-
- TresUsageInMinTask
- Task for which each minimum TRES usage in occurred.
-
- TresUsageInTot
- Tres total usage in by all tasks in job.
-
- TresUsageOutAve
- Tres average usage out by all tasks in job. NOTE: If corresponding
TresUsageOutMaxTask is -1 the metric is node centric instead of task.
-
- TresUsageOutMax
- Tres maximum usage out by all tasks in job. NOTE: If corresponding
TresUsageOutMaxTask is -1 the metric is node centric instead of task.
-
- TresUsageOutMaxNode
- Node for which each maximum TRES usage out occurred.
-
- TresUsageOutMaxTask
- Task for which each maximum TRES usage out occurred.
-
- TresUsageOutMin
- Tres minimum usage out by all tasks in job.
-
- TresUsageOutMinNode
- Node for which each minimum TRES usage out occurred.
-
- TresUsageOutMinTask
- Task for which each minimum TRES usage out occurred.
-
- TresUsageOutTot
- Tres total usage out by all tasks in job.
-
- UID
- The user identifier of the user who ran the job.
-
- User
- The user name of the user who ran the job.
-
- UserCPU
- The amount of user CPU time used by the job or job step. Format is the
same as Elapsed.
NOTE: UserCPU provides a measure of the task's parent process
and does not include CPU time of child processes.
-
- WCKey
- Workload Characterization Key. Arbitrary string for grouping orthogonal
accounts together.
-
- WCKeyID
- Reference to the wckey.
-
- WorkDir
- The directory used by the job to execute commands.
-
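The JobID forms listed above (JobID[.JobStep], ArrayJobID_ArrayTaskID and HetJobID+HetJobOffset) can be told apart with a single pattern. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that a step suffix may also follow an array or heterogeneous job ID, which the field description does not state explicitly.

```python
import re
from typing import Dict, Optional

# job, then an optional array task (_N) or het offset (+N),
# then an optional step (.N, .batch, ...).
_JOBID_RE = re.compile(
    r"^(?P<job>\d+)"
    r"(?:_(?P<array_task>[^.+]+)|\+(?P<het_offset>\d+))?"
    r"(?:\.(?P<step>.+))?$"
)


def split_job_id(job_id: str) -> Dict[str, Optional[str]]:
    """Break a JobID string into its job, array task, het offset and step parts."""
    match = _JOBID_RE.match(job_id.strip())
    if match is None:
        raise ValueError(f"unrecognised JobID: {job_id!r}")
    return match.groupdict()


if __name__ == "__main__":
    for sample in ("4.0", "123_7", "50+1", "123_7.batch"):
        print(sample, split_job_id(sample))
```

Parts that are absent are returned as None, so a caller can distinguish a whole-job record (no step) from a step record.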
- BF BOOT_FAIL
- Job terminated due to launch failure, typically due to a hardware failure
(e.g. unable to boot the node or block and the job can not be
requeued).
-
- CA CANCELLED
- Job was explicitly cancelled by the user or system administrator. The job
may or may not have been initiated.
-
- CD COMPLETED
- Job has terminated all processes on all nodes with an exit code of
zero.
-
- DL DEADLINE
- Job terminated on deadline.
-
- F FAILED
- Job terminated with non-zero exit code or other failure condition.
-
- NF NODE_FAIL
- Job terminated due to failure of one or more allocated nodes.
-
- OOM
OUT_OF_MEMORY
- Job experienced out of memory error.
-
- PD PENDING
- Job is awaiting resource allocation.
-
- PR PREEMPTED
- Job terminated due to preemption.
-
- R RUNNING
- Job currently has an allocation.
-
- RQ REQUEUED
- Job was requeued.
-
- RS RESIZING
- Job is about to change size.
-
- RV REVOKED
- Sibling was removed from cluster due to other cluster starting the
job.
-
- S SUSPENDED
- Job has an allocation, but execution has been suspended and CPUs have been
released for other jobs.
-
- TO TIMEOUT
- Job terminated upon reaching its time limit.
-
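Because either the short or the long form of a state may be used and names are case insensitive, scripts often normalize states before comparing them. The following Python sketch builds a lookup table from the codes listed above. The function name is hypothetical, and the handling of a trailing "+" or extra words after the state (such as the UID that cancelled a job) is an assumption about the State field based on its description.

```python
# Short and long state names, taken from the list above.
STATE_CODES = {
    "BF": "BOOT_FAIL", "CA": "CANCELLED", "CD": "COMPLETED",
    "DL": "DEADLINE", "F": "FAILED", "NF": "NODE_FAIL",
    "OOM": "OUT_OF_MEMORY", "PD": "PENDING", "PR": "PREEMPTED",
    "R": "RUNNING", "RQ": "REQUEUED", "RS": "RESIZING",
    "RV": "REVOKED", "S": "SUSPENDED", "TO": "TIMEOUT",
}


def normalize_state(name: str) -> str:
    """Return the long state name for a short or long, any-case state string."""
    words = name.strip().upper().split()
    if not words:
        raise ValueError("empty state")
    # Assumption: keep only the first word and drop a truncation marker '+'.
    key = words[0].rstrip("+")
    if key in STATE_CODES:
        return STATE_CODES[key]
    if key in STATE_CODES.values():
        return key
    raise ValueError(f"unknown state: {name!r}")


if __name__ == "__main__":
    print(normalize_state("ca"))
    print(normalize_state("Completed"))
```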
The options --starttime and --endtime define the time window
between which sacct is going to search. For historical and practical
reasons their default values (i.e. the default time window) depend on other
options: --jobs and --state.
Depending on if --jobs and/or --state are specified, the default
values of --starttime and --endtime options are:
WITH NEITHER --jobs NOR --state specified:
--starttime defaults to Midnight.
--endtime defaults to Now.
WITH --jobs AND WITHOUT --state specified:
--starttime defaults to Epoch 0.
--endtime defaults to Now.
WITHOUT --jobs AND WITH --state specified:
--starttime defaults to Now.
--endtime defaults to --starttime or to Now if --starttime is not
specified.
WITH BOTH --jobs AND --state specified:
--starttime defaults to Epoch 0.
--endtime defaults to --starttime or to Now if --starttime is not
specified.
NOTE: With -v/--verbose a message about the actual
time window in use is shown.
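The four cases above can be summarized as a small decision function. The Python sketch below mirrors the rules as stated in this section. It is not Slurm's implementation: the function name is hypothetical and "midnight", "epoch" and "now" are symbolic placeholders rather than real timestamps.

```python
from typing import Optional, Tuple


def default_time_window(
    jobs_given: bool,
    state_given: bool,
    starttime: Optional[str] = None,
    endtime: Optional[str] = None,
) -> Tuple[str, str]:
    """Return the (starttime, endtime) pair in effect, per the rules above."""
    if starttime is not None:
        start = starttime
    elif jobs_given:
        start = "epoch"      # --jobs: search from Epoch 0
    elif state_given:
        start = "now"        # --state without --jobs: only the current moment
    else:
        start = "midnight"   # neither option: since 00:00:00 today

    if endtime is not None:
        end = endtime
    elif state_given and starttime is not None:
        end = starttime      # --state: endtime follows an explicit --starttime
    else:
        end = "now"
    return start, end


if __name__ == "__main__":
    print(default_time_window(jobs_given=False, state_given=True))
```

The state-only case returning ("now", "now") is why --state alone shows only jobs in that state at the current moment.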
Executing sacct sends a remote procedure call to
slurmdbd. If enough calls from sacct or other Slurm client
commands that send remote procedure calls to the slurmdbd daemon come
in at once, it can result in a degradation of performance of the
slurmdbd daemon, possibly resulting in a denial of service.
Do not run sacct or other Slurm client commands that send
remote procedure calls to slurmdbd from loops in shell scripts or
other programs. Ensure that programs limit calls to sacct to the
minimum necessary for the information you are trying to gather.
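One way to follow this advice is to collect all job IDs of interest and pass them to a single sacct invocation instead of calling sacct once per job. The Python sketch below only builds the argument list and does not run anything. The function name and the job IDs are hypothetical; the options used (--jobs, --format, --parsable2, --noheader) are those documented above.

```python
from typing import Iterable, List


def build_sacct_command(job_ids: Iterable[object], fields: Iterable[str]) -> List[str]:
    """Build one sacct argument list covering every requested job."""
    jobs = ",".join(str(job_id) for job_id in job_ids)
    if not jobs:
        raise ValueError("at least one job id is required")
    return [
        "sacct",
        "--jobs=" + jobs,                 # comma-separated, no spaces
        "--format=" + ",".join(fields),
        "--parsable2",
        "--noheader",
    ]


if __name__ == "__main__":
    # Hypothetical job IDs; one RPC to slurmdbd instead of three.
    print(build_sacct_command([101, 102, 103], ["JobID", "State", "ExitCode"]))
```

The resulting list could be handed to a process launcher such as subprocess.run once per batch of jobs, rather than once per job inside a loop.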
Some sacct options may be set via environment variables.
These environment variables, along with their corresponding options, are
listed below. (Note: Command line options will always override these
settings.)
- SACCT_FEDERATION
- Same as --federation
-
- SACCT_LOCAL
- Same as --local
-
- SLURM_BITSTR_LEN
- Specifies the string length to be used for holding a job array's task ID
expression. The default value is 64 bytes. A value of 0 will print the
full expression with any length required. Larger values may adversely
impact the application performance.
-
- SLURM_CONF
- The location of the Slurm configuration file.
-
- SLURM_DEBUG_FLAGS
- Specify debug flags for sacct to use. See DebugFlags in the
slurm.conf(5) man page for a full list of flags. The environment
variable takes precedence over the setting in the slurm.conf.
-
- SLURM_TIME_FORMAT
- Specify the format used to report time stamps. A value of standard,
the default value, generates output in the form
"year-month-dateThour:minute:second". A value of relative
returns only "hour:minute:second" for time stamps on the current day. For
other dates in the current year it prints the "hour:minute" preceded
by "Tomorr" (tomorrow), "Ystday" (yesterday), the name
of the day for the coming week (e.g. "Mon", "Tue",
etc.), otherwise the date (e.g. "25 Apr"). For other years it
returns a date, month, and year without a time (e.g. "6 Jun
2012"). All of the time stamps use a 24-hour format.
A valid strftime() format can also be specified. For example,
a value of "%a %T" will report the day of the week and a time
stamp (e.g. "Mon 12:34:56").
-
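As a rough preview of the SLURM_TIME_FORMAT values described above, the same strftime() patterns can be tried in Python before exporting the variable. This is only a sketch: the sample time stamp is arbitrary, the variable names are hypothetical, and "%T" is spelled out as "%H:%M:%S" so that the preview does not depend on the platform's C library.

```python
import os
from datetime import datetime

stamp = datetime(2012, 6, 4, 12, 34, 56)  # arbitrary sample time (a Monday)

# Layout of the "standard" value: year-month-dateThour:minute:second.
standard = stamp.strftime("%Y-%m-%dT%H:%M:%S")

# Custom layout equivalent to "%a %T" ("%T" is shorthand for "%H:%M:%S").
custom = stamp.strftime("%a %H:%M:%S")

# Environment for a child process that would run sacct with the custom format.
env = {**os.environ, "SLURM_TIME_FORMAT": "%a %T"}

print(standard)
print(custom)
```

The env mapping could be passed as the env argument of subprocess.run when invoking sacct, which leaves the calling shell's own environment unchanged.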
This example illustrates the default invocation of the
sacct command:
# sacct
Jobid      Jobname    Partition  Account    AllocCPUS  State      ExitCode
---------- ---------- ---------- ---------- ---------- ---------- --------
2          script01   srun       acct1      1          RUNNING    0
3          script02   srun       acct1      1          RUNNING    0
4          endscript  srun       acct1      1          RUNNING    0
4.0                   srun       acct1      1          COMPLETED  0
This example shows the same job accounting information with the
brief option.
# sacct --brief
Jobid      State      ExitCode
---------- ---------- --------
2          RUNNING    0
3          RUNNING    0
4          RUNNING    0
4.0        COMPLETED  0
This example uses the --allocations option to show only the job
allocations, without the individual job steps.
# sacct --allocations
Jobid      Jobname    Partition  Account    AllocCPUS  State      ExitCode
---------- ---------- ---------- ---------- ---------- ---------- --------
3          sja_init   andy       acct1      1          COMPLETED  0
4          sjaload    andy       acct1      2          COMPLETED  0
5          sja_scr1   andy       acct1      1          COMPLETED  0
6          sja_scr2   andy       acct1      18         COMPLETED  2
7          sja_scr3   andy       acct1      18         COMPLETED  0
8          sja_scr5   andy       acct1      2          COMPLETED  0
9          sja_scr7   andy       acct1      90         COMPLETED  1
10         endscript  andy       acct1      186        COMPLETED  0
This example demonstrates the ability to customize the output of
the sacct command. The fields are displayed in the order designated
on the command line.
# sacct --format=jobid,elapsed,ncpus,ntasks,state
Jobid      Elapsed    Ncpus      Ntasks   State
---------- ---------- ---------- -------- ----------
3          00:01:30   2          1        COMPLETED
3.0        00:01:30   2          1        COMPLETED
4          00:00:00   2          2        COMPLETED
4.0        00:00:01   2          2        COMPLETED
5          00:01:23   2          1        COMPLETED
5.0        00:01:31   2          1        COMPLETED
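Scripts that post-process this output often need the Elapsed column as a duration. A minimal sketch, assuming the documented [days-]hours:minutes:seconds[.microseconds] layout; the function name parse_elapsed is hypothetical:

```python
from datetime import timedelta


def parse_elapsed(text):
    """Convert '[days-]hours:minutes:seconds[.microseconds]' to a timedelta."""
    days, _, clock = text.rpartition("-")      # the days part is optional
    hours, minutes, seconds = clock.split(":")
    return timedelta(
        days=int(days or 0),
        hours=int(hours),
        minutes=int(minutes),
        seconds=float(seconds),                # float keeps any fractional part
    )


print(parse_elapsed("00:01:30").total_seconds())
```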
This example demonstrates the use of the -T (--truncate) option
when used with -S (--starttime) and -E (--endtime). When the -T option is
used, the start time of the job will be the specified -S value if the job
was started before the specified time, otherwise the time will be the job's
start time. The end time will be the specified -E option if the job ends
after the specified time, otherwise it will be the job's end time.
Without -T (normal operation), sacct output looks like this:
# sacct -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
JobID     Start                 End                  State
--------- --------------------- -------------------- ------------
2         2014-07-03T11:33:16   2014-07-03T11:59:01  COMPLETED
3         2014-07-03T11:35:21   Unknown              RUNNING
4         2014-07-03T11:35:21   2014-07-03T11:45:21  COMPLETED
5         2014-07-03T11:41:01   Unknown              RUNNING
By adding the -T option the job's start and end times are
truncated to reflect only the time requested. If a job started after the
start time requested or finished before the end time requested, those times
are not altered. The -T option is useful when determining exact run times
during any given period.
# sacct -T -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
JobID     Start                 End                  State
--------- --------------------- -------------------- ------------
2         2014-07-03T11:40:00   2014-07-03T11:59:01  COMPLETED
3         2014-07-03T11:40:00   2014-07-03T12:00:00  RUNNING
4         2014-07-03T11:40:00   2014-07-03T11:45:21  COMPLETED
5         2014-07-03T11:41:01   2014-07-03T12:00:00  RUNNING
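The effect of -T on the Start and End columns amounts to clamping each job's times to the requested window. The sketch below illustrates that arithmetic with values from the example; it is not sacct's implementation, and the function name truncate and the use of None for an "Unknown" end time are assumptions made for the illustration.

```python
from datetime import datetime


def truncate(start, end, window_start, window_end):
    """Clamp a job's start/end to the window; end=None means still running."""
    shown_start = max(start, window_start)
    shown_end = window_end if end is None else min(end, window_end)
    return shown_start, shown_end


S = datetime(2014, 7, 3, 11, 40, 0)   # -S value
E = datetime(2014, 7, 3, 12, 0, 0)    # -E value

# Job 2 started before the window and ended inside it.
print(truncate(datetime(2014, 7, 3, 11, 33, 16),
               datetime(2014, 7, 3, 11, 59, 1), S, E))
# Job 5 started inside the window and is still running.
print(truncate(datetime(2014, 7, 3, 11, 41, 1), None, S, E))
```

The difference between the two returned values is the run time that falls inside the requested period.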
NOTE: If no -s (--state) option is given,
sacct will display eligible jobs during the specified period of time;
otherwise it will return jobs that were in the state requested during that
period of time.
This example demonstrates the difference between running sacct with and
without the --state flag for the same time period. Without the
--state option, all eligible jobs in that time period are shown.
# sacct -S11:20:00 -E11:25:00 -X -ojobid,start,end,state
JobID        Start               End                 State
------------ ------------------- ------------------- ----------
2955         11:15:12            11:20:12            COMPLETED
2956         11:20:13            11:25:13            COMPLETED
With the --state=pending option, only job 2956 will be
shown because it had a dependency on 2955 and was still PENDING from
11:20:00 until it started at 11:20:13. Note that even though we requested
PENDING jobs, the State shows as COMPLETED because that is the current State
of the job.
# sacct --state=pending -S11:20:00 -E11:25:00 -X -ojobid,start,end,state
JobID        Start               End                 State
------------ ------------------- ------------------- ----------
2956         11:20:13            11:25:13            COMPLETED
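The selection in this example can be pictured as an interval-overlap test: a job matches --state=pending if it was pending at some point inside the requested window. The sketch below is an interpretation of the behavior described above, not sacct's actual query logic. The function name was_pending_during is hypothetical, and the submit time of 11:15:00 used for both jobs is an assumption, since the example does not show submit times.

```python
from datetime import datetime


def was_pending_during(submit, start, window_start, window_end):
    """True if the pending interval [submit, start) overlaps the window."""
    return submit < window_end and start > window_start


def t(text):
    """Parse an 'HH:MM:SS' time of day (the date part is irrelevant here)."""
    return datetime.strptime(text, "%H:%M:%S")


S, E = t("11:20:00"), t("11:25:00")
assumed_submit = t("11:15:00")   # assumption: not shown in the example output

print(was_pending_during(assumed_submit, t("11:15:12"), S, E))  # job 2955
print(was_pending_during(assumed_submit, t("11:20:13"), S, E))  # job 2956
```

Under these assumptions only job 2956 overlaps the window while pending, which matches the output shown above even though its current State is COMPLETED.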
Copyright (C) 2005-2007 Hewlett-Packard Development
Company L.P.
Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced at
Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2022 SchedMD LLC.
This file is part of Slurm, a resource management program. For
details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
Slurm is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
- /etc/slurm.conf
- Entries to this file enable job accounting and designate the job
accounting log file that collects system job accounting.
-
- /var/log/slurm_accounting.log
- The default job accounting log file. By default, this file is set to read
and write permission for root only.