DOKK / manpages / debian 12 / oar-user / oarsub.1.en
oarsub(1) OAR commands oarsub(1)

oarsub - OAR batch scheduler job submission command.

oarsub [OPTIONS] <executable program>

oarsub [OPTIONS] <script file>

oarsub [OPTIONS] "<inline script>"

oarsub [OPTIONS] -I

oarsub [OPTIONS] -C <JOB ID>

One uses oarsub to submit a job to the OAR batch scheduler managing the resources of a HPC Cluster. A job is defined by the description of a set of resources needed to execute a task, and a script or executable to run. A job may also be run interactively, and one may also use oarsub to connect to a previously submitted job.

The scheduler is in charge of providing a set of resources which matchs the oarsub request. Once scheduled and launched, a job consists of one process executed on the first node of the resources it was attibuted, with a set of environment variables set, which describe the job. That means that the job's executable is responsible for connecting those resources and dispatching work among them.

Request an interactive job. Open a login shell on the first node of the job instead of running a script.
Connect to a running job.
Set the requested resources for the job. Parameters are resource properties defined in OAR database, and a `walltime' which specifies the maximum duration of the job before its termination (the job process can terminate earlier). The walltime format is [hour:mn:sec|hour:mn|hour]. Ex: nodes=4/cpu=1,walltime=2:00:00

Multiple -l options can be given at the same line. That defines a moldable job: a job which can take different shapes. For example, for a very flexible application, one could perform the following job submission:

  oarsub -l cpu=2,walltime=20:00:00 -l cpu=4,walltime=10:00:00 -l cpu=8,walltime=5:00:00 ./script.sh
    

OAR would schedule one of the three proposed resource definitions, depending of the load of the cluster, and preferring the one with the earliest possible start.

One can also request different groups of resources, for example:

  oarsub -l "{mem > 64}/host=1+{mem < 48}/host=3",walltime=1:00:00 -I
    

The scheduler job would have 1 host with property "mem" > 64 and 3 hosts with property "mem" < 48. The syntax between braces, {...}, is the same as the one used for "-p" option.

Submit an array job containing "NUMBER" subjobs. All the subjobs share the same array_id but each subjob is independent and has its own job_id. All the subjobs have the same characteristics (script, requirements) and can be identified by an environment variable $OAR_ARRAY_INDEX.

Array jobs can neither be Interactive (-I) nor a reservation (-r).

Submit a parametric array job. Each non-empty line of "FILE" defines the parameters for a subjob. All the subjobs have the same characteristics (script, requirements) and can be identified by an environment variable $OAR_ARRAY_INDEX. '#' is the comment sign.

Parametric array jobs can neither be Interactive (-I) nor a reservation (-r).

Batch mode only: ask oarsub to scan the given script for OAR directives (#OAR -l ...)
Set the the queue to submit the job to.
Add a list of constraints on the resource properties for the job. The format of a contraint is the one of a WHERE clause using the SQL syntax.
Request that the job starts at a specified time. A job creation using this option is called an advance reservation (as opposite to a submission).
Enable the checkpointing mechanism for the job. A signal will be sent DELAY seconds before the walltime to the first processus of the job (on the first node of the job resources).
Specify the signal to used to trigger the checkpointing. Use signal numbers (see kill -l), default is 12 (SIGUSR2).
Specify a specific type (besteffort, timesharing, idempotent, cosystem, deploy, noop, container, inner, token:xxx=yy,... )

Notes:
- a job with the besteffort type will be scheduled with the lowest priority and will be killed if a "normal" job needs its resources.
- a job with the idempotent type will be automatically resubmitted if its exit code is 99 and its duration > 60s.
- a job with the idempotent and besteffort types but with no checkpoint delay will automatically be resubmitted whenever killed by OAR before its normal termination to execute a non-besteffort jobs.
- a job with the idempotent and besteffort types with a checkpoint delay will be automatically resubmitted if its exit code is 99 and its duration > 60s (that is, as if only idempotent was present).
- a job with the noop type does nothing except reserving the resources. It is ended at the end of it walltime or when using the oardel command.

Specify the directory where to launch the command (default is current directory)
Specify a name of a project the job belongs to.
Specify an arbitrary name for the job.
Previously submitted job that the new job execution must depend on. The new job will only start upon the end of the previous one.
Specify a notification method (email or command to execute). Ex:
--notify "mail:name@domain.com"
--notify "exec:/path/to/script args"

Arguments are job_id, job_name, TAG, comment

TAG can be:
- RUNNING : when the job is launched
- END : when the job is finished normally
- ERROR : when the job is finished abnormally
- INFO : used when oardel is called on the job
- SUSPENDED : when the job is suspended
- RESUMING : when the job is resumed

By default all TAGs are triggered. It is possible to specify which TAGs must be triggered. Ex:
--notify "[END,ERROR]mail:name@domain.com"
--notify "[RUNNING]mail:name@domain.com"
--notify "[RUNNING,END,ERROR]exec:/path/to/script args"

Resubmit the given job as a new one.
Activate the job-key mechanism: a job-key will be generated for the job, which can be used my the oarsh command in place of the OAR_JOB_ID. That job-key also allows one to connect to a job from a machine which is outside the OAR cluster the job belong to (e.g. in a grid of OAR clusters), given the job-key is available on that machine.

The job-key mechanism may be activated by default in the configuration of your OAR cluster. In that case, the -k option is useless.

Please note that this option is mostly useful when used together with the -e option or the -i option (see below).

Export the job-key to a file (the %jobid% pattern is automatically replaced in the filename). Warning: the file will be overwritten, whenever it already exists.
Import the job-key to use (you may reuse an exported job-key from a previous job for instance) from a file, instead of generating a new one. One may also use the OAR_JOB_KEY_FILE environment variable to set the job-key file.
Import the job-key to use inline (as text in the command line), instead of generating a new one.
Specify the file that will store the standard output stream of the job. The %jobid% and %jobname% patterns are automatically replaced.
Specify the file that will store the standard error stream of the job. The %jobid% and %jobname% patterns are automatically replaced.
Set the job state into Hold instead of Waiting, so that it is not scheduled as long as not resumed (the oarresume command allows one to turn it back into the Waiting state).
Print the command results in Perl's Data::Dumper format.
Print the command results in the XML format.
Print the command results in the YAML format.
Print the command results in the JSON format.
Print this help message.
Print the version of OAR.

Pathname to the file containing the list of the nodes that are attributed to the job.
Name of the job as given using the -n option.
Id of the job. Each job get a unique job identifier. This identifier can be use to retrieve information about the job using oarstat, or to connect to a running job using oarsub -C or oarsh for instance.
Array Id of the job. Every array job gets an unique array identifier that is shared by all the subjobs of the array job. This identifier can be used to identify the different subjobs pertaining to a same array job. Array Id can also be used to deal with all the subjobs of a given array at once (by means of the --array option in the case of oarstat, oarhold, oarresume and oardel). NB: regular jobs are array jobs with only one subjob.
Array Index of the job: within an array job, each subjob gets a unique (for a given array) index, starting from 0, which can be used to identify the subjob.
Walltime of the job in the hh:mm:ss format and in seconds.
Pathname to the file containing the list of all resources attributes for the job, and their value. See oarprint also.
Name of the project the job is part of, as given using the --project option.
Pathname to the files storing the standard output and standard error of the job executable, if not running in interactive mode.
Working directory for the job. The job executable will be executed in that directory, on the first node which is allocated to the job.
Key file to use for the submission (or oarsh) if using the job-key mechanism (-k or --use-job-key option). One may provide the job-key to import using the -i or --import-job-key-from-file option as well.

When submitting a job using a script shell, that script can contain some OAR options, with lines starting with #OAR and using the same options syntax as described above.

 oarsub -l /nodes=4 -I
 oarsub -q default -l /nodes=10/cpu=3,walltime=50:30:00 -p "switch = 'sw1'" /home/username/path/to/my/prog
 oarsub -r "2009-04-27 11:00:00" -l /nodes=12/cpu=2
 oarsub -C 154

 oarsub -l /nodes=4 /home/usename/path/to/my/prog --array 10

 oarsub /home/users/toto/prog --array-param-file /home/username/path/to/params.txt

With /home/username/path/to/param.txt containing for instance:

 # my param file
 # a subjob with a single parameter
 p100
 # a subjob without parameter
 ""
 # a subjob with two strings as parameters
 "arg1a arg1b arg1c" "arg2a arg2b"

 oarsub -S /home/username/path/to/my/script.sh

With /home/username/path/to/my/script.sh containing for instance:

 #!/bin/bash
 #OAR -l /nodes=4/cpu=1,walltime=3:15:00
 #OAR -p switch = 'sw3' or switch = 'sw5'
 #OAR -t besteffort
 #OAR -t type2
 #OAR -k
 #OAR -e /path/to/job/key
 #OAR --stdout stdoutfile.log
 /home/username/path/to/my/prog

oarprint(1), oarsh(1), oardel(1), oarstat(1), oarnodes(1), oarhold(1), oarresume(1)

 Copyright 2003-2016 Laboratoire d'Informatique de Grenoble (http://www.liglab.fr). This software is licensed under the GNU General Public License Version 2 or above. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2021-01-11 oarsub