makeflow - workflow engine for executing distributed
workflows
makeflow [options] <dagfile>
Makeflow is a workflow engine for distributed computing. It
accepts a specification of a large amount of work to be performed, and runs
it on remote machines in parallel where possible. In addition,
Makeflow is fault-tolerant, so you can use it to coordinate very
large tasks that may run for days or weeks in the face of failures.
Makeflow is designed to be similar to Make, so if you can write a
Makefile, then you can write a Makeflow.
You can run a Makeflow on your local machine to test it
out. If you have a multi-core machine, then you can run multiple tasks
simultaneously. If you have a Condor pool or a Sun Grid Engine batch system,
then you can send your jobs there to run. If you don't already have a batch
system, Makeflow comes with a system called Work Queue that will let
you distribute the load across any collection of machines, large or small.
Makeflow also supports execution in a Docker container, regardless of
the batch system used.
When makeflow is run without arguments, it will attempt to execute
the workflow specified by the Makeflow dagfile using the local
execution engine.
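For illustration, a rule that produces output.txt from input.txt using a
hypothetical script process.sh is written just as it would be in a Makefile;
note that every file the command uses, including the script itself, is listed
as a dependency:

output.txt: input.txt process.sh
	./process.sh input.txt > output.txt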
- -B,
--batch-options=<options>
- Add these options to all batch submit files.
- -j,
--max-local=<#>
- Max number of local jobs to run at once. (default is # of cores)
- -J,
--max-remote=<#>
- Max number of remote jobs to run at once. (default is 1000 for -Twq, 100
otherwise)
- -l,
--makeflow-log=<logfile>
- Use this file for the makeflow log. (default is X.makeflowlog)
- -L,
--batch-log=<logfile>
- Use this file for the batch system log. (default is
X.<type>log)
- -R, --retry
- Automatically retry failed batch jobs up to 100 times.
- -r,
--retry-count=<n>
- Automatically retry failed batch jobs up to n times.
- --send-environment
- Send all local environment variables in remote execution.
- --wait-for-files-upto <#>
- Wait for output files to be created up to this many seconds (e.g., to deal
with NFS semantics).
- -S,
--submission-timeout=<timeout>
- Time to retry failed batch job submission. (default is 3600s)
- -T,
--batch-type=<type>
- Batch system type: local, dryrun, condor, sge, pbs, torque, blue_waters,
slurm, moab, cluster, wq, amazon, mesos. (default is local)
- --safe-submit-spec <#>
- Excludes resources at submission (SLURM, TORQUE, and PBS)
- --ignore-memory-spec <#>
- Excludes memory at submission (SLURM)
- --json
- Interpret <dagfile> as a JSON format Makeflow.
- --jx
- Evaluate JX expressions in <dagfile>. Implies --json. See the JX example
following this option list.
- --jx-args <args>
- Read variable definitions from the JX file <args>.
- --jx-define <VAL=EXPR>
- Set the variable <VAL> to the JX expression <EXPR>.
- --jx-context <ctx>
- Deprecated.
- -d,
--debug=<subsystem>
- Enable debugging for this subsystem.
- -o,
--debug-file=<file>
- Write debugging output to this file. By default, debugging is sent to
stderr (":stderr"). You may specify logs be sent to stdout
(":stdout"), to the system syslog (":syslog"), or to
the systemd journal (":journal").
- --verbose
- Display runtime progress on stdout.
- -a, --advertise
- Advertise the master information to a catalog server.
- -C,
--catalog-server=<catalog>
- Set catalog server to <catalog>. Format: HOSTNAME:PORT
- -F,
--wq-fast-abort=<#>
- WorkQueue fast abort multiplier. (default is deactivated)
- -M, -N,
--project-name=<project-name>
- Set the project name to <project-name>.
- -p,
--port=<port>
- Port number to use with WorkQueue. (default is 9123, 0=arbitrary)
- -Z,
--port-file=<file>
- Select port at random and write it to this file. (default is
disabled)
- -P,
--priority=<integer>
- Priority. The higher the value, the higher the priority.
- -W,
--wq-schedule=<mode>
- WorkQueue scheduling algorithm. (time|files|fcfs)
- -s,
--password=<pwfile>
- Password file for authenticating workers.
- --disable-cache
- Disable file caching (currently only Work Queue, default is false)
- --work-queue-preferred-connection <connection>
- Indicate preferred connection. Choose one of by_ip or by_hostname. (default
is by_ip)
- --monitor <dir>
- Enable the resource monitor, and write the monitor logs to
<dir>
- --monitor-with-time-series
- Enable monitor time series. (default is disabled)
- --monitor-with-opened-files
- Enable monitoring of opened files. (default is disabled)
- --monitor-interval <#>
- Set monitor interval to <#> seconds. (default 1 second)
- --monitor-log-fmt <fmt>
- Format for monitor logs. (default resource-rule-%06.6d, %d -> rule
number)
- --allocation <waste>
- When monitoring is enabled, automatically assign resource allocations to
tasks. Makeflow will try to minimize waste or maximize throughput.
- --umbrella-binary <filepath>
- Umbrella binary for running every rule in a makeflow.
- --umbrella-log-prefix <filepath>
- Umbrella log file prefix for running every rule in a makeflow. (default is
<makefilename>.umbrella.log)
- --umbrella-mode <mode>
- Umbrella execution mode for running every rule in a makeflow. (default is
local)
- --umbrella-spec <filepath>
- Umbrella spec for running every rule in a makeflow.
- --docker <image>
-
Run each task in the Docker container with this name. The image will be
obtained via "docker pull" if it is not already available.
- --docker-tar <tar>
-
Run each task in the Docker container given by this tar file. The image
will be uploaded via "docker load" on each execution site.
- --singularity <image>
-
Run each task in the Singularity container with this name. The container
will be created from the passed in image.
- --amazon-credentials <path>
-
Specify path to Amazon credentials file. The credentials file should be in
the following JSON format:
-
-
{
"aws_access_key_id" : "AAABBBBCCCCDDD"
"aws_secret_access_key" : "AAABBBBCCCCDDDAAABBBBCCCCDDD"
}
- --amazon-ami <image-id>
-
Specify an Amazon Machine Image (AMI).
- --mesos-master <hostname>
-
Indicate the host name of the preferred Mesos master.
- --mesos-path <filepath>
-
Indicate the path to the Mesos Python 2 site-packages.
- --mesos-preload <library>
-
Indicate the linking libraries for running Mesos.
- --lambda-config <path>
-
Path to the configuration file generated by makeflow_lambda_setup
- --k8s-image <docker_image>
-
Indicate the Docker image for running pods on Kubernetes cluster.
- --mounts <mountfile>
- Use this file as a mountlist. Each line of the mountfile specifies the
target and source of an input dependency in the format "target source" (a
space separates target from source). See the mountfile example following
this option list.
- --cache <cache_dir>
- Use this dir as the cache for file dependencies.
- --archive <path>
- Archive the results of the workflow at the specified path (by default
/tmp/makeflow.archive.$UID) and use the outputs of any archived jobs instead
of re-executing them.
- --archive-read <path>
- Only check whether jobs have been archived and use their outputs if
so.
- --archive-write <path>
- Only write the results of each job to the archive directory at the specified
path.
- -A, --disable-afs-check
- Disable the check for AFS. (experts only)
- -z, --zero-length-error
- Force failure on zero-length output files.
- -g,
--gc=<type>
- Enable garbage collection. (ref_cnt|on_demand|all)
- --gc-size <int>
- Set disk size to trigger GC. (on_demand only)
- -G,
--gc-count=<int>
- Set number of files to trigger GC. (ref_cnt only)
- --wrapper <script>
-
Wrap all commands with this script. Each rule's original recipe is
appended to the script or replaces the first occurrence of {} in the
script. See the wrapper example following this option list.
- --wrapper-input <file>
-
Wrapper command requires this input file. This option may be specified more
than once, defining an array of inputs. Additionally, each job executing a
recipe has a unique integer identifier that replaces occurrences of %%
in <file>.
- --wrapper-output <file>
-
Wrapper command requires this output file. This option may be specified
more than once, defining an array of outputs. Additionally, each job
executing a recipe has a unique integer identifier that replaces
occurrences of %% in <file>.
- --enforcement
- Use Parrot to restrict access to the given inputs/outputs.
- --parrot <path>
- Path to parrot_run executable on the host system.
- --shared-fs <dir>
- Assume the given directory is a shared filesystem accessible at all
execution sites.
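As an illustration of the --json/--jx input format, a minimal workflow might
look like the following sketch (the file names and command are illustrative):

{
    "rules": [
        {
            "command": "sort words.txt > sorted.txt",
            "inputs": [ "words.txt" ],
            "outputs": [ "sorted.txt" ]
        }
    ]
}

With --jx, values in this structure may be JX expressions, and variables can
be supplied on the command line with --jx-define.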
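As an illustration of --mounts, a mountfile lists one target and source per
line (the paths below are illustrative):

data/input.dat /shared/project/input.dat
bin/mytool /opt/tools/bin/mytool

makeflow --mounts my.mountfile Makeflow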
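As a sketch of --wrapper usage (mywrapper.sh is a hypothetical script), each
rule's recipe replaces the {} in the wrapper string, and --wrapper-input makes
the script available at the execution site:

makeflow --wrapper './mywrapper.sh {}' --wrapper-input mywrapper.sh Makeflow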
When the batch system is set to -T dryrun, Makeflow
runs as usual but does not actually execute jobs or modify the system. This
is useful to check that wrappers and substitutions are applied as expected.
In addition, Makeflow will write an equivalent shell script to the batch
system log specified by -L <logfile>. This script will run, in serial, the
commands that Makeflow would have run. This shell script format
may be useful for archival purposes, since it does not depend on
Makeflow.
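For example, the following commands would generate such a script and then
execute it in serial (the log file name dryrun.sh is arbitrary):

makeflow -T dryrun -L dryrun.sh Makeflow
sh dryrun.sh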
The following environment variables will affect the execution of
your Makeflow:
BATCH_OPTIONS
This corresponds to the -B <options> parameter and
will pass extra batch options to the underlying execution engine.
MAKEFLOW_MAX_LOCAL_JOBS
This corresponds to the -j <#> parameter and will set
the maximum number of local batch jobs. If a -j <#> parameter
is specified, the minimum of the argument and the environment variable is
used.
MAKEFLOW_MAX_REMOTE_JOBS
This corresponds to the -J <#> parameter and will set
the maximum number of remote batch jobs. If a -J <#> parameter
is specified, the minimum of the argument and the environment variable is
used.
Note that variables defined in your Makeflow are exported
to the environment.
TCP_LOW_PORT
Inclusive low port in range used with -p 0.
TCP_HIGH_PORT
Inclusive high port in range used with -p 0.
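For example, to cap local parallelism and pass extra submit options through
the environment (the requirement string is only illustrative):

export MAKEFLOW_MAX_LOCAL_JOBS=4
export BATCH_OPTIONS="requirements = MachineGroup == 'ccl'"
makeflow -T condor Makeflow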
On success, returns zero. On failure, returns non-zero.
Run makeflow locally with debugging:
-
-
makeflow -d all Makeflow
Run makeflow on Condor with special requirements:
-
-
makeflow -T condor -B "requirements = MachineGroup == 'ccl'" Makeflow
Run makeflow with WorkQueue using named workers:
-
-
makeflow -T wq -a -N project.name Makeflow
Create a directory containing all of the dependencies required to
run the specified makeflow:
-
-
makeflow -b bundle Makeflow
The Cooperative Computing Tools are Copyright (C) 2003-2004
Douglas Thain and Copyright (C) 2005-2015 The University of Notre Dame. This
software is distributed under the GNU General Public License. See the file
COPYING for details.