confuga(1) | Cooperative Computing Tools | confuga(1) |
Confuga - An active storage cluster file system.
chirp_server --jobs --root=<Confuga URI> [options]
Configures and starts a Chirp server to act as the head node for a Confuga storage cluster.
For complete details with examples, see the Confuga User's Manual (http://ccl.cse.nd.edu/software/manuals/confuga.html).
A Chirp server acting as the Confuga head node uses normal chirp_server(1) options. In order to run the Chirp server as the Confuga head node, use the --root switch with the Confuga URI. You must also enable job execution with the --jobs switch.
The format for the Confuga URI is: confuga:///path/to/workspace?option1=value&option2=value. The workspace path is where Confuga maintains metadata and databases for the head node. Confuga-specific options are also passed through the URI and are documented below. Examples demonstrating how to start Confuga and a small cluster are at the end of this manual.
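As a minimal sketch of the URI format described above, the following assembles a Confuga URI from a workspace path and an option string. The workspace path /tmp/confuga.root is only an illustration, and auth=unix is used because it is the one URI option that appears in this manual's examples. The command is printed rather than executed.

```shell
# Hypothetical workspace location; any absolute path can be used here.
workspace="/tmp/confuga.root"

# auth=unix is the only URI option shown in this manual's examples.
options="auth=unix"

# "confuga://" followed by an absolute path yields the three-slash form.
confuga_uri="confuga://${workspace}/?${options}"

# Print the head node command instead of running it (dry run).
echo chirp_server --jobs --root="${confuga_uri}"
```

Quoting the URI matters when the command is actually run, since the shell would otherwise treat ? and & as special characters.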
Confuga uses regular Chirp servers as storage nodes. Each storage node is added to the cluster using the confuga_adm(1) command. All storage node Chirp servers must be run with:
These options are also suggested but not required:
You must also ensure that the storage nodes and the Confuga head node are using the same catalog_server(1). By default, this should be the case. The EXAMPLES section below includes an example cluster using a manually hosted catalog server.
To add storage nodes to the Confuga cluster, use the confuga_adm(1) administrative tool.
The easiest way to execute workflows on Confuga is through makeflow(1). Only two Makeflow options are required: --batch-type and --working-dir. Confuga uses the Chirp job protocol, so the batch type is chirp. The working directory is a Chirp URI that identifies both the executing server (the Confuga head node) and the namespace the workflow executes in. For example:
makeflow --batch-type=chirp --working-dir=chirp://confuga.example.com:9094/path/to/workflow
The workflow namespace is logically prepended to all file paths defined in the Makeflow specification. For example, given this Makeflow file:
a: exe
	./exe > a
Confuga will execute /path/to/workflow/exe and produce the output file /path/to/workflow/a.
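The path resolution above can be illustrated with a short sketch. It only mimics the prefixing with shell string operations, using the --working-dir value from the earlier example. It is not how Confuga implements the namespace internally.

```shell
# The --working-dir value from the makeflow example above.
working_dir="chirp://confuga.example.com:9094/path/to/workflow"

# Drop the scheme, then drop everything up to the first slash (host:port).
without_scheme="${working_dir#chirp://}"
namespace="/${without_scheme#*/}"

# Prefix each file named in the Makeflow rule with the namespace.
for file in exe a; do
    echo "${namespace}/${file}"
done
```

The loop prints the two paths named in the text: the executable and the output file, both under the workflow namespace.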
Unlike other batch systems used with Makeflow, like Condor or Work Queue, all files used by a workflow must be in the Confuga file system. Condor and Work Queue both stage workflow files from the submission site to the execution sites. In Confuga, the entire workflow dataset, including executables, is already resident. So when executing a new workflow, you need to upload the workflow dataset to Confuga. The easiest way to do this is using the chirp(1) command line tool:
chirp confuga.example.com put workflow/ /path/to/
Finally, Confuga does not save the stdout or stderr of jobs. If you want these files for debugging purposes, you must explicitly save them. To streamline the process, you may use Makeflow's --wrapper options to save stdout and stderr:
makeflow --batch-type=chirp \
--working-dir=chirp://confuga.example.com/ \
--wrapper=$'{\n{}\n} > stdout.%% 2> stderr.%%' \
--wrapper-output='stdout.%%' \
--wrapper-output='stderr.%%'
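To make the effect of the wrapper concrete, the sketch below prints what a wrapped job command might look like. It assumes that Makeflow substitutes the rule's command for {} and a per-rule number for %%. The value 7 and the command ./exe > a are placeholders chosen for illustration.

```shell
# Placeholder rule command and rule number (assumptions for illustration).
command='./exe > a'
rule_id=7

# Expand the wrapper template by hand: {} becomes the command and
# %% becomes the rule number, as assumed in the lead-in.
wrapped=$(printf '{\n%s\n} > stdout.%s 2> stderr.%s' "$command" "$rule_id" "$rule_id")

echo "$wrapped"
```

Under these assumptions, each job leaves its own stdout and stderr files in the workflow namespace. The --wrapper-output options declare those files as job outputs.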
Launch a head node with Confuga state stored in ./confuga.root:
chirp_server --jobs --root="confuga://$(pwd)/confuga.root/"
Launch a head node with workspace /tmp/confuga.root using storage nodes chirp://localhost:10001 and chirp://localhost:10002/u/joe/confuga:
chirp_server --jobs --root='confuga:///tmp/confuga.root/'
confuga_adm confuga:///tmp/confuga.root/ sn-add address localhost:10001
confuga_adm confuga:///tmp/confuga.root/ sn-add -r /u/joe/confuga address localhost:10002
Run a simple test cluster on your workstation:
# start a catalog server in the background
catalog_server --history=catalog.history \
--update-log=catalog.update \
--interface=127.0.0.1 \
&

# sleep for a time so catalog can start
sleep 1

# start storage node 1 in the background
chirp_server --advertise=localhost \
--catalog-name=localhost \
--catalog-update=10s \
--interface=127.0.0.1 \
--jobs \
--job-concurrency=10 \
--root=./root.1 \
--port=9001 \
--project-name=test \
--transient=./tran.1 \
&

# start storage node 2 in the background
chirp_server --advertise=localhost \
--catalog-name=localhost \
--catalog-update=10s \
--interface=127.0.0.1 \
--jobs \
--job-concurrency=10 \
--root=./root.2 \
--port=9002 \
--project-name=test \
--transient=./tran.2 \
&

# sleep for a time so catalog can receive storage node status
sleep 5

# add both storage nodes to the cluster
confuga_adm "confuga://$(pwd)/confuga.root/" sn-add address localhost:9001
confuga_adm "confuga://$(pwd)/confuga.root/" sn-add address localhost:9002

# start the Confuga head node
chirp_server --advertise=localhost \
--catalog-name=localhost \
--catalog-update=30s \
--debug=confuga \
--jobs \
--root="confuga://$(pwd)/confuga.root/?auth=unix" \
--port=9000
The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and Copyright (C) 2005-2015 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.
CCTools 7.0.9 FINAL |