CTDConverter - Convert CTD files into Galaxy tool and CWL
CommandLineTool files
CTDConverter - A project from the WorkflowConversion family
(https://github.com/WorkflowConversion/CTDConverter)
Copyright 2017, WorkflowConversion
Licensed under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS
IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied. See the License for the specific language governing permissions
and limitations under the License.
USAGE:
- $ convert.py [FORMAT] [ARGUMENTS ...]
FORMAT can be any one of the supported output formats: cwl,
galaxy.
There is one converter for each supported FORMAT, each taking a
different set of arguments. Please consult the detailed documentation for
each of the converters. Nevertheless, all converters have the following
common parameters/options:
I - Parsing a single CTD file and converting it:
- $ convert.py [FORMAT] -i [INPUT_FILE] -o [OUTPUT_FILE]
II - Parsing several CTD files and writing the converted wrappers to a
given folder:
- $ convert.py [FORMAT] -i [INPUT_FILES] -o [OUTPUT_DIRECTORY]
III - Hardcoding parameters
- It is possible to hardcode parameters. This makes sense if you want to set
a tool in 'quiet' mode or if your tools support multi-threading and accept
the number of threads via a parameter, without giving end users the chance
to change the values for these parameters.
- In order to generate hardcoded parameters, you need to provide a simple
file. Each line of this file contains two or three columns separated by
whitespace. Any line starting with a '#' will be ignored. The first column
contains the name of the parameter, the second column contains the value
that will always be set for this parameter. Only the first two columns are
mandatory.
- If the parameter is to be hardcoded only for a set of tools, then a third
column can be added. This column contains a comma-separated list of tool
names for which the parameter will be hardcoded. If a third column is not
present, then all processed tools containing the given parameter will get
a hardcoded value for it.
- The following is an example of a valid file:
    ##################################### HARDCODED PARAMETERS example #####################################
    # Every line starting with a # will be handled as a comment and will not be parsed.
    # The first column is the name of the parameter and the second column is the value that will be used.

    # Parameter name        # Value        # Tool(s)
    threads                 8
    mode                    quiet
    xtandem_executable      xtandem        XTandemAdapter
    verbosity               high           Foo, Bar
    #########################################################################################################
- Using the above file will produce a command line similar to:
- [TOOL] ... -threads 8 -mode quiet ...
- for all tools. For XTandemAdapter, however, the command line will look
like:
- XTandemAdapter ... -threads 8 -mode quiet -xtandem_executable xtandem ...
- And for tools Foo and Bar, the command line will be similar to:
- Foo -threads 8 -mode quiet -verbosity high ...
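To make the rules for the hardcoded parameters file concrete, here is a minimal Python sketch of how such a file could be read and turned into command-line arguments. It is not the converter's actual implementation: the function names `parse_hardcoded_parameters` and `hardcoded_arguments` are hypothetical, and the sketch assumes that the third column is taken whole (so that a list written as "Foo, Bar" stays together). It also skips the check that a tool really has the parameter in question.

```python
from collections import defaultdict


def parse_hardcoded_parameters(text):
    """Return {tool_name_or_None: {parameter: value}} from the file contents.

    Hypothetical helper; the key None stands for "all tools".
    """
    hardcoded = defaultdict(dict)
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        # Split into at most three columns so the tool list stays in one piece.
        columns = stripped.split(None, 2)
        if len(columns) < 2:
            continue  # malformed line: a name without a value
        name, value = columns[0], columns[1]
        if len(columns) == 3:
            # Third column: comma-separated list of tool names.
            for tool in columns[2].split(","):
                if tool.strip():
                    hardcoded[tool.strip()][name] = value
        else:
            hardcoded[None][name] = value
    return dict(hardcoded)


def hardcoded_arguments(hardcoded, tool_name):
    """Build the '-name value' arguments that apply to one tool."""
    merged = dict(hardcoded.get(None, {}))
    merged.update(hardcoded.get(tool_name, {}))
    arguments = []
    for name, value in merged.items():
        arguments.extend(["-" + name, value])
    return arguments


EXAMPLE = """\
# Parameter name        # Value        # Tool(s)
threads                 8
mode                    quiet
xtandem_executable      xtandem        XTandemAdapter
verbosity               high           Foo, Bar
"""

if __name__ == "__main__":
    parameters = parse_hardcoded_parameters(EXAMPLE)
    for tool in ("XTandemAdapter", "Foo"):
        print(tool, " ".join(hardcoded_arguments(parameters, tool)))
```

With the example file above, the sketch yields the same argument lists as the command lines shown for XTandemAdapter and for Foo.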
IV - Engine-specific parameters
- i - Galaxy
- a. Providing file formats, mimetypes
- Galaxy uses file formats to decide which ports are compatible: an input
port of a certain data format can receive data from an output port of the
same format. This converter allows you to provide a custom file that maps
CTD data formats to supported Galaxy data formats. The file consists of
lines, each with either one, three or four columns separated by any amount
of whitespace. The content of each column is as follows:
    * 1st column: file extension
    * 2nd column: data type, as listed in Galaxy
    * 3rd column: full-named Galaxy data type, as it will appear on datatypes_conf.xml
    * 4th column: mimetype (optional)
- The following is an example of a valid "file formats" file:
    ########################################## FILE FORMATS example ##########################################
    # Every line starting with a # will be handled as a comment and will not be parsed.
    # The first column is the file format as given in the CTD and second column is the Galaxy data format.
    # The second, third and fourth columns can be left empty if the data type has already been registered
    # in Galaxy, otherwise, all but the mimetype must be provided.

    # CTD type    # Galaxy type    # Long Galaxy data type             # Mimetype
    csv           tabular          galaxy.datatypes.data:Text
    fasta
    ini           txt              galaxy.datatypes.data:Text
    txt
    idxml         txt              galaxy.datatypes.xml:GenericXml     application/xml
    options       txt              galaxy.datatypes.data:Text
    grid          grid             galaxy.datatypes.data:Grid
    ##########################################################################################################
- Note that each line consists of exactly one, three or four columns. For
data types already registered in Galaxy (such as fasta and txt in the
above example), only the first column is needed. For data types that have
not yet been registered in Galaxy, the first three columns are needed
(the mimetype is optional).
- For information about Galaxy data types and subclasses, see the following
page:
https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes
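The column rules for the "file formats" file can be sketched in Python as follows. This is an illustration only, not the converter's own parser: `DataType` and `parse_file_formats` are hypothetical names, and storing `None` for the Galaxy columns of an already registered type is an assumption of this sketch.

```python
from typing import NamedTuple, Optional


class DataType(NamedTuple):
    """One line of the file formats file (hypothetical container)."""
    extension: str                    # 1st column: file extension from the CTD
    galaxy_extension: Optional[str]   # 2nd column: data type, as listed in Galaxy
    galaxy_type: Optional[str]        # 3rd column: full-named Galaxy data type
    mimetype: Optional[str]           # 4th column: mimetype (optional)


def parse_file_formats(text):
    """Return {extension: DataType}; raise ValueError on a bad column count."""
    formats = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        columns = stripped.split()  # any amount of whitespace separates columns
        if len(columns) == 1:
            # Already registered in Galaxy: only the extension is given.
            formats[columns[0]] = DataType(columns[0], None, None, None)
        elif len(columns) in (3, 4):
            mimetype = columns[3] if len(columns) == 4 else None
            formats[columns[0]] = DataType(columns[0], columns[1], columns[2], mimetype)
        else:
            raise ValueError(
                "line %d: expected 1, 3 or 4 columns, got %d" % (number, len(columns))
            )
    return formats


EXAMPLE = """\
# CTD type    # Galaxy type    # Long Galaxy data type             # Mimetype
csv           tabular          galaxy.datatypes.data:Text
fasta
ini           txt              galaxy.datatypes.data:Text
txt
idxml         txt              galaxy.datatypes.xml:GenericXml     application/xml
options       txt              galaxy.datatypes.data:Text
grid          grid             galaxy.datatypes.data:Grid
"""

if __name__ == "__main__":
    for data_type in parse_file_formats(EXAMPLE).values():
        print(data_type)
```

Rejecting two-column lines outright, as the sketch does, is one possible design; a tool might instead warn and skip them.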
- b. Finer control over which tools will be converted
- Sometimes only a subset of CTDs needs to be converted. It is possible to
either explicitly specify which tools will be converted or which tools
will not be converted.
- The value of the -s/--skip-tools parameter is a file in which each
line will be interpreted as the name of a tool that will not be converted.
Conversely, the value of the -r/--required-tools parameter is a file in
which each line will be interpreted as the name of a tool that is required.
Only one of these parameters can be specified at a given time.
- The format of both files is exactly the same. As stated before, each line
will be interpreted as the name of a tool. Any line starting with a '#'
will be ignored.
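As a rough illustration of how the two tool lists behave, the Python sketch below reads such a file and filters a list of tool names. The names `read_tool_names` and `select_tools` are hypothetical and do not come from the converter; the sketch only mirrors the rules stated above (one tool name per line, '#' lines ignored, the two lists mutually exclusive).

```python
def read_tool_names(text):
    """Return the set of tool names listed in a skip/required tools file."""
    names = set()
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and lines starting with '#' are ignored.
        if stripped and not stripped.startswith("#"):
            names.add(stripped)
    return names


def select_tools(all_tools, skip=None, required=None):
    """Filter tool names using either a skip set or a required set."""
    if skip is not None and required is not None:
        raise ValueError("only one of skip and required may be given")
    if required is not None:
        return [tool for tool in all_tools if tool in required]
    if skip is not None:
        return [tool for tool in all_tools if tool not in skip]
    return list(all_tools)


if __name__ == "__main__":
    skipped = read_tool_names("# tools that should not be converted\nFoo\n")
    print(select_tools(["Foo", "Bar", "XTandemAdapter"], skip=skipped))
```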
- ii - CWL
- There are, for now, no CWL-specific parameters or options.