MACSYFINDER(1) User Commands MACSYFINDER(1)

macsyfinder - detection of macromolecular systems in protein datasets

macsyfinder [-h] [--sequence-db SEQUENCE_DB] [--db-type {unordered_replicon,ordered_replicon,gembase,unordered}] [--replicon-topology {linear,circular}] [--topology-file TOPOLOGY_FILE] [--idx] [--inter-gene-max-space INTER_GENE_MAX_SPACE INTER_GENE_MAX_SPACE] [--min-mandatory-genes-required MIN_MANDATORY_GENES_REQUIRED MIN_MANDATORY_GENES_REQUIRED] [--min-genes-required MIN_GENES_REQUIRED MIN_GENES_REQUIRED] [--max-nb-genes MAX_NB_GENES MAX_NB_GENES] [--multi-loci MULTI_LOCI] [--hmmer HMMER_EXE] [--index-db INDEX_DB_EXE] [--e-value-search E_VALUE_RES] [--i-evalue-select I_EVALUE_SEL] [--coverage-profile COVERAGE_PROFILE] [-d DEF_DIR] [-o OUT_DIR] [-r RES_SEARCH_DIR] [--res-search-suffix RES_SEARCH_SUFFIX] [--res-extract-suffix RES_EXTRACT_SUFFIX] [-p PROFILE_DIR] [--profile-suffix PROFILE_SUFFIX] [-w WORKER_NB] [-v] [--log LOG_FILE] [--config CFG_FILE] [--previous-run PREVIOUS_RUN] systems [systems ...]

MacSyFinder is a program to model and detect macromolecular systems, genetic pathways, and the like in protein datasets. In prokaryotes, these systems often have evolutionarily conserved properties: they are made of conserved components and are encoded in compact loci (conserved genetic architecture). The user models these systems with MacSyFinder to reflect these conserved features and to allow their efficient detection.

The systems to detect. This is a mandatory positional argument (no keyword is associated with it). To detect all the protein secretion systems and related appendages, set it to "all" (case insensitive). Otherwise, one or more systems can be specified, for example: "T2SS T4P".

show this help message and exit

Path to the sequence dataset in fasta format.
The type of dataset to deal with. "unordered_replicon" corresponds to a non-assembled genome, "unordered" to a metagenomic dataset, "ordered_replicon" to an assembled genome, and "gembase" to a set of replicons where sequence identifiers follow this convention: ">RepliconName SequenceID".
The topology of the replicons (this option is meaningful only if the db_type is 'ordered_replicon' or 'gembase').
Topology file path. The topology file allows one to specify a topology (linear or circular) for each replicon (this option is meaningful only if the db_type is 'ordered_replicon' or 'gembase'). A topology file is a tabular file with two columns: the 1st is the replicon name, and the 2nd the corresponding topology: "RepliconA linear".
Forces the indexes for the sequence dataset to be rebuilt even if they were previously computed and are present at the dataset location (default = False).
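
As an illustration of the topology file format described above, the following Python sketch writes such a file and reads it back. It assumes the two columns are separated by whitespace; the file name "topology.txt", the replicon names and the helper read_topology are hypothetical, and this is not MacSyFinder's own parser.

```python
import os
import tempfile

VALID_TOPOLOGIES = {"linear", "circular"}


def read_topology(path):
    """Return {replicon_name: topology} from a two-column topology file.

    Assumption: the columns are separated by whitespace (spaces or tabs).
    """
    topologies = {}
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            fields = line.split()
            if len(fields) != 2:
                raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
            name, topology = fields
            if topology not in VALID_TOPOLOGIES:
                raise ValueError(f"line {line_no}: unknown topology {topology!r}")
            topologies[name] = topology
    return topologies


def demo():
    """Write a small topology file (hypothetical replicons) and parse it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "topology.txt")
        with open(path, "w") as handle:
            handle.write("RepliconA linear\nRepliconB circular\n")
        return read_topology(path)


if __name__ == "__main__":
    print(demo())
```

Checking the file this way before a run can catch a misspelled topology early; the file itself would then be passed with --topology-file.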

Co-localization criterion: maximum number of components not matched by a profile that are allowed between two matched components for them to be considered contiguous. This option is only meaningful for 'ordered' datasets. The first value must be a system name, the second a number of components. This option can be repeated several times: "--inter-gene-max-space T2SS 12 --inter-gene-max-space Flagellum 20"
The minimal number of mandatory genes required for system assessment. The first value must correspond to a system name, the second value to an integer. This option can be repeated several times: "--min-mandatory-genes-required T2SS 15 --min-mandatory-genes-required Flagellum 10"
The minimal number of genes required for system assessment (includes both 'mandatory' and 'accessory' components). The first value must correspond to a system name, the second value to an integer. This option can be repeated several times: "--min-genes-required T2SS 15 --min-genes-required Flagellum 10"
The maximal number of genes required for system assessment. The first value must correspond to a system name, the second value to an integer. This option can be repeated several times: "--max-nb-genes T2SS 5 --max-nb-genes Flagellum 10"
Allow the storage of multi-loci systems for the specified systems. The systems are specified as a comma-separated list (--multi-loci sys1,sys2). Default is False.
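
Because the per-system options above take a system name plus an integer and may be repeated, a command line can grow long. The sketch below shows one way to assemble it from a dictionary. The helper build_command, the file name "genome.fasta" and the chosen values are hypothetical; only the option names come from the synopsis.

```python
import shlex


def build_command(sequence_db, db_type, systems, per_system=None):
    """Assemble a macsyfinder argument list.

    per_system maps an option name (e.g. "--inter-gene-max-space") to a
    {system_name: integer} dict; each pair becomes one repetition of the
    option, as described for the system detection options.
    """
    cmd = ["macsyfinder", "--sequence-db", sequence_db, "--db-type", db_type]
    for option, values in (per_system or {}).items():
        for system, number in values.items():
            cmd += [option, system, str(number)]
    cmd += list(systems)  # positional argument: the systems to detect
    return cmd


if __name__ == "__main__":
    command = build_command(
        "genome.fasta",  # hypothetical input file
        "ordered_replicon",
        ["T2SS", "Flagellum"],
        {"--inter-gene-max-space": {"T2SS": 12, "Flagellum": 20}},
    )
    # Print a shell-quoted version; the list could also be passed to subprocess.run
    print(" ".join(shlex.quote(part) for part in command))
```

Building an argument list rather than a single string avoids shell quoting problems when system names or paths contain unusual characters.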

Path to the Hmmer program.
The indexer to be used for Hmmer. The value can be either 'makeblastdb' or 'formatdb', or the path to one of these binaries (default = makeblastdb)
Maximal e-value for hits to be reported during Hmmer search. (default = 1)
Maximal independent e-value for Hmmer hits to be selected for system detection. (default = 0.001)
Minimal profile coverage required in the hit alignment to allow the hit selection for system detection. (default = 0.5)
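
The two selection thresholds above can be pictured as a simple filter over Hmmer hits. The sketch below is only an illustration: the Hit record and its field names are hypothetical, and whether MacSyFinder treats the boundary values as inclusive is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical, simplified view of one Hmmer hit."""
    gene: str
    i_evalue: float          # independent e-value of the hit
    profile_coverage: float  # fraction of the profile covered by the alignment


def select_hits(hits, i_evalue_select=0.001, coverage_profile=0.5):
    """Keep hits that pass both thresholds (defaults as documented above).

    Assumption: both comparisons are inclusive.
    """
    return [
        hit for hit in hits
        if hit.i_evalue <= i_evalue_select
        and hit.profile_coverage >= coverage_profile
    ]


if __name__ == "__main__":
    hits = [
        Hit("gspD", 1e-10, 0.9),  # passes both thresholds
        Hit("gspG", 0.01, 0.8),   # i-evalue too high
        Hit("gspE", 1e-5, 0.3),   # profile coverage too low
    ]
    print([hit.gene for hit in select_hits(hits)])
```

A hit has to pass both criteria, so loosening only one of the two thresholds does not rescue a hit rejected by the other.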

Path to the systems definition files.
Path to the directory where to store results. If an output directory is specified, res-search-dir is ignored.
Path to the directory where to store MacSyFinder search results directories (default: current working directory).
The suffix to give to Hmmer raw output files.
The suffix to give to filtered hits output files.
Path to the profiles directory.
The suffix of profile files. For each 'Gene' element, the corresponding profile is searched for in the 'profile_dir', in a file whose name is the Gene name followed by the profile suffix. For instance, if the Gene is named 'gspG' and the suffix is '.hmm3', then the profile should be placed at the specified location and be named 'gspG.hmm3'.
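
The naming rule above (Gene name + profile suffix, inside the profiles directory) can be checked before a run with a few lines of Python. The helpers profile_path and missing_profiles and the directory name "profiles" are hypothetical; the sketch only mirrors the rule as stated here.

```python
import os


def profile_path(profile_dir, gene_name, profile_suffix):
    """Expected location of a gene's profile: <profile_dir>/<gene><suffix>."""
    return os.path.join(profile_dir, gene_name + profile_suffix)


def missing_profiles(profile_dir, gene_names, profile_suffix):
    """Return the genes whose expected profile file does not exist."""
    return [
        gene for gene in gene_names
        if not os.path.isfile(profile_path(profile_dir, gene, profile_suffix))
    ]


if __name__ == "__main__":
    print(profile_path("profiles", "gspG", ".hmm3"))
    print(missing_profiles("profiles", ["gspG", "gspD"], ".hmm3"))
```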

Number of workers to be used by MacSyFinder, for users who want to run MacSyFinder in multithreaded mode (0 means all cores will be used; default = 1).
Increases the verbosity level. There are 4 levels: Error messages (default), Warning (-v), Info (-vv) and Debug (-vvv).
Path to the directory where to store the 'macsyfinder.log' log file.
Path to a putative MacSyFinder configuration file to be used.
Path to a previous MacSyFinder run directory. It allows one to skip the Hmmer search step on the same dataset, as it reuses the results of the previous run and thus its Hmmer detection parameters. The configuration file from this previous run will be used. (Conflicts with the options --config, --sequence-db, --profile-suffix, --res-extract-suffix, --e-value-search, --db-type, --hmmer)
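
A wrapper script can detect the conflicts listed above before launching MacSyFinder. The following is a minimal sketch, not MacSyFinder's own check: the helper conflicts_with_previous_run is hypothetical, the option names are spelled as in the synopsis, and only fully spelled long options (with or without "=value") are recognised.

```python
# Options documented as conflicting with --previous-run (spelled as in the synopsis)
CONFLICTING = (
    "--config",
    "--sequence-db",
    "--profile-suffix",
    "--res-extract-suffix",
    "--e-value-search",
    "--db-type",
    "--hmmer",
)


def conflicts_with_previous_run(argv):
    """Return the conflicting options found in argv when --previous-run is used."""
    # Normalise "--opt=value" to "--opt"; abbreviated options are not handled.
    names = {arg.split("=", 1)[0] for arg in argv if arg.startswith("--")}
    if "--previous-run" not in names:
        return []
    return [option for option in CONFLICTING if option in names]


if __name__ == "__main__":
    # "run1" is a hypothetical previous run directory
    args = ["--previous-run", "run1", "--db-type", "gembase", "all"]
    print(conflicts_with_previous_run(args))
```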

For more details, visit the MacSyFinder website and see the MacSyFinder documentation.

January 2015 macsyfinder 1.0.2