pfw - weight sequences of a multiple sequence alignment
- pfw [ -hm ] [ -N shuffles ] [ -R seed ] [ -W weight ] [ -X gap_excision ] [ ms_file | - ] [ parameters ]
pfw computes new weights for individual sequences in a
multiple sequence alignment using the method of Sibbald and Argos (1990).
The file containing the multiple sequence alignment ('ms_file') must
be either in MSF format as generated by GCG programs or by
readseq (checksums are ignored) or in MSA format as produced by
psa2msa(1). If '-' is specified instead of a filename, the
multiple sequence alignment is read from the standard input. pfw
writes a new multiple sequence alignment with modified weights in either MSF
or MSA format to the standard output.
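The idea behind this kind of weighting can be illustrated with a minimal Monte Carlo sketch. It is not the pfw implementation. It assumes a Voronoi-style scheme in which random sequences are drawn column by column from the distinct residues observed in the alignment, and each random sequence credits its nearest aligned sequence (ties are shared). The function name, the use of Hamming distance, the treatment of gap characters as ordinary symbols, and the total of shuffles times the number of sequences are all assumptions of this sketch.

```python
import random


def voronoi_weights(alignment, shuffles_per_seq=100, total_weight=1.0, seed=1):
    """Monte Carlo estimate of Voronoi-style sequence weights (illustrative only)."""
    rng = random.Random(seed)
    n_seq = len(alignment)
    length = len(alignment[0])
    # Distinct symbols observed in each column (assumption: uniform choice among them).
    columns = [sorted({seq[i] for seq in alignment}) for i in range(length)]
    counts = [0.0] * n_seq
    for _ in range(shuffles_per_seq * n_seq):
        rand_seq = [rng.choice(col) for col in columns]
        # Hamming distance from the random sequence to every aligned sequence.
        dists = [sum(a != b for a, b in zip(rand_seq, seq)) for seq in alignment]
        best = min(dists)
        nearest = [k for k, d in enumerate(dists) if d == best]
        # The nearest sequence gets one count; ties share it equally.
        for k in nearest:
            counts[k] += 1.0 / len(nearest)
    total = sum(counts)
    # Rescale so that the weights sum to total_weight.
    return [total_weight * c / total for c in counts]
```

In this sketch, two identical sequences always share their counts, so each receives less weight than a sequence that has no close relative in the alignment.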
- ms_file
- Input multiple sequence alignment file.
This file contains a multiple sequence alignment in either MSF (default) or
MSA format. If the format is MSA, pfw will include the new weight
of each sequence in the FASTA header using the
xpsa(5) keyword weight. It will thus replace any existing
weight=value pair in the header line. If the filename is
replaced by a '-', pfw will read the multiple alignment from
stdin.
- -h
- Display usage help text.
- -m
- Input multiple sequence alignment is in MSA format.
- -N shuffles
- Number of shuffles per sequence to be performed.
Note that an average relative precision of r percent is achieved by
approximately (100/r)^2 shuffles; for example, 5% precision requires about
400 shuffles.
Type: integer
Default: 100 (10% precision)
- -R seed
- Seed for the random number generator.
This must be a negative integer (zero or positive integers will be reset to
negative integers).
Type: integer
Default: -123456789
- -W weight
- Total weight.
The initially computed weights will be multiplied by a constant factor such
that the sum of all weights equals this value.
Default: 1
- -X gap_excision
- Gap excision threshold.
This is the minimal fraction of non-gap characters a column of the multiple
sequence alignment must contain in order to be considered for weighting.
Default: 0.5
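The numeric options above can be illustrated with a short sketch. It only mirrors the descriptions given for -N, -W and -X and is not taken from the pfw source. The function names, the set of gap characters ("-" and "."), and the use of "greater than or equal" for the threshold comparison are assumptions.

```python
import math


def shuffles_for_precision(r_percent):
    """Approximate shuffles per sequence for a relative precision of r percent."""
    return math.ceil((100.0 / r_percent) ** 2)


def rescale_weights(weights, total_weight=1.0):
    """Multiply all weights by one constant so that they sum to total_weight."""
    factor = total_weight / sum(weights)
    return [w * factor for w in weights]


def columns_to_keep(alignment, gap_excision=0.5, gap_chars="-."):
    """Indices of columns whose non-gap fraction reaches the threshold."""
    n_seq = len(alignment)
    keep = []
    for i in range(len(alignment[0])):
        non_gap = sum(seq[i] not in gap_chars for seq in alignment)
        if non_gap / n_seq >= gap_excision:
            keep.append(i)
    return keep
```

For instance, shuffles_for_precision(10) gives the default of 100 shuffles, and rescale_weights([1, 1, 2]) gives weights that sum to the default total weight of 1.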
- Note:
- For backwards compatibility, release 2.3 of the pftools package
still parses the version 2.2 style parameters listed here. They are
deprecated; use the corresponding options (refer to the
options section) instead.
- N=#
- Shuffles per sequence.
Use option -N instead.
- R=#
- Random number seed.
Use option -R instead.
- W=#
- Total weight.
Use option -W instead.
- X=#
- Gap excision threshold.
Use option -X instead.
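When migrating old scripts, the mapping from the deprecated parameters to the current options can be applied mechanically. The helper below is a hypothetical sketch, not part of pftools; it only encodes the four correspondences listed above.

```python
# Deprecated version 2.2 parameter names and the options that replace them.
LEGACY_TO_OPTION = {"N": "-N", "R": "-R", "W": "-W", "X": "-X"}


def modernize_parameters(params):
    """Translate ['N=500', 'X=0.6'] style parameters into option arguments."""
    args = []
    for param in params:
        key, sep, value = param.partition("=")
        if not sep or not value or key not in LEGACY_TO_OPTION:
            raise ValueError(f"unrecognized legacy parameter: {param!r}")
        args += [LEGACY_TO_OPTION[key], value]
    return args
```

For example, the parameters N=500 and X=0.6 become the arguments -N 500 -X 0.6.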
On successful completion of its task, pfw will return an
exit code of 0. If an error occurs, a diagnostic message will be written to
standard error and the exit code will be different from 0. When conflicting
options were passed to the program but the task could nevertheless be
completed, warnings will be issued on standard error.
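A calling script can rely on this convention. The following sketch uses Python's standard subprocess module; the wrapper name run_and_check is hypothetical, and the command to run is supplied by the caller.

```python
import subprocess


def run_and_check(cmd, stdin_text=None):
    """Run a command; return (stdout, stderr) or raise if the exit code is not 0."""
    result = subprocess.run(cmd, input=stdin_text, capture_output=True, text=True)
    if result.returncode != 0:
        # A non-zero exit code signals an error; the diagnostic is on standard error.
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    # With exit code 0, anything on standard error is a warning, not a failure.
    return result.stdout, result.stderr
```

A call such as run_and_check(["pfw", "-N", "400", "alignment.msf"]) would return the reweighted alignment together with any warnings, assuming pfw is installed and alignment.msf (a placeholder name) exists.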
Sibbald PR & Argos P. (1990). Weighting aligned protein or
nucleic acid sequences to correct for unequal
representation. J. Mol. Biol. 216:813-818.
The pftools package was developed by Philipp Bucher.
Any comments or suggestions should be addressed to
<pftools@sib.swiss>.