alf - Alignment-free sequence comparison
alf [OPTIONS] -i IN.FASTA [-o OUT.TXT]
Compute the pairwise similarity of the sequences in IN.FASTA using
alignment-free methods and write a tab-delimited matrix of pairwise
scores to OUT.TXT.
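
As a minimal sketch of how the tool might be driven and its output consumed from a script, the Python snippet below assembles the argument list from the synopsis and parses a tab-delimited score matrix. The helper names `build_alf_command` and `parse_score_matrix` are hypothetical and not part of alf. The parser assumes the output contains only numeric, tab-separated rows with no header line, which this page does not confirm.

```python
from typing import List, Optional


def build_alf_command(in_fasta: str, out_file: Optional[str] = None) -> List[str]:
    """Assemble the argument list from the synopsis: alf -i IN.FASTA [-o OUT.TXT]."""
    cmd = ["alf", "-i", in_fasta]
    if out_file is not None:
        # Without -o, alf writes the matrix to stdout.
        cmd += ["-o", out_file]
    return cmd


def parse_score_matrix(text: str) -> List[List[float]]:
    """Parse tab-delimited rows of scores (assumption: numbers only, no header)."""
    rows = []
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            rows.append([float(cell) for cell in line.rstrip("\t").split("\t")])
    return rows


# Made-up 2x2 matrix for illustration only; this is not real alf output.
example = "0.0\t0.25\n0.25\t0.0\n"
matrix = parse_score_matrix(example)
command = build_alf_command("seqs.fasta", "scores.alf")
```

The argument list can be passed to `subprocess.run` on a system where alf is installed. The snippet itself never launches the program.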
- -h, --help
- Display the help message.
- --version
- Display version information.
- -v, --verbose
- When given, details about the progress are printed to the screen.
- -i, --input-file INPUT_FILE
- Name of the multi-FASTA input file. Valid filetypes are: .sam[.*],
.raw[.*], .gbk[.*], .frn[.*], .fq[.*],
.fna[.*], .ffn[.*], .fastq[.*], .fasta[.*],
.faa[.*], .fa[.*], .embl[.*], and .bam, where
* is any of the following extensions: gz, bz2, and
bgzf for transparent (de)compression.
- -o, --output-file OUTPUT_FILE
- Name of the file to which the tab-delimited matrix with pairwise scores
is written. Default is to write to stdout. Valid filetype is:
.alf[.*], where * is any of the following extensions: tsv
for transparent (de)compression.
- -rc, --reverse-complement STRING
- Which strand to score. Use both_strands to score both strands
simultaneously. One of input, both_strands, mean,
min, and max. Default: input.
- -mm, --mismatches INTEGER
- Number of mismatches, either 0 or 1. When 1 is used,
N2 uses the k-mer-neighbour with one mismatch. Default: 0.
- -mmw, --mismatch-weight DOUBLE
- Real-valued weight of counts for words with mismatches. Default:
0.1.
- -kwf, --k-mer-weights-file OUTPUT_FILE
- Print k-mer weights for every sequence to this file if given. Valid
filetype is: .txt.
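
To make the -mm and -mmw options more concrete, the following Python sketch shows one way the one-mismatch neighbours of a k-mer could contribute weighted counts. It illustrates the general idea only and is not alf's implementation of N2. The function names `one_mismatch_neighbours` and `weighted_kmer_counts` are hypothetical, a DNA alphabet of ACGT is assumed, and the rule that each k-mer occurrence adds the mismatch weight to every neighbour is also an assumption.

```python
from collections import defaultdict
from typing import Dict, List

ALPHABET = "ACGT"  # assumption: plain DNA alphabet


def one_mismatch_neighbours(kmer: str) -> List[str]:
    """Return all words that differ from kmer at exactly one position."""
    neighbours = []
    for i, original in enumerate(kmer):
        for base in ALPHABET:
            if base != original:
                neighbours.append(kmer[:i] + base + kmer[i + 1:])
    return neighbours


def weighted_kmer_counts(seq: str, k: int, mismatches: int = 0,
                         mismatch_weight: float = 0.1) -> Dict[str, float]:
    """Count k-mers; with mismatches=1, neighbours also receive a weighted count."""
    counts: Dict[str, float] = defaultdict(float)
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        counts[kmer] += 1.0  # exact occurrence counts fully
        if mismatches == 1:
            # Assumed weighting rule: each neighbour gets mismatch_weight.
            for neighbour in one_mismatch_neighbours(kmer):
                counts[neighbour] += mismatch_weight
    return dict(counts)


exact = weighted_kmer_counts("ACGA", k=2)
fuzzy = weighted_kmer_counts("ACGA", k=2, mismatches=1)
```

With the defaults mirrored from this page (0 mismatches, weight 0.1), only exact k-mers are counted. Setting `mismatches=1` spreads a small weighted count onto every word one substitution away.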