razers3 - Faster, fully sensitive read mapping
razers3 [OPTIONS] <GENOME FILE> <READS FILE>
razers3 [OPTIONS] <GENOME FILE> <PE-READS FILE1> <PE-READS FILE2>
RazerS 3 is a versatile, fully sensitive read mapper based on k-mer
counting and seeding filters. It supports single-end and paired-end mapping
and shared-memory parallelism, and it optimally parametrizes the filter based
on a user-defined minimal sensitivity. See
http://www.seqan.de/projects/razers for more information.
Input to RazerS 3 is a reference genome file and either one file
with single-end reads or two files containing the left and right mates of
paired-end reads. Use - to read single-end reads from stdin.
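The two invocation forms above can be sketched as a small helper that assembles an argument list. This is only an illustration: the function name `razers3_argv` and the file names are hypothetical, and the helper builds the command without running it. It uses only the `-o` option documented in this page.

```python
from typing import List, Optional


def razers3_argv(genome: str, reads: List[str],
                 output: Optional[str] = None) -> List[str]:
    """Build an argument list for razers3 (hypothetical helper)."""
    if len(reads) not in (1, 2):
        raise ValueError("expected one (single-end) or two (paired-end) read files")
    if "-" in reads and len(reads) != 1:
        # This page documents "-" (stdin) for single-end reads only.
        raise ValueError("'-' is only documented for single-end reads")
    argv = ["razers3"]
    if output is not None:
        argv += ["-o", output]
    # Options come first, then the genome file, then the read file(s).
    argv += [genome] + list(reads)
    return argv


single = razers3_argv("genome.fa", ["reads.fq"], output="hits.sam")
paired = razers3_argv("genome.fa", ["left.fq", "right.fq"])
```

The resulting list could be passed to `subprocess.run` on a system where razers3 is installed.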
(c) Copyright 2009-2014 by David Weese.
- ARGUMENT 0 INPUT_FILE
- A reference genome file. Valid filetypes are: .sam[.*],
.raw[.*], .gbk[.*], .frn[.*], .fq[.*],
.fna[.*], .ffn[.*], .fastq[.*], .fasta[.*],
.faa[.*], .fa[.*], .embl[.*], and .bam, where
* is any of the following extensions: gz, bz2, and
bgzf for transparent (de)compression.
- READS List of INPUT_FILE's
- Either one (single-end) or two (paired-end) read files. Valid filetypes
are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*],
.fq[.*], .fna[.*], .ffn[.*], .fastq[.*],
.fasta[.*], .faa[.*], .fa[.*], .embl[.*], and
.bam, where * is any of the following extensions: gz,
bz2, and bgzf for transparent (de)compression.
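The suffix rule above (a base extension, optionally followed by one compression extension, with .bam standing alone) can be sketched as a pattern check. This is a minimal sketch, not the tool's own validation; the function name is hypothetical, and case-sensitive matching is an assumption.

```python
import re

# Base extensions that may carry an optional compression suffix.
BASE_EXTENSIONS = ["sam", "raw", "gbk", "frn", "fq", "fna", "ffn",
                   "fastq", "fasta", "faa", "fa", "embl"]
COMPRESSION = ["gz", "bz2", "bgzf"]

# ".<base>" optionally followed by ".<compression>", or plain ".bam".
_SUFFIX = re.compile(
    r"\.(?:(?:%s)(?:\.(?:%s))?|bam)$"
    % ("|".join(BASE_EXTENSIONS), "|".join(COMPRESSION))
)


def has_valid_input_suffix(filename: str) -> bool:
    """Return True if the file name ends in one of the listed filetypes."""
    return _SUFFIX.search(filename) is not None
```

For example, `has_valid_input_suffix("reads.fastq.gz")` is true, while `"reads.bam.gz"` and `"genome.txt"` are rejected.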
- -h, --help
- Display the help message.
- --version
- Display version information.
- -i, --percent-identity DOUBLE
- Percent identity threshold. In range [50..100]. Default: 95.
- -rr, --recognition-rate DOUBLE
- Percent recognition rate. In range [80..100]. Default: 100.
- -ng, --no-gaps
- Allow only mismatches, no indels. Default: allow both.
- -f, --forward
- Map reads only to forward strands.
- -r, --reverse
- Map reads only to reverse strands.
- -m, --max-hits INTEGER
- Output only <NUM> of the best hits. In range [1..inf]. Default: 100.
- --unique
- Output only unique best matches (-m 1 -dr 0 -pa).
- -tr, --trim-reads INTEGER
- Trim reads to given length. In range [14..inf]. Default: off.
- -o, --output OUTPUT_FILE
- Mapping result filename (use - to dump to stdout in razers format).
Default: <READS FILE>.razers. Valid filetypes are:
.sam, .razers, .gff, .fasta, .fa,
.eland, .bam, and .afg.
- -v, --verbose
- Verbose mode.
- -vv, --vverbose
- Very verbose mode.
- -fl, --filter STRING
- Select k-mer filter. One of pigeonhole and swift. Default: pigeonhole.
- -mr, --mutation-rate DOUBLE
- Set the percent mutation rate (pigeonhole). In range [0..20]. Default: 5.
- -ol, --overlap-length INTEGER
- Manually set the overlap length of adjacent k-mers (pigeonhole). In
range [0..inf].
- -pd, --param-dir STRING
- Read user-computed parameter files in the directory <DIR> (swift).
- -t, --threshold INTEGER
- Manually set minimum k-mer count threshold (swift). In range [1..inf].
- -tl, --taboo-length INTEGER
- Set taboo length (swift). In range [1..inf]. Default: 1.
- -s, --shape STRING
- Manually set k-mer shape.
- -oc, --overabundance-cut INTEGER
- Set k-mer overabundance cut ratio. In range [0..1]. Default: 1.
- -rl, --repeat-length INTEGER
- Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.
- -lf, --load-factor DOUBLE
- Set the load factor for the open addressing k-mer index. In range
[1..inf]. Default: 1.6.
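To make the relationship between the percent identity threshold and a pigeonhole-style filter concrete, the sketch below converts an identity threshold into an error budget for one read and splits the read into one more piece than that budget. It rests on two assumptions that this page does not state: that the error budget is the read length times (100 - identity)/100, rounded down, and that the filter follows the classic pigeonhole argument (with at most k errors spread over k+1 non-overlapping pieces, at least one piece matches exactly). The function names are hypothetical, and the actual RazerS 3 filter, including its overlap length and shape handling, may differ.

```python
import math
from typing import List


def max_errors(read_length: int, percent_identity: float = 95.0) -> int:
    """Error budget for one read (assumed formula, see lead-in)."""
    if not 50.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be in [50..100]")
    # Small epsilon guards against floating-point results like 4.999999.
    return math.floor(read_length * (100.0 - percent_identity) / 100.0 + 1e-9)


def split_into_seeds(read: str, errors: int) -> List[str]:
    """Split a read into errors + 1 non-overlapping pieces of near-equal size."""
    pieces = errors + 1
    base, extra = divmod(len(read), pieces)
    seeds, start = [], 0
    for i in range(pieces):
        size = base + (1 if i < extra else 0)  # first `extra` pieces get one more
        seeds.append(read[start:start + size])
        start += size
    return seeds
```

Under these assumptions, a read of length 100 at the default identity of 95 would tolerate 5 errors and be split into 6 pieces.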
RazerS 3 supports various output formats. The output format is
detected automatically from the file name suffix.
- .razers
- Razer format
- .fa, .fasta
- Enhanced Fasta format
- .eland
- Eland format
- .gff
- GFF format
- .sam
- SAM format
- .bam
- BAM format
- .afg
- Amos AFG format
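Suffix-based format detection as listed above can be sketched with a lookup table. This is an illustrative sketch only; the function name is hypothetical, and raising an error for an unknown suffix is a choice made here, not documented behaviour of RazerS 3.

```python
import os

# Output file suffix -> format name, as listed in this page.
OUTPUT_FORMATS = {
    ".razers": "Razer format",
    ".fa": "Enhanced Fasta format",
    ".fasta": "Enhanced Fasta format",
    ".eland": "Eland format",
    ".gff": "GFF format",
    ".sam": "SAM format",
    ".bam": "BAM format",
    ".afg": "Amos AFG format",
}


def detect_output_format(filename: str) -> str:
    """Look up the output format from the last suffix of the file name."""
    suffix = os.path.splitext(filename)[1]
    if suffix not in OUTPUT_FORMATS:
        raise ValueError("unrecognized output suffix: %r" % suffix)
    return OUTPUT_FORMATS[suffix]
```

Note that a default name such as `reads.fq.razers` resolves to the Razer format, because only the last suffix is inspected.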
By default, reads and contigs are referred to by the Fasta ids
given in the input files. The -gn and -rn options change
this behaviour:
- 0
- Use Fasta id.
- 1
- Enumerate beginning with 1.
- 2
- Use the read sequence (only for short reads!).
- 3
- Use the Fasta id, do NOT append /L or /R for mate pairs.
The way matches are sorted in the output file can be changed
with the -so option for the following formats: razers,
fasta, sam, and afg. Primary and secondary sort
keys are:
- 0
- 1. read number, 2. genome position
- 1
- 1. genome position, 2. read number
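The two sort orders amount to swapping the primary and secondary keys. The sketch below illustrates this with a simplified match record; the `Match` type and `sort_matches` function are hypothetical, and the genome position is reduced to a single integer for brevity.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    read_number: int
    genome_position: int  # simplified: one integer instead of contig + offset


def sort_matches(matches: List[Match], sort_order: int = 0) -> List[Match]:
    """Sort matches by the primary and secondary keys of the given order."""
    if sort_order == 0:
        key = lambda m: (m.read_number, m.genome_position)
    elif sort_order == 1:
        key = lambda m: (m.genome_position, m.read_number)
    else:
        raise ValueError("sort order must be 0 or 1")
    return sorted(matches, key=key)


matches = [Match(2, 10), Match(1, 50), Match(1, 5)]
by_read = sort_matches(matches, 0)
by_position = sort_matches(matches, 1)
```

With order 0 the two matches of read 1 stay together; with order 1 the output follows the genome from left to right.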
The coordinate space used for begin and end positions can be
changed with the -pf option for the razers and fasta
formats:
- 0
- Gap space. Gaps between characters are counted from 0.
- 1
- Position space. Characters are counted from 1.
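The difference between the two coordinate spaces can be illustrated with a small conversion. This sketch assumes that a gap-space interval is half-open (begin gap up to end gap, 0-based) and that a position-space interval names the first and last character (1-based, inclusive); the function names are hypothetical.

```python
from typing import Tuple


def gap_to_position_space(begin: int, end: int) -> Tuple[int, int]:
    """Convert 0-based gap coordinates to 1-based character positions."""
    return begin + 1, end


def position_to_gap_space(first: int, last: int) -> Tuple[int, int]:
    """Convert 1-based character positions to 0-based gap coordinates."""
    return first - 1, last


sequence = "ACGTACGT"
begin, end = 2, 5                    # gap space: between chars 2|3 and 5|6
substring = sequence[begin:end]      # Python slices use the same convention
first, last = gap_to_position_space(begin, end)
```

Here the substring "GTA" spans gaps 2 to 5 in gap space and characters 3 to 5 in position space.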