hisat2-align-s - graph-based alignment of short nucleotide reads
to many genomes, wrapper script
HISAT2 version 2.1.0 by Daehwan Kim (infphilo@gmail.com,
www.ccb.jhu.edu/people/infphilo)
Usage:
- hisat2 [options]* -x <ht2-idx> {-1 <m1> -2 <m2> | -U <r>} [-S <sam>]
- <ht2-idx>
- Index filename prefix (minus trailing .X.ht2).
- <m1>
- Files with #1 mates, paired with files in <m2>. Could be gzip'ed
(extension: .gz) or bzip2'ed (extension: .bz2).
- <m2>
- Files with #2 mates, paired with files in <m1>. Could be gzip'ed
(extension: .gz) or bzip2'ed (extension: .bz2).
- <r>
- Files with unpaired reads. Could be gzip'ed (extension: .gz) or bzip2'ed
(extension: .bz2).
- <sam>
- File for SAM output (default: stdout)
- <m1>, <m2>, <r> can be comma-separated lists (no
whitespace) and can be specified many times. E.g. '-U file1.fq,file2.fq
-U file3.fq'.
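
The usage line and the comma-separated list rule can be sketched as a small argument builder. This is an illustration only: `build_hisat2_command` is a hypothetical helper, not part of HISAT2, and it only assembles the argument list without running anything.

```python
from typing import List, Optional, Sequence


def build_hisat2_command(
    index_prefix: str,
    mates1: Sequence[str] = (),
    mates2: Sequence[str] = (),
    unpaired: Sequence[str] = (),
    sam_out: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list that follows the usage line above.

    Hypothetical helper for illustration; it does not invoke hisat2.
    """
    if len(mates1) != len(mates2):
        raise ValueError("-1 and -2 need the same number of files")
    if not mates1 and not unpaired:
        raise ValueError("give paired files (-1/-2) or unpaired files (-U)")
    for name in (*mates1, *mates2, *unpaired):
        # A comma inside a name would be read as a list separator.
        if "," in name:
            raise ValueError(f"file name contains a comma: {name!r}")

    cmd = ["hisat2", "-x", index_prefix]
    if mates1:
        # Lists are joined with commas and no whitespace.
        cmd += ["-1", ",".join(mates1), "-2", ",".join(mates2)]
    if unpaired:
        cmd += ["-U", ",".join(unpaired)]
    if sam_out is not None:
        # Without -S, SAM output goes to stdout.
        cmd += ["-S", sam_out]
    return cmd


if __name__ == "__main__":
    print(build_hisat2_command("genome", unpaired=["file1.fq", "file2.fq"]))
```

Returning a list rather than a single string means the result can be handed to `subprocess.run` without shell quoting concerns. The index prefix and file names used here are placeholders.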
Options (defaults in parentheses):
- Input:
- -q
- query input files are FASTQ .fq/.fastq (default)
- --qseq
- query input files are in Illumina's qseq format
- -f
- query input files are (multi-)FASTA .fa/.mfa
- -r
- query input files are raw one-sequence-per-line
- -c
- <m1>, <m2>, <r> are sequences themselves, not files
- -s/--skip
<int>
- skip the first <int> reads/pairs in the input (none)
- -u/--upto
<int>
- stop after first <int> reads/pairs (no limit)
- -5/--trim5 <int>
- trim <int> bases from 5'/left end of reads (0)
- -3/--trim3 <int>
- trim <int> bases from 3'/right end of reads (0)
- --phred33
- qualities are Phred+33 (default)
- --phred64
- qualities are Phred+64
- --int-quals
- qualities encoded as space-delimited integers
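
As a rough illustration of what the trimming and quality-encoding options mean, the sketch below trims a read and decodes its quality string. The function names are hypothetical and this is not HISAT2's own code. It assumes the usual convention that a Phred+33 (or Phred+64) character encodes the score as its ASCII code minus 33 (or 64).

```python
from typing import List, Tuple


def trim_read(seq: str, qual: str, trim5: int = 0, trim3: int = 0) -> Tuple[str, str]:
    """Drop trim5 bases from the 5'/left end and trim3 from the 3'/right end."""
    start = trim5
    # Never let the end fall before the start; over-trimming yields an empty read.
    end = max(len(seq) - trim3, start)
    return seq[start:end], qual[start:end]


def decode_qualities(qual: str, offset: int = 33) -> List[int]:
    """Convert an ASCII quality string to Phred scores (offset 33 or 64)."""
    return [ord(ch) - offset for ch in qual]


if __name__ == "__main__":
    seq, qual = trim_read("ACGTACGT", "IIIIHHHH", trim5=2, trim3=1)
    print(seq, qual, decode_qualities(qual))
```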
- Alignment:
- --n-ceil
<func>
- func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
- --ignore-quals
- treat all quality values as 30 on Phred scale (off)
- --nofw
- do not align forward (original) version of read (off)
- --norc
- do not align reverse-complement version of read (off)
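
Several options take a `<func>` argument such as `L,0,0.15`. The sketch below evaluates such a specification, assuming the function notation documented for Bowtie 2, which HISAT2 is understood to share: a type letter (C constant, L linear, S square root, G natural log) followed by a constant term and a coefficient, giving f(x) = A + B * shape(x). `parse_func` is a hypothetical name. How the tool rounds or clamps the result is not covered here.

```python
import math
from typing import Callable


def parse_func(spec: str) -> Callable[[float], float]:
    """Turn a spec like 'L,0,0.15' into a function of read (or intron) length.

    Assumes the Bowtie 2 style notation: <type>,<constant>,<coefficient>.
    """
    parts = [p.strip() for p in spec.split(",")]
    kind = parts[0].upper()
    const = float(parts[1])
    # A constant function may omit the coefficient, e.g. 'C,10'.
    coeff = float(parts[2]) if len(parts) > 2 else 0.0

    shapes = {
        "C": lambda x: 0.0,  # constant: the coefficient has no effect
        "L": lambda x: x,    # linear
        "S": math.sqrt,      # square root
        "G": math.log,       # natural logarithm
    }
    if kind not in shapes:
        raise ValueError(f"unknown function type: {kind!r}")
    shape = shapes[kind]
    return lambda x: const + coeff * shape(x)


if __name__ == "__main__":
    n_ceil = parse_func("L,0,0.15")
    # Value of the default --n-ceil function for a 100-base read.
    print(n_ceil(100))
```

The same notation appears in other defaults in this listing, for example `G,-8,1` and `L,0.0,-0.2`.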
- Spliced Alignment:
- --pen-cansplice
<int>
- penalty for a canonical splice site (0)
- --pen-noncansplice
<int>
- penalty for a non-canonical splice site (12)
- --pen-canintronlen
<func>
- penalty for long introns (G,-8,1) with canonical splice sites
- --pen-noncanintronlen
<func>
- penalty for long introns (G,-8,1) with noncanonical splice sites
- --min-intronlen
<int>
- minimum intron length (20)
- --max-intronlen
<int>
- maximum intron length (500000)
- --known-splicesite-infile
<path>
- provide a list of known splice sites
- --novel-splicesite-outfile
<path>
- report a list of splice sites
- --novel-splicesite-infile
<path>
- provide a list of novel splice sites
- --no-temp-splicesite
- disable the use of splice sites found
- --no-spliced-alignment
- disable spliced alignment
- --rna-strandness
<string>
- specify strand-specific information (unstranded)
- --tmo
- reports only those alignments within known transcriptome
- --dta
- reports alignments tailored for transcript assemblers
- --dta-cufflinks
- reports alignments tailored specifically for cufflinks
- --avoid-pseudogene
- tries to avoid aligning reads to pseudogenes (experimental option)
- --no-templatelen-adjustment
- disables template length adjustment for RNA-seq reads
- Scoring:
- --mp
<int>,<int>
- max and min penalties for mismatch; lower qual = lower penalty
<6,2>
- --sp
<int>,<int>
- max and min penalties for soft-clipping; lower qual = lower penalty
<2,1>
- --no-softclip
- no soft-clipping
- --np <int>
- penalty for non-A/C/G/Ts in read/ref (1)
- --rdg
<int>,<int>
- read gap open, extend penalties (5,3)
- --rfg
<int>,<int>
- reference gap open, extend penalties (5,3)
- --score-min
<func>
- min acceptable alignment score w/r/t read length (L,0.0,-0.2)
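
To make the scoring defaults concrete, here is a minimal sketch of how the penalties could combine. It rests on assumptions taken from the Bowtie 2 style scoring scheme rather than from this listing: a perfect alignment scores 0, a mismatch at quality Q costs MN + floor((MX - MN) * min(Q, 40) / 40), and a gap of length N costs open + N * extend. Splice-site, soft-clipping and N penalties are ignored, and all function names are hypothetical.

```python
import math
from typing import Iterable


def mismatch_penalty(q: float, mx: int = 6, mn: int = 2) -> int:
    """Penalty for one mismatch at Phred quality q (--mp defaults 6,2)."""
    return mn + math.floor((mx - mn) * (min(q, 40.0) / 40.0))


def gap_penalty(length: int, open_pen: int = 5, extend_pen: int = 3) -> int:
    """Penalty for one gap of the given length (--rdg/--rfg defaults 5,3)."""
    return open_pen + length * extend_pen


def min_score(read_len: int, const: float = 0.0, coeff: float = -0.2) -> float:
    """Threshold from --score-min with the default linear function L,0.0,-0.2."""
    return const + coeff * read_len


def alignment_score(mismatch_quals: Iterable[float], gap_lengths: Iterable[int] = ()) -> int:
    """Score of an alignment, assuming a perfect match scores 0."""
    penalty = sum(mismatch_penalty(q) for q in mismatch_quals)
    penalty += sum(gap_penalty(n) for n in gap_lengths)
    return -penalty


if __name__ == "__main__":
    threshold = min_score(100)
    for quals in ([40, 40, 40], [40, 40, 40, 40]):
        score = alignment_score(quals)
        print(score, "accepted" if score >= threshold else "rejected")
```

Under these assumptions a 100-base read has a threshold of -20, so three high-quality mismatches (score -18) pass while four (score -24) do not.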
- Reporting:
- -k <int>
- report up to <int> alns per read (default: 5)
- Paired-end:
- -I/--minins
<int>
- minimum fragment length (0), only valid with
--no-spliced-alignment
- -X/--maxins
<int>
- maximum fragment length (500), only valid with
--no-spliced-alignment
- --fr/--rf/--ff
- -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
- --no-mixed
- suppress unpaired alignments for paired reads
- --no-discordant
- suppress discordant alignments for paired reads
- Output:
- -t/--time
- print wall-clock time taken by search phases
- --un <path>
- write unpaired reads that didn't align to <path>
- --al <path>
- write unpaired reads that aligned at least once to <path>
- --un-conc
<path>
- write pairs that didn't align concordantly to <path>
- --al-conc
<path>
- write pairs that aligned concordantly at least once to <path>
- (Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the
option name, e.g. --un-gz <path>, to gzip compress output, or add
'-bz2' to bzip2 compress output.)
- --summary-file
- print alignment summary to this file.
- --new-summary
- print alignment summary in a new style, which is more
machine-friendly.
- --quiet
- print nothing to stderr except serious errors
- --met-file <path>
- send metrics to file at <path> (off)
- --met-stderr
- send metrics to stderr (off)
- --met <int>
- report internal counters & metrics every <int> secs (1)
- --no-head
- suppress header lines, i.e. lines starting with @
- --no-sq
- suppress @SQ header lines
- --rg-id <text>
- set read group id, reflected in @RG line and RG:Z: opt field
- --rg <text>
- add <text> ("lab:value") to @RG line of SAM header.
- Note: @RG line only printed when --rg-id is set.
- --omit-sec-seq
- put '*' in SEQ and QUAL fields for secondary alignments.
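
The naming rule for compressed read output and the read-group options can be sketched as follows. Both helpers are hypothetical and only build argument lists. The field names in the usage example (`SM`, `PL`) are ordinary SAM read-group tags chosen for illustration, not values taken from this listing.

```python
from typing import Dict, List

READ_OUTPUT_OPTIONS = ("un", "al", "un-conc", "al-conc")
COMPRESSION_SUFFIXES = ("", "gz", "bz2")


def read_output_args(base: str, path: str, compress: str = "") -> List[str]:
    """Build e.g. ['--un-gz', path] from the base option name and a suffix."""
    if base not in READ_OUTPUT_OPTIONS:
        raise ValueError(f"not a read-output option: {base!r}")
    if compress not in COMPRESSION_SUFFIXES:
        raise ValueError(f"unsupported compression suffix: {compress!r}")
    name = "--" + base + ("-" + compress if compress else "")
    return [name, path]


def read_group_args(rg_id: str, fields: Dict[str, str]) -> List[str]:
    """Build --rg-id plus one --rg 'key:value' entry per field.

    --rg-id always comes first because the @RG line is only printed
    when --rg-id is set.
    """
    args = ["--rg-id", rg_id]
    for key, value in fields.items():
        args += ["--rg", f"{key}:{value}"]
    return args


if __name__ == "__main__":
    print(read_output_args("un-conc", "unaligned_pairs.fq.gz", compress="gz"))
    print(read_group_args("run1", {"SM": "sampleA", "PL": "ILLUMINA"}))
```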
- Performance:
- -o/--offrate <int>
- override offrate of index; must be >= index's offrate
- -p/--threads <int>
- number of alignment threads to launch (1)
- --reorder
- force SAM output order to match order of input reads
- --mm
- use memory-mapped I/O for index; many 'hisat2's can share
- Other:
- --qc-filter
- filter out reads that are bad according to QSEQ filter
- --seed
<int>
- seed for random number generator (0)
- --non-deterministic
- seed rand. gen. arbitrarily instead of using read attributes
- --remove-chrname
- remove 'chr' from reference names in alignment
- --add-chrname
- add 'chr' to reference names in alignment
- --version
- print version information and quit
- -h/--help
- print this usage message
64-bit
Built on Debian
24 September 2018
Compiler: gcc version 8.2.0 (Debian 8.2.0-7)
Options: -O3 -funroll-loops -g3 -Wdate-time -D_FORTIFY_SOURCE=2 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}