sam-stats - ea-utils: produce digested statistics
sam-stats [options] [file1] [file2...filen]
Version: 1.38.681
Produces lots of easily digested statistics for the files listed.
Options (default in parens):
-D          Keep track of multiple alignments
-O PREFIX   Output prefix enabling extended output (see below)
-R FIL      Coverage/RNA output (coverage, 3' bias, etc, implies -A)
-A          Report all chr sigs, even if there are more than 1000
-b INT      Number of reads to sample for per-base stats (1M)
-S INT      Size of ascii-signature (30)
-x FIL      File extension for handling multiple files (stats)
-M          Only overwrite if newer (requires -x, or multiple files)
-B          Input is bam, don't bother looking at magic
-z          Don't fail when zero entries in sam
OUTPUT:
If one file is specified, the output goes to standard out. If
multiple files are specified, or if the -x option is supplied, the
output file is <filename>.<ext>. The default extension is 'stats'.
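The output rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not sam-stats code: the function name `output_target` is hypothetical, `None` stands in for standard out, and it assumes that `<filename>` means the full input name including any existing extension.

```python
def output_target(input_files, ext=None):
    """Map each input file to its stats destination (None means standard out).

    `ext` models the value given to -x; None means -x was not supplied.
    """
    # One file and no -x: results go to standard out.
    if len(input_files) == 1 and ext is None:
        return {input_files[0]: None}
    # Otherwise each input gets <filename>.<ext>, with 'stats' as the default.
    suffix = ext if ext is not None else "stats"
    return {name: f"{name}.{suffix}" for name in input_files}


print(output_target(["a.bam"]))
print(output_target(["a.sam", "b.sam"]))
```

Under these assumptions, a single input with -x still produces a file rather than writing to standard out.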
Complete Stats:
- <STATS>: mean, max, stdev, median, Q1 (25 percentile), Q3
- reads: # of entries in the sam file, might not be # reads
- phred: phred scale used
- bsize: # reads used for qual stats
- mapped reads: number of aligned reads (unique probe id sequences)
- mapped bases: total of the lengths of the aligned reads
- forward: number of forward-aligned reads
- reverse: number of reverse-aligned reads
- snp rate: mismatched bases / total bases (snv rate)
- ins rate: insert bases / total bases
- del rate: deleted bases / total bases
- pct mismatch: percent of reads that have mismatches
- pct align: percent of reads that aligned
- len <STATS>: read length stats, ignored if fixed-length
- mapq <STATS>: stats for mapping qualities
- insert <STATS>: stats for insert sizes
- %<CHR>: percentage of mapped bases per chr, followed by a signature
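To make the six <STATS> fields concrete, here is a minimal Python sketch that computes them for a list of values such as mapping qualities. It is not the tool's implementation: the function name `summarize` is hypothetical, and the choice of sample standard deviation and the "inclusive" quartile method are assumptions, since the text above does not say which definitions sam-stats uses.

```python
import statistics


def summarize(values):
    """Return mean, max, stdev, median, Q1 and Q3 for a list of numbers."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Assumption: quartiles by linear interpolation over the observed range.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # assumption: sample stdev
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


mapq = [0, 10, 20, 30, 40]
for name, value in summarize(mapq).items():
    print(name, value)
```

Small differences from the numbers sam-stats reports are to be expected if it uses another quartile or stdev definition.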
Subsampled stats (1M reads max):
- base qual <STATS>: stats for base qualities
- %A,%T,%C,%G: base percentages
Per-chromosome signature:
- An ASCII histogram of mapped reads by chromosome position. It is only
output if the original SAM/BAM has a header. Each value is the log2 of
the # of mapped reads at that position + ascii '0'.
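The encoding described above (log2 of the read count, added to ascii '0') can be illustrated with a short Python sketch. The function names `signature` and `decode` are hypothetical, and two details are assumptions not stated in the text: the log2 value is floored to an integer, and positions with no reads are shown as '0'.

```python
import math


def signature(bin_counts):
    """Encode per-position mapped-read counts as one ASCII character each."""
    chars = []
    for count in bin_counts:
        # Assumption: floor the log2 value; empty positions map to '0'.
        level = int(math.log2(count)) if count > 0 else 0
        chars.append(chr(ord("0") + level))
    return "".join(chars)


def decode(sig):
    """Return 2**level for each character, a rough lower bound on the reads."""
    return [2 ** (ord(c) - ord("0")) for c in sig]


sig = signature([0, 1, 2, 7, 8, 1000, 5000])
print(sig)          # prints 001239<
print(decode(sig))  # prints [1, 1, 2, 4, 8, 512, 4096]
```

Because the offset starts at '0', levels above 9 run past the digits into the following ASCII characters (':', ';', '<', and so on), so a '<' in a signature would correspond to at least 4096 reads under these assumptions.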
Extended output (-O PREFIX) files:
- .stats: primary output
- .fastx: fastx-toolkit compatible output
- .rcov: per-reference counts & coverage
- .xdist: mismatch distribution
- .ldist: length distribution (if applicable)
- .mqdist: mapping quality distribution