NAME

parseAlignment - pre-compute probabilities of (observed) read alignments

SYNOPSIS

parseAlignment -o <outFileName> -s <trSeqFileName> [OPTIONS] [alignment file]

DESCRIPTION

Pre-computes probabilities of (observed) read alignments.

: [alignment file] should be in either SAM or BAM format.
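As an illustration of the synopsis, here is a minimal sketch of assembling an invocation from a Python pipeline script. The helper name `build_parse_alignment_cmd` and all file names are hypothetical; only options documented on this page are used, and the command is printed rather than executed.

```python
import shlex


def build_parse_alignment_cmd(alignment_file, out_file, tr_seq_file,
                              tr_info_file=None, uniform=False, proc_n=None):
    """Assemble an argument list following the SYNOPSIS line.

    Hypothetical helper: it only arranges options documented in this page.
    """
    cmd = ["parseAlignment", "-o", out_file, "-s", tr_seq_file]
    if tr_info_file is not None:
        cmd += ["-t", tr_info_file]      # save transcript information
    if uniform:
        cmd.append("--uniform")          # use the uniform read distribution
    if proc_n is not None:
        cmd += ["-P", str(proc_n)]       # maximum number of threads
    cmd.append(alignment_file)           # SAM or BAM file comes last
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_parse_alignment_cmd("reads.bam", "reads.prob", "transcripts.fa",
                                tr_info_file="reads.tr", proc_n=4)
print(shlex.join(cmd))
# parseAlignment -o reads.prob -s transcripts.fa -t reads.tr -P 4 reads.bam
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where parseAlignment is installed.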

OPTIONS

--help

: Show this help information.

--distributionFile=<distributionFileName>

: Name of file to which read-distribution should be saved.

--excludeSingletons

: Exclude single-mate alignments for paired-end reads. (default: Off)

-e <expFileName> , --expressionFile=<expFileName>

: Relative transcript expression estimates, used for better estimation of the non-uniform read distribution.

--failed=<failed>

: Name of file in which to save the names of reads that failed to align.

-f <format> , --format=<format>

: Input format: either SAM or BAM.

--lenMu=<lenMu>

: Set mean of log fragment length distribution. (l_frag ~ LogNormal(mu,sigma^2))

--lenSigma=<lenSigma>

: Set sigma^2 (or variance) of log fragment length distribution. (l_frag ~ LogNormal(mu,sigma^2))
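To make the parameterisation l_frag ~ LogNormal(mu,sigma^2) concrete, the following sketch evaluates a log-normal density for given values of the two options. It takes the option text at face value, so `len_sigma2` is treated as the variance of the log fragment length. The function name and the numeric values are hypothetical and are not defaults of parseAlignment.

```python
import math


def lognormal_pdf(length, len_mu, len_sigma2):
    """Density of LogNormal(len_mu, len_sigma2) at a fragment length.

    len_mu     -- mean of the log fragment length (as in --lenMu)
    len_sigma2 -- variance of the log fragment length (as in --lenSigma)
    """
    if length <= 0:
        return 0.0  # the log-normal distribution has no mass at or below zero
    z = (math.log(length) - len_mu) ** 2 / (2.0 * len_sigma2)
    return math.exp(-z) / (length * math.sqrt(2.0 * math.pi * len_sigma2))


# Hypothetical values: median fragment length 200, log-scale variance 0.09.
len_mu = math.log(200.0)
len_sigma2 = 0.09

median = math.exp(len_mu)                   # median of the distribution
mean = math.exp(len_mu + len_sigma2 / 2.0)  # mean is larger than the median

print(f"median={median:.1f} mean={mean:.1f}")
print(f"pdf(200)={lognormal_pdf(200, len_mu, len_sigma2):.6f}")
```

Because the distribution is defined on the log scale, mu is the log of the median fragment length rather than the mean length itself.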

--mateNamesDiffer

: Mates from paired-end reads have different names. (default: Off)

-l <maxAlignments> , --limitA=<maxAlignments>

: Limit maximum number of alignments per read. (Reads with more alignments are skipped.)

--noiseMismatches=<numNoiseMismatches>

: Number of mismatches to be considered as noise. (default: 6)

-o <outFileName> , --outFile=<outFileName>

: Name of the output file.

-P <procN> , --procN=<procN>

: Maximum number of threads to be used. This provides a speedup mostly when using the non-uniform read distribution model (i.e. without the --uniform flag). (default: 4)

-N <readsN> , --readsN=<readsN>

: Total number of reads. This is not necessary if the [SB]AM file also contains reads with no valid alignments.

--show1warning

: Show first alignments that are considered wrong (TID unknown, TID mismatch, wrong strand). (default: Off)

-t <trInfoFileName> , --trInfoFile=<trInfoFileName>

: File in which to save transcript information extracted from the [BS]AM file and the reference.

-s <trSeqFileName> , --trSeqFile=<trSeqFileName>

: Transcript sequences in FASTA format, used for non-uniform read distribution estimation.

--trSeqHeader=<trSeqHeader>

: Transcript sequence header format, which enables gene name extraction (standard/gencode). (default: standard)

--uniform

: Use uniform read distribution. (default: Off)

--unstranded

: Paired reads are not strand-specific. (default: Off)

-v , --verbose

: Verbose output. (default: Off)

-V , --veryVerbose

: Very verbose output. (default: Off)

September 2021

parseAlignment 0.7.5+dfsg