FASTQ-MCF(1) | User Commands | FASTQ-MCF(1) |
fastq-mcf - ea-utils: detect levels of adapter presence, compute likelihoods and locations of the adapters
fastq-mcf [options] <adapters.fa> <reads.fq> [mates1.fq ...]
Version: 1.04.676
Detects levels of adapter presence, computes likelihoods and locations (start, end) of the adapters. Removes the adapter sequences from the fastq file(s).
Stats go to stderr, unless -o is specified.
Specify -0 to turn off all default settings
If you specify multiple 'paired-end' inputs, then a -o option is required for each, e.g.: -o read1.clip.fq -o read2.clip.fq
If an option is given with the mate- prefix, it applies to the second non-barcode read only.
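The one-output-per-input rule for paired-end runs can be checked before launching the tool. The following is a minimal sketch, not part of fastq-mcf: the helper name build_fastq_mcf_argv and the file names are hypothetical, and only the -o option and the argument order from the synopsis are taken from this page.

```python
def build_fastq_mcf_argv(adapters, inputs, outputs):
    """Build an argument list for a fastq-mcf run (hypothetical helper)."""
    paired = len(inputs) > 1
    # Multiple inputs require one -o per input; if any -o is given,
    # the counts must line up.
    if (paired or outputs) and len(outputs) != len(inputs):
        raise ValueError("need exactly one -o output per input file")
    argv = ["fastq-mcf"]
    for out in outputs:
        argv += ["-o", out]
    # Options come first, then the adapter file, then the read files.
    argv += [adapters, *inputs]
    return argv


argv = build_fastq_mcf_argv(
    "adapters.fa",
    ["read1.fq", "read2.fq"],
    ["read1.clip.fq", "read2.clip.fq"],
)
print(" ".join(argv))
```

The resulting list can be handed to subprocess.run() unchanged; building it as a list avoids shell quoting problems with unusual file names.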
Adapter files are 'fasta' formatted.
Specify n/a in place of the adapter file to turn off adapter clipping and use only the filters.
Increasing the scale makes recognition-lengths longer; a scale of 100 will force full-length recognition of adapters.
Adapter sequences with _5p in their label will match 'end's, and sequences with _3p in their label will match 'start's, otherwise the 'end' is auto-determined.
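To illustrate the label convention, here is a small sketch that reads a fasta-formatted adapter list and reports which rule each label would select. It mirrors the wording above and is not taken from the fastq-mcf source; the adapter names, the sequences, and the functions parse_fasta and match_mode are placeholders invented for this example.

```python
# Placeholder adapter file; labels and sequences are illustrative only.
ADAPTERS_FA = """\
>example_adapter_5p
AGATCGGAAGAGC
>example_adapter_3p
CTGTCTCTTATAC
>example_adapter
TGGAATTCTCGGG
"""


def parse_fasta(text):
    """Return a list of (label, sequence) pairs from fasta text."""
    records = []
    label, seq = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                records.append((label, "".join(seq)))
            label, seq = line[1:], []
        else:
            seq.append(line)
    if label is not None:
        records.append((label, "".join(seq)))
    return records


def match_mode(label):
    """Map a label to the rule described above."""
    if "_5p" in label:
        return "end"
    if "_3p" in label:
        return "start"
    return "auto"  # the 'end' is auto-determined


modes = {label: match_mode(label) for label, _ in parse_fasta(ADAPTERS_FA)}
for label, mode in modes.items():
    print(label, mode)
```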
Skew is when one cycle is poor, i.e. 'skewed' toward a particular base. If any nucleotide's percentage in a cycle is less than the skew percentage, then the whole cycle is removed. Disable for methyl-seq, etc.
Set the skew (-k) or N-pct (-x) to 0 to turn it off (should be done for miRNA, amplicon and other low-complexity situations!)
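The skew rule can be sketched as a per-cycle base count. This is only an illustration under stated assumptions: percentages are computed over the A/C/G/T calls seen at each cycle, all reads are inspected, and the threshold of 10 is arbitrary. The function name skewed_cycles is hypothetical, and the sampling and counting fastq-mcf actually performs may differ.

```python
from collections import Counter


def skewed_cycles(reads, skew_pct):
    """Return 0-based cycle indexes where any base falls below skew_pct."""
    # A skew percentage of 0 turns the check off.
    if skew_pct <= 0 or not reads:
        return []
    flagged = []
    for i in range(max(len(r) for r in reads)):
        counts = Counter(r[i] for r in reads if i < len(r))
        total = sum(counts[b] for b in "ACGT")
        if total == 0:
            continue
        # One under-represented nucleotide is enough to flag the cycle.
        if any(100.0 * counts[b] / total < skew_pct for b in "ACGT"):
            flagged.append(i)
    return flagged


reads = ["ACGT", "CATT", "GTAT", "TGCT"]
print(skewed_cycles(reads, 10))  # the last cycle is all T
print(skewed_cycles(reads, 0))   # check disabled
```

In low-complexity libraries such as amplicons, many cycles are legitimately dominated by one base, which is why the text above recommends setting the skew to 0 in those cases.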
Duplicate read filtering is appropriate for assembly tasks, and never when read length < expected coverage. -D 50 will use 4.5 GB of RAM on 100 million DNA reads, so be careful. Great for RNA assembly.
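The memory figure above works out to roughly 45 bytes per read (4.5 GB divided by 100 million reads), which suggests that a key is kept for every distinct read seen. The sketch below shows that general idea. It assumes that the number given to -D is the count of leading bases compared, which this page does not state, and the function drop_duplicates is a hypothetical stand-in rather than the tool's implementation.

```python
def drop_duplicates(reads, n):
    """Keep the first read for each distinct n-base prefix (assumed rule)."""
    if n <= 0:
        return list(reads)  # filtering disabled
    seen = set()
    kept = []
    for read in reads:
        key = read[:n]
        if key in seen:
            continue
        # One stored key per distinct read is what drives memory use.
        seen.add(key)
        kept.append(read)
    return kept


reads = ["ACGTACGT", "ACGTTTTT", "ACGTACGA", "GGGGACGT"]
unique = drop_duplicates(reads, 6)
print(unique)

# Per-read memory implied by the figures quoted above.
bytes_per_read = 4.5e9 / 100e6
print(bytes_per_read)
```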
*Quality filters are evaluated after clipping/trimming
Homopolymer filtering is a subset of low-complexity, but will not be separately tracked unless both are turned on.
July 2015 | fastq-mcf 1.1.2 |