obiclean - description of obiclean
obiclean is a command that classifies each sequence record
as head, internal or singleton.
By default, tagging is done once for the whole dataset, but it can
also be done sample by sample by specifying the -s option. In such a
case, the counts are extracted from the sample information.
Finally, each sequence record is annotated with three new
attributes: head, internal and singleton. Each attribute
value is the number of samples in which the sequence record has been
classified with that status.
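As a rough illustration of the annotation step described above, the following sketch counts, for a single sequence record, the number of samples in which it received each status. The status_by_sample mapping and the count_statuses helper are hypothetical names chosen for this example; they are not part of OBITools, and the real tool may store this information differently.

```python
from collections import Counter

STATUSES = ("head", "internal", "singleton")


def count_statuses(status_by_sample):
    """Return, for one sequence record, the number of samples per status.

    status_by_sample is a hypothetical mapping from a sample name to the
    status ("head", "internal" or "singleton") assigned in that sample.
    """
    counts = Counter(status_by_sample.values())
    # Always report all three attributes, including those with a zero count.
    return {status: counts.get(status, 0) for status in STATUSES}


example = {"sample_A": "head", "sample_B": "internal", "sample_C": "head"}
print(count_statuses(example))
# prints {'head': 2, 'internal': 1, 'singleton': 0}
```

When tagging is done once for the whole dataset, the mapping would hold a single entry, so exactly one of the three values would be 1.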
- -r <FLOAT>,
--ratio=<FLOAT>
- Threshold ratio between the counts (rare/abundant) of two sequence
records for the less abundant one to be considered a variant of the more
abundant one (default: 1, i.e. all less abundant sequences are variants).
- -C, --cluster
- Switch obiclean into its clustering mode. This adds information to
each sequence about the true.
- -H, --head
- Select only sequences with the head status in at least one sample.
- -g, --graph
- Creates a file containing the set of directed acyclic graphs (DAGs) used
by the obiclean clustering algorithm. The graph file follows the dot format.
- --skip
<N>
- The first N sequence records of the file are discarded from the analysis
and not reported to the output file.
- --only
<N>
- Only the next N sequence records of the file are analyzed. The following
sequences in the file are neither analyzed nor reported to the output
file. This option can be used together with the --skip
option.
- --embl
- Input file is in embl format.
- --fasta
- Input file is in fasta format (including OBITools fasta extensions).
- --sanger
- Input file is in Sanger fastq format (standard fastq used by HiSeq/MiSeq
sequencers).
- --solexa
- Input file is in fastq format produced by Solexa (Ga IIx) sequencers.
- --ecopcr
- Input file is in ecoPCR format.
- --nuc
- Input file contains nucleic sequences.
- --prot
- Input file contains protein sequences.
- --fasta-output
- Output sequences in OBITools fasta format.
- --ecopcrdb-output=<PREFIX_FILENAME>
- Creates an ecoPCR database from the resulting sequence records.
- --uppercase
- Print sequences in upper case (default is lower case).
- --DEBUG
- Sets logging in debug mode.
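
The -r/--ratio rule described in the options above can be sketched as a simple comparison of counts. This is a minimal illustration and not the obiclean implementation: the function name passes_ratio is hypothetical, whether the comparison at the threshold is strict or inclusive is an assumption, and any other criteria obiclean uses to relate two sequence records are not modelled here.

```python
def passes_ratio(count_a, count_b, ratio=1.0):
    """Compare the count ratio (rare/abundant) of two records to a threshold.

    Assumption: the comparison is inclusive (<=); the documentation does not
    state whether the threshold itself is included.
    """
    rare, abundant = sorted((count_a, count_b))
    if rare <= 0:
        raise ValueError("counts must be positive")
    return rare / abundant <= ratio


print(passes_ratio(5, 100, ratio=0.1))   # 5/100 = 0.05, prints True
print(passes_ratio(20, 100, ratio=0.1))  # 20/100 = 0.20, prints False
print(passes_ratio(20, 100))             # default ratio of 1, prints True
```

With the default ratio of 1, every pair of positive counts passes the test, which matches the statement that all less abundant sequences are then treated as variants.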
obiclean adds the following attributes to sequence records:
- obiclean_cluster
- obiclean_count
- obiclean_head
- obiclean_headcount
- obiclean_internalcount
- obiclean_samplecount
- obiclean_singletoncount
- obiclean_status
The OBITools Development Team - LECA
2019 - 2015, OBITool Development Team