UCHIME(1) User Commands UCHIME(1)

uchime - reads a fasta file and reference file and outputs potentially chimeric sequences

The chimera.uchime command reads a fasta file and reference file and outputs potentially chimeric sequences. The original uchime program was written by Robert C. Edgar.

uchime --input query.fasta [--db db.fasta] [--uchimeout results.uchime] [--uchimealns results.alns]
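The two modes in the synopsis differ only in whether --db is given. As a minimal sketch, the argument list could be assembled as below; the helper name build_uchime_argv is hypothetical, and only the options shown in the synopsis are used.

```python
def build_uchime_argv(query, db=None, uchimeout=None, uchimealns=None):
    """Assemble a uchime argument list from the synopsis options.

    Hypothetical helper: it only builds the list and does not run uchime.
    """
    argv = ["uchime", "--input", query]
    if db is not None:
        # Reference mode; leaving --db out selects de novo detection.
        argv += ["--db", db]
    if uchimeout is not None:
        argv += ["--uchimeout", uchimeout]
    if uchimealns is not None:
        argv += ["--uchimealns", uchimealns]
    return argv


reference_cmd = build_uchime_argv("query.fasta", db="db.fasta",
                                  uchimeout="results.uchime")
denovo_cmd = build_uchime_argv("query.fasta", uchimeout="results.uchime")
```

The resulting list can be passed to subprocess.run on a system where uchime is installed.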

--input filename

Query sequences in FASTA format. If the --db option is not specified, uchime uses de novo detection. In de novo mode, relative abundance must be given by a string /ab=xxx/ somewhere in the label, where xxx is a floating-point number, e.g. >F00QGH67HG/ab=1.2/.
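To check that labels carry a usable abundance before running in de novo mode, the /ab=xxx/ annotation can be read back out of each label. This is a sketch only; the function name abundance_from_label is an assumption, and uchime's own parser may accept or reject labels differently.

```python
import re

# Matches the /ab=xxx/ annotation anywhere in a FASTA label.
AB_PATTERN = re.compile(r"/ab=([^/]+)/")


def abundance_from_label(label):
    """Return the abundance given as /ab=xxx/ in a FASTA label."""
    match = AB_PATTERN.search(label)
    if match is None:
        raise ValueError("no /ab=xxx/ annotation in label: " + label)
    # float() raises ValueError if xxx is not a floating-point number.
    return float(match.group(1))
```

For the example label above, abundance_from_label(">F00QGH67HG/ab=1.2/") gives 1.2.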

--db filename

Reference database in FASTA format. Optional; if not specified, uchime uses de novo mode.
***WARNING*** The database is searched ONLY on the plus strand. You MUST include reverse-complemented sequences in the database if you want both strands to be searched.
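One way to follow the warning above is to append a reverse-complemented copy of every reference sequence before building the database. The sketch below assumes records are held as (label, sequence) pairs; the "_rc" label suffix is a hypothetical convention, and characters other than A, C, G and T are left unchanged.

```python
# Translation table for the four unambiguous bases, both cases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a nucleotide sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def with_reverse_complements(records, suffix="_rc"):
    """Return each (label, sequence) record followed by its reverse complement."""
    out = []
    for label, seq in records:
        out.append((label, seq))
        out.append((label + suffix, reverse_complement(seq)))
    return out
```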

--abskew x

Minimum abundance skew. Default 1.9. De novo mode only. Abundance skew is:
min [ abund(parent1), abund(parent2) ] / abund(query).
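The skew formula above is simple enough to state as code. In this sketch the function names are hypothetical, and treating the threshold as inclusive (>=) is an assumption; the manual only says the value is a minimum.

```python
def abundance_skew(parent1_abund, parent2_abund, query_abund):
    """min[abund(parent1), abund(parent2)] / abund(query)."""
    return min(parent1_abund, parent2_abund) / query_abund


def passes_abskew(parent1_abund, parent2_abund, query_abund, abskew=1.9):
    """True if the skew reaches the --abskew minimum (inclusive by assumption)."""
    return abundance_skew(parent1_abund, parent2_abund, query_abund) >= abskew
```

With parent abundances 10 and 4 and a query abundance of 2, the skew is 4 / 2 = 2.0, which meets the default of 1.9.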

--uchimeout filename

Output in tabbed format with one record per query sequence. First field is score (h), second field is query label. For details, see manual.
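Because only the first two fields are described here, a reader for the --uchimeout file can be sketched using just the score and the query label. The function name read_scores is hypothetical, the sample records are made up for illustration, and any further fields are ignored.

```python
import io


def read_scores(handle):
    """Yield (score, query_label) from tab-separated uchimeout records."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        yield float(fields[0]), fields[1]


# Made-up records, not real uchime output.
sample = io.StringIO("0.0000\tseqA/ab=5.0/\n1.2500\tseqB/ab=1.0/\n")
records = list(read_scores(sample))

# Keep queries whose score reaches an illustrative cutoff of 0.3.
flagged = [label for score, label in records if score >= 0.3]
```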

--uchimealns filename

Multiple alignments of query sequences to parents in human-readable format. Alignments show columns with differences that support or contradict a chimeric model.

--minh h

Minimum score to report chimera. Default 0.3. Values from 0.1 to 5 might be reasonable. Lower values increase sensitivity but may report more false positives. If you decrease --xn, you may need to increase --minh, and vice versa.

--mindiv div

Minimum divergence ratio. Default 0.5. The divergence ratio is 100% minus the percent identity between the query sequence and its closest candidate parent. If you do not care about very close chimeras, you could increase --mindiv to, say, 1.0 or 2.0, and also decrease --minh, say to 0.1, to increase sensitivity. How well this works depends on your data; it is best to tune parameters on a good benchmark.
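The divergence definition can be written out directly. In this sketch the function names are hypothetical, and the inclusive comparison is an assumption.

```python
def divergence(percent_identity):
    """Divergence: 100% minus percent identity to the closest candidate parent."""
    return 100.0 - percent_identity


def passes_mindiv(percent_identity, mindiv=0.5):
    """True if the divergence reaches the --mindiv minimum (inclusive by assumption)."""
    return divergence(percent_identity) >= mindiv
```

A query that is 99.0% identical to its closest candidate parent has divergence 1.0 and passes the default; one at 99.8% identity falls below 0.5 and does not.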

--xn beta

Weight of a no vote, also called the beta parameter. Default 8.0. Decreasing this weight to around 3 or 4 may give better performance on denoised data.

--dn n

Pseudo-count prior on number of no votes. Default 1.4. Probably no good reason to change this unless you can retune to a good benchmark for your data. Reasonable values are probably in the range from 0.2 to 2.

--xa w

Weight of an abstain vote. Default 1. So far, results do not seem to be very sensitive to this parameter, but if you have a good training set it might be worth trying other values. Reasonable values might range from 0.1 to 2.

--chunks n

Number of chunks to extract from the query sequence when searching for parents. Default 4.

--[no]ovchunks

[Do not] use overlapping chunks. Default do not.

--minchunk n

Minimum length of a chunk. Default 64.

--idsmoothwindow w

Length of id smoothing window. Default 32.

--minsmoothid f

Minimum fractional identity over smoothed window of candidate parent. Default 0.95.

--maxp n

Maximum number of candidate parents to consider. Default 2. In tests so far, increasing --maxp gives only a very small improvement in sensitivity but tends to increase the error rate quite a bit.

--[no]skipgaps --[no]skipgaps2

These options control how gapped columns affect counting of diffs. If --skipgaps is specified, columns containing gaps are not counted as diffs. If --skipgaps2 is specified, a column immediately adjacent to a column containing a gap is not counted as a diff. Default is --skipgaps --skipgaps2.
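The effect of the two gap options can be illustrated on a pair of aligned strings. This is a simplified sketch: uchime itself counts diffs on an alignment of the query with its parents, whereas the hypothetical count_diffs below compares only two rows and assumes "-" is the gap character.

```python
def count_diffs(a, b, skipgaps=True, skipgaps2=True, gap="-"):
    """Count differing columns between two aligned rows of equal length."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    # Mark columns in which either row has a gap.
    has_gap = [x == gap or y == gap for x, y in zip(a, b)]
    diffs = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            continue
        if skipgaps and has_gap[i]:
            continue  # --skipgaps: gapped column is not a diff
        if skipgaps2:
            left = i > 0 and has_gap[i - 1]
            right = i + 1 < len(a) and has_gap[i + 1]
            if left or right:
                continue  # --skipgaps2: column next to a gap is not a diff
        diffs += 1
    return diffs


row_a = "ACGT-TCGT"
row_b = "ACCTAACGT"
```

For these rows, column 3 is an ordinary mismatch, column 5 holds a gap, and column 6 is a mismatch next to the gap. The defaults count 1 diff, --skipgaps alone counts 2, and with both options off all 3 columns count.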

--minlen L --maxlen L

Minimum and maximum sequence length. Defaults 10, 10000. Applies to both query and reference sequences.

--ucl

Use local-X alignments. Default is global-X. On tests so far, global-X is always better; this option is retained because it just might work well on some future type of data.

--queryfract f

Minimum fraction of the query sequence that must be covered by a local-X alignment. Default 0.5. Applies only when --ucl is specified.

--quiet

Do not display progress messages on stderr.

--log filename

Write miscellaneous information to the log file. Mostly of interest to me (the algorithm developer). Use --verbose to get more info.

--self

In reference database mode, exclude a reference sequence if it has the same label as the query. This is useful for benchmarking by using the ref db as a query to test for false positives.

The following on/off options are also listed; the only description given for each is the placeholder "help":

--[no]blast_termgaps
--[no]cartoon_orfs
--[no]denovo
--[no]fastalign
--[no]flushuc
--[no]isort
--[no]label_ab
--[no]leftjust
--[no]log_hothits
--[no]log_query
--[no]logmemgrows
--[no]logwordstats
--[no]minus_frames
--[no]nb
--[no]output_rejects
--[no]rev
--[no]rightjust
--[no]selfid
--[no]skipgaps
--[no]skipgaps2
--[no]ssort
--[no]stable_sort
--[no]trace
--[no]trunclabels
--[no]twohit
--[no]ucl
--[no]usort
--[no]verbose
--[no]wordcountreject
--[no]wordweight

The listing also contains these descriptions, whose option names are not shown:

Write info about compiler types and #defines to stdout.
Display command-line options.
Log file name.
Log options.
Turn off progress messages.
Show version and exit.

Robert C. Edgar

http://www.drive5.com/uchime

July 2013 uchime v4.2.40