RUN_ABUNDANCE.PY(1) User Commands RUN_ABUNDANCE.PY(1)

run_abundance.py - helper script to estimate the abundance at a given taxonomic level

usage: run_abundance.py [-h] [-v] [-A N] [-P N] [-F N] [--distance DISTANCE]
                        [-M DIAMETER] [-S DECOMP] [-p DIR] [-o OUTPUT]
                        [-d OUTPUT_DIR] [-t TREE] [-r RAXML] [-a ALIGN]
                        [-f FRAG] [-m MOLECULE] [-x N] [-cp CHCK_FILE]
                        [-cpi N] [-seed N] [-at N] [-pt N] [-g N] [-b N]
                        [-bin N] [-D] [-C N] [-G GENES]

This script runs the SEPP algorithm on an input tree, alignment, fragment file, and RAxML info file.

-h  Show this help message and exit.
-v  Show the program's version number and exit.

These options determine the alignment decomposition size and the taxon insertion size. If neither is given, the default is to align and place at 10% of the total number of taxa. The alignment decomposition size must be less than the taxon insertion size.
-A N  Maximum alignment subset size of N. [default: 10% of the total number of taxa, or the placement subset size if given]
-P N  Maximum placement subset size of N. [default: 10% of the total number of taxa or the alignment length, whichever is larger]
-F N  Maximum fragment chunk size of N. Helps control memory use. [default: 20000]
--distance DISTANCE  Minimum p-distance before stopping the decomposition. [default: 1]
-M DIAMETER  Maximum tree diameter before stopping the decomposition. [default: None]
-S DECOMP  Decomposition strategy. [default: using tree branch length]
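The default rules for the two subset sizes can be sketched in Python as follows. This is an illustration of the text above, not the program's own code: the function name resolve_subset_sizes is hypothetical, the rounding (integer division) is an assumption, and the check accepts equal sizes on the assumption that equality is allowed, since the documented defaults can coincide.

```python
def resolve_subset_sizes(total_taxa, alignment_size=None, placement_size=None):
    """Return (alignment_size, placement_size) following the documented defaults.

    Hypothetical helper: rounding and the handling of equal sizes are
    assumptions, not taken from the SEPP source code.
    """
    # 10% of the total number of taxa, never below 1 (rounding is assumed).
    ten_percent = max(1, total_taxa // 10)

    # -A default: the placement subset size if given, else 10% of the taxa.
    if alignment_size is None:
        alignment_size = placement_size if placement_size is not None else ten_percent

    # -P default: 10% of the taxa or the alignment size, whichever is larger.
    if placement_size is None:
        placement_size = max(ten_percent, alignment_size)

    # The alignment subset must not exceed the placement subset.
    if alignment_size > placement_size:
        raise ValueError("alignment subset size exceeds placement subset size")
    return alignment_size, placement_size


print(resolve_subset_sizes(500))                      # no -A, no -P
print(resolve_subset_sizes(500, alignment_size=20))   # only -A given
print(resolve_subset_sizes(500, placement_size=100))  # only -P given
```

For a backbone of 500 taxa, this sketch yields 50 for both sizes when neither option is given.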

These options control output.
-p DIR  Temporary files are written to DIR. A full path is required. [default: /tmp/sepp]
-o OUTPUT  Output files are written with the prefix OUTPUT. [default: output]
-d OUTPUT_DIR  Output is written to the OUTPUT_DIR directory. A full path is required. [default: .]

These options control input. To run SEPP, the following are required: a backbone tree (in Newick format), a RAxML_info file (generated by RAxML during estimation of the backbone tree; pplacer uses it to set model parameters), a backbone alignment file (in FASTA format), and a FASTA file of fragments. The input sequences are assumed to be DNA unless specified otherwise.
-t TREE  Input tree file (Newick format). [default: None]
-r RAXML  RAxML_info file including model parameters, generated by RAxML. [default: None]
-a ALIGN  Aligned FASTA file. [default: None]
-f FRAG  Fragment file. [default: None]
-m MOLECULE  Molecule type of the sequences. Can be amino, dna, or rna. [default: dna]
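As a rough illustration of how the input options fit together, the following Python sketch assembles a command line from the flags in the usage summary. The helper name build_abundance_command and all file names are hypothetical, and which inputs a particular run actually needs is not confirmed here.

```python
import os
import shlex


def build_abundance_command(tree, raxml_info, alignment, fragments,
                            molecule="dna", output_dir="."):
    """Assemble an argv list for run_abundance.py (hypothetical helper)."""
    # -m accepts only the three molecule types listed above.
    if molecule not in ("amino", "dna", "rna"):
        raise ValueError("molecule must be amino, dna, or rna")
    return [
        "run_abundance.py",
        "-t", tree,          # backbone tree, Newick format
        "-r", raxml_info,    # RAxML_info file from backbone tree estimation
        "-a", alignment,     # backbone alignment, FASTA format
        "-f", fragments,     # fragments to classify, FASTA format
        "-m", molecule,
        "-d", os.path.abspath(output_dir),  # -d requires a full path
    ]


# Placeholder file names, for illustration only.
cmd = build_abundance_command("backbone.tree", "RAxML_info.backbone",
                              "backbone.fasta", "reads.fasta")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where run_abundance.py is installed.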

These options control how SEPP is run.
-x N  Use N CPUs. [default: number of CPUs available on the machine]
-cp CHCK_FILE  Checkpoint file. [default: no checkpointing]
-cpi N  Interval (in seconds) between checkpoint writes. Has an effect only when -cp is provided. [default: 3600]
-seed N  Random seed number. [default: 297834]

These options are specific to TIPP.
-at N  Enough alignment subsets are selected to reach a cumulative probability of N. This should be a number between 0 and 1. [default: 0.95]
-pt N  Enough placements are selected to reach a cumulative probability of N. This should be a number between 0 and 1. [default: 0.95]
-g N  Classify on only the specified gene.
-b N  Blast file with fragments already binned.
-bin N  Tool for binning.
-D  Treat fragments as a distribution.
-C N  Placement probability requirement to count toward the distribution. This should be a number between 0 and 1. [default: 0.0]
-G GENES  Use markers or cogs genes. [default: markers]
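The cumulative-probability thresholds (-at and -pt) can be pictured with a small Python sketch: candidates are taken until their summed probability reaches N. This is only an illustration of the idea. The function name select_until_cumulative, the edge labels, and the choice to take candidates in descending order of probability are assumptions, not taken from the TIPP source code.

```python
def select_until_cumulative(candidates, threshold=0.95):
    """Pick (label, probability) pairs until their sum reaches threshold.

    Hypothetical helper: candidates are taken from most to least probable,
    which is an assumed ordering.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold should be a number between 0 and 1")
    selected = []
    total = 0.0
    for label, prob in sorted(candidates, key=lambda c: c[1], reverse=True):
        if total >= threshold:
            break  # enough probability mass has been collected
        selected.append((label, prob))
        total += prob
    return selected


# Made-up placement probabilities for one fragment.
placements = [("edge_a", 0.70), ("edge_b", 0.20), ("edge_c", 0.06), ("edge_d", 0.04)]
kept = select_until_cumulative(placements, threshold=0.95)
print([label for label, _ in kept])
```

With these made-up values, the three most probable placements are kept and the fourth is dropped, because the first three already exceed 0.95.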

run_sepp.py(1), run_tipp.py(1)

October 2020 run_abundance.py 4.3.10