alignment-thin(1)

alignment-thin - Remove sequences or columns from an alignment.

alignment-thin alignment-file [OPTIONS]

Remove sequences or columns from an alignment.

Print usage information.
Output more log messages on stderr.

--protect=arg  Sequences that cannot be removed (comma-separated).
--remove=arg  Remove sequences in comma-separated list arg.
--longer-than=arg  Remove sequences not longer than arg.
Remove sequences not shorter than arg.
--cutoff=arg  Remove similar sequences with #mismatches < arg.
--down-to=arg  Remove similar sequences down to arg sequences.
--remove-crazy=arg  Remove arg outlier sequences, defined as sequences that are missing too many conserved sites.
Fraction of sequences that must contain a letter for it to be considered conserved.

--min-letters=arg  Remove columns with fewer than arg letters.
Remove insertions in a single sequence if longer than arg letters.

Sort partially ordered columns to group similar gaps.
Just print out sequence lengths.
For each sequence, find the closest other sequence.

Remove columns without a minimum number of letters:

% alignment-thin --min-letters=5 file.fasta > file-thinned.fasta
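
To pick a sensible threshold, it can help to see how many letters each column holds before thinning. The following is a minimal sketch, not part of alignment-thin: column_letter_counts is a hypothetical helper, and it assumes that the input is FASTA, that all sequences have the same aligned width, and that only '-' counts as a gap.

```shell
# Hypothetical helper: print "column<TAB>letters" for each alignment column.
# Assumptions: FASTA input, equal-width sequences, '-' is the only gap character.
column_letter_counts() {
  awk '
    /^>/ { n++; next }                                  # a header starts a new sequence
         { gsub(/[ \t\r]/, ""); seq[n] = seq[n] $0 }    # join wrapped sequence lines
    END  {
      width = length(seq[1])
      for (c = 1; c <= width; c++) {
        count = 0
        for (i = 1; i <= n; i++) {
          ch = substr(seq[i], c, 1)
          if (ch != "-" && ch != "") count++            # count non-gap characters
        }
        print c "\t" count
      }
    }
  ' "$1"
}
```

Under those assumptions, columns whose count is below the value given to --min-letters are the ones that would be expected to disappear from the output.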

Remove sequences by name:

% alignment-thin --remove=seq1,seq2 file.fasta > file2.fasta

Remove short sequences:

% alignment-thin --longer-than=250 file.fasta > file-long.fasta
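
Before choosing a length threshold, the sequence lengths can be inspected independently of alignment-thin. The sketch below uses a hypothetical helper, fasta_lengths, and assumes that length means the number of non-gap characters and that gaps are written as '-'. Whether alignment-thin counts length in exactly this way is an assumption here.

```shell
# Hypothetical helper: print "name<TAB>ungapped length" for each FASTA record.
# Assumptions: gaps are written as '-', and sequences may wrap over several lines.
fasta_lengths() {
  awk '
    /^>/ { if (seen) print name "\t" len                # flush the previous record
           name = substr($0, 2); len = 0; seen = 1; next }
         { gsub(/[- \t\r]/, ""); len += length($0) }    # drop gaps and whitespace
    END  { if (seen) print name "\t" len }
  ' "$1"
}
```

For example, `fasta_lengths file.fasta | sort -k2,2n` lists the shortest sequences first, which shows roughly how many sequences a given threshold would remove.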

Remove sequences with <= 5 differences from the closest other sequence:

% alignment-thin --cutoff=5 file.fasta > more-than-5-differences.fasta

Like --cutoff, but stop when we have the right number of sequences:

% alignment-thin --down-to=30 file.fasta > file-30taxa.fasta

Remove dissimilar sequences that are missing conserved columns:

% alignment-thin --remove-crazy=10 file.fasta > file2.fasta

Protect some sequences from being removed:

% alignment-thin --down-to=30 file.fasta --protect=seq1,seq2 > file2.fasta
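
The examples above can also be chained through an intermediate file. The following is a sketch only: thin_workflow and the ALIGNMENT_THIN variable are hypothetical names introduced for illustration, the thresholds are copied from the examples above, and the order of the two steps is one possible choice rather than a recommendation.

```shell
# Hypothetical helper: drop sparse columns, then thin to 30 sequences
# while protecting seq1 and seq2. Uses only options shown in the examples above.
# ALIGNMENT_THIN is an assumed override for the command name; it is not an
# alignment-thin feature.
thin_workflow() {
  in=$1
  out=$2
  tool=${ALIGNMENT_THIN:-alignment-thin}
  tmp=$(mktemp) || return 1
  # Step 1: column filtering. Step 2: sequence filtering on the result.
  "$tool" --min-letters=5 "$in" > "$tmp" &&
    "$tool" --down-to=30 --protect=seq1,seq2 "$tmp" > "$out"
  status=$?
  rm -f "$tmp"
  return $status
}
```

A call such as `thin_workflow file.fasta file-thinned.fasta` writes the final alignment to the second argument and returns a non-zero status if either step fails.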

BAli-Phy online help: <http://www.bali-phy.org/docs.php>.

Please send bug reports to <bali-phy-users@googlegroups.com>.

Benjamin Redelings.

Feb 2018