alignment-thin(1)

alignment-thin - Remove sequences or columns from an alignment.

alignment-thin alignment-file [OPTIONS]

Remove sequences or columns from an alignment.

Print usage information.
Output more log messages on stderr.

Sequences that cannot be removed (comma-separated).
Remove sequences not in comma-separated list arg.
Remove sequences in comma-separated list arg.
Remove sequences not longer than arg.
Remove sequences not shorter than arg.
Remove similar sequences whose number of mismatches is < cutoff.
Remove similar sequences down to arg sequences.
Remove arg outlier sequences, defined as sequences that are missing too many conserved sites.
Fraction of sequences that must contain a letter for it to be considered conserved.

Keep columns from this sequence.
Remove columns with fewer than arg letters.
Remove insertions in a single sequence if longer than arg letters.
Remove columns with no characters (all gaps).

Sort partially ordered columns to group similar gaps.
Just print out sequence lengths.
For each sequence, find the closest other sequence.

Remove columns without a minimum number of letters:

% alignment-thin --min-letters=5 file.fasta > file-thinned.fasta
    
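The column filter can be pictured with a minimal Python sketch. It is an illustration only, not the alignment-thin implementation: the function name remove_sparse_columns, the dictionary representation of the alignment, and the choice of "-" and "?" as non-letter characters are all assumptions.

```python
# Illustrative sketch only; not taken from the BAli-Phy source.
GAP_CHARS = set("-?")  # assumption: these characters do not count as letters


def remove_sparse_columns(seqs, min_letters):
    """Keep only columns that contain at least min_letters non-gap characters."""
    length = len(next(iter(seqs.values())))
    keep = [
        col
        for col in range(length)
        if sum(s[col] not in GAP_CHARS for s in seqs.values()) >= min_letters
    ]
    return {name: "".join(s[c] for c in keep) for name, s in seqs.items()}


aln = {"seq1": "AC-GT", "seq2": "A--GT", "seq3": "A-CG-"}
thinned = remove_sparse_columns(aln, 2)
print(thinned)
```

Here the second and third columns each hold a single letter, so they are dropped when the minimum is 2.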

Remove sequences by name:

% alignment-thin --remove=seq1,seq2 file.fasta > file2.fasta
    
% alignment-thin --keep=seq1,seq2   file.fasta > file2.fasta
    

Remove short sequences:

% alignment-thin --longer-than=250 file.fasta > file-long.fasta
    
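As a rough picture of the length filter, the sketch below keeps only sequences that are strictly longer than the threshold. It assumes that sequence length means the number of non-gap characters; the helper names letter_count and keep_longer_than are hypothetical.

```python
# Illustrative sketch only; the length definition is an assumption.
def letter_count(seq, gap="-"):
    """Number of non-gap characters in an aligned sequence."""
    return sum(ch != gap for ch in seq)


def keep_longer_than(seqs, min_len):
    """Drop every sequence that is not longer than min_len letters."""
    return {n: s for n, s in seqs.items() if letter_count(s) > min_len}


aln = {"seq1": "ACGT--", "seq2": "ACGTAC", "seq3": "A-----"}
long_only = keep_longer_than(aln, 4)
print(list(long_only))
```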

Remove similar sequences with <= 5 differences from the closest other sequence:

% alignment-thin --cutoff=5 file.fasta > more-than-5-differences.fasta
    
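One way to think about similarity thinning is sketched below in Python. This is not the tool's algorithm: it assumes a mismatch is a column where both sequences have different letters (gaps are ignored), it uses the strict "< cutoff" comparison from the option description, and it keeps the earlier sequence of each similar pair. The names mismatches and thin_by_cutoff are hypothetical.

```python
# Illustrative sketch only; gap handling and removal order are assumptions.
def mismatches(a, b, gap="-"):
    """Count columns where both sequences have a letter and the letters differ."""
    return sum(x != y for x, y in zip(a, b) if x != gap and y != gap)


def thin_by_cutoff(seqs, cutoff):
    """Keep a sequence only if it has >= cutoff mismatches to every kept one."""
    kept = {}
    for name, seq in seqs.items():
        if all(mismatches(seq, other) >= cutoff for other in kept.values()):
            kept[name] = seq
    return kept


aln = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "TTTTACGT"}
result = thin_by_cutoff(aln, 2)
print(list(result))
```

In this toy alignment seq2 differs from seq1 at one site, so it is dropped at a cutoff of 2, while seq3 differs at three sites and is kept.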

Remove similar sequences until we have the right number of sequences:

% alignment-thin --down-to=30 file.fasta > file-30taxa.fasta
    

Remove dissimilar sequences that are missing conserved columns:

% alignment-thin --remove-crazy=10 file.fasta > file2.fasta
    
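The outlier idea can be sketched as follows, combining the "conserved" fraction with a count of missing conserved sites per sequence. The function remove_outliers, the default fraction of 0.75, and the rule of never removing a sequence that misses no conserved site are assumptions made for illustration, not documented behavior.

```python
# Illustrative sketch only; the default fraction and tie handling are assumptions.
def remove_outliers(seqs, n_remove, conserved=0.75, gap="-"):
    """Remove up to n_remove sequences that miss the most conserved columns."""
    names = list(seqs)
    length = len(seqs[names[0]])
    # A column is treated as conserved if enough sequences have a letter there.
    conserved_cols = [
        c
        for c in range(length)
        if sum(seqs[n][c] != gap for n in names) >= conserved * len(names)
    ]
    # For each sequence, count the conserved columns where it has a gap.
    missing = {n: sum(seqs[n][c] == gap for c in conserved_cols) for n in names}
    ranked = sorted(names, key=lambda n: missing[n], reverse=True)
    worst = {n for n in ranked[:n_remove] if missing[n] > 0}
    return {n: s for n, s in seqs.items() if n not in worst}


aln = {"seq1": "ACGTAC", "seq2": "ACGTAC", "seq3": "ACGTA-", "seq4": "A---AC"}
kept = remove_outliers(aln, 1)
print(list(kept))
```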

Protect some sequences from being removed:

% alignment-thin --down-to=30 file.fasta --protect=seq1,seq2 > file2.fasta
    
% alignment-thin --down-to=30 file.fasta --protect=@filename > file2.fasta
    
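The interaction between thinning down to a target count and protecting sequences can be sketched in Python as below. This is a hedged illustration, not the program's method: it assumes a greedy strategy that repeatedly finds the closest pair and removes one unprotected member, and the names mismatches and thin_down_to are hypothetical.

```python
# Illustrative sketch only; the greedy closest-pair strategy is an assumption.
from itertools import combinations


def mismatches(a, b, gap="-"):
    """Count columns where both sequences have a letter and the letters differ."""
    return sum(x != y for x, y in zip(a, b) if x != gap and y != gap)


def thin_down_to(seqs, target, protect=()):
    """Remove one member of the closest pair until target sequences remain."""
    seqs = dict(seqs)
    protected = set(protect)
    while len(seqs) > target:
        # Only consider pairs with at least one removable sequence.
        pairs = [
            (mismatches(seqs[a], seqs[b]), a, b)
            for a, b in combinations(seqs, 2)
            if not (a in protected and b in protected)
        ]
        if not pairs:
            break  # nothing removable is left
        _, a, b = min(pairs)
        # Remove the second member unless it is protected.
        del seqs[a if b in protected else b]
    return seqs


aln = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTACGA",
    "seq3": "TTTTACGT",
    "seq4": "TTTTACGA",
}
unprotected = thin_down_to(aln, 2)
with_protect = thin_down_to(aln, 2, protect=["seq2"])
print(sorted(unprotected), sorted(with_protect))
```

With seq2 protected, the sketch removes seq1 from the closest pair instead, so the protected name survives even though it would otherwise have been dropped.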

BAli-Phy online help: <http://www.bali-phy.org/docs.php>.

Please send bug reports to <bali-phy-users@googlegroups.com>.

Benjamin Redelings.

Feb 2018