HHFILTER(1) User Commands HHFILTER(1)

hhfilter - filter an alignment by maximum sequence identity of match states and minimum coverage

hhfilter -i infile -o outfile [options]

HHfilter 3.0.0 (15-03-2015)
Filter an alignment by maximum pairwise sequence identity, minimum coverage, minimum sequence identity, or score per column to the first (seed) sequence.
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).

-i <file>   read input file in A3M/A2M or FASTA format
-o <file>   write to output file in A3M format
-a <file>   append to output file in A3M format

-v <int>    verbose mode: 0: no screen output, 1: only warnings, 2: verbose
-id [0,100]   maximum pairwise sequence identity (%) (def=90)
-diff         filter MSA by selecting most diverse set of sequences, keeping at least this many seqs in each MSA block of length 50 (def=0)
-cov [0,100]  minimum coverage with query (%) (def=0)
-qid [0,100]  minimum sequence identity with query (%) (def=0)
-qsc [0,100]  minimum score per column with query (def=-20.0)
-neff         target diversity of alignment (default=off)
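
To make the minimum-coverage and minimum-identity thresholds more concrete, here is a minimal Python sketch that computes both values for one A3M row against the query row. It is an illustration under stated assumptions, not hhfilter's actual implementation: the helper names (match_states, coverage, query_identity) are hypothetical, coverage is assumed to be the percentage of match columns in which the sequence has a residue, and identity is assumed to be counted over the columns where both rows have a residue. hhfilter's exact definitions may differ.

```python
def match_states(row):
    """Keep only the match-state characters of an A3M row.

    Lower-case letters are inserts and '.' pads inserts, so both are dropped;
    upper-case letters (match) and '-' (delete) remain.
    """
    return "".join(c for c in row if not c.islower() and c != ".")


def coverage(query, seq):
    """Percent of match columns in which seq has a residue (assumed definition)."""
    q, s = match_states(query), match_states(seq)
    if len(q) != len(s):
        raise ValueError("rows have different numbers of match columns")
    covered = sum(1 for c in s if c != "-")
    return 100.0 * covered / len(q)


def query_identity(query, seq):
    """Percent identical residues over columns where both rows have a residue
    (assumed definition)."""
    q, s = match_states(query), match_states(seq)
    pairs = [(a, b) for a, b in zip(q, s) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Made-up rows for illustration only: 9 match columns, 2 insert residues.
query = "MKVLAAGIV"
seq = "MRVia-AAGLV"
print(f"coverage = {coverage(query, seq):.1f}%")        # coverage = 88.9%
print(f"identity = {query_identity(query, seq):.1f}%")  # identity = 75.0%
```

Under these assumptions, a sequence would be dropped when its coverage or identity falls below the corresponding minimum.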

-M a2m       use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
-M first     use FASTA: columns with residue in 1st sequence are match states
-M [0,100]   use FASTA: columns with fewer than X% gaps are match states
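
The two FASTA rules above can be sketched in Python as follows. This is a hedged illustration, not the tool's code: match_columns and fasta_to_a3m are hypothetical helper names, and the gap percentage is a plain unweighted count over all rows, which is an assumption (the program may weight sequences differently).

```python
def match_columns(rows, rule="first"):
    """Flag each column of an aligned FASTA block as a match state.

    rule="first":  a column is a match state if the first sequence has a residue.
    rule=<number>: a column is a match state if fewer than that percentage of
                   rows have a gap in it (unweighted count, an assumption).
    """
    ncols = len(rows[0])
    if any(len(r) != ncols for r in rows):
        raise ValueError("aligned FASTA rows must have equal length")
    if rule == "first":
        return [rows[0][j] != "-" for j in range(ncols)]
    threshold = float(rule)
    flags = []
    for j in range(ncols):
        gaps = sum(1 for r in rows if r[j] == "-")
        flags.append(100.0 * gaps / len(rows) < threshold)
    return flags


def fasta_to_a3m(rows, rule="first"):
    """Rewrite aligned FASTA rows in A3M style using the chosen match columns."""
    flags = match_columns(rows, rule)
    out = []
    for r in rows:
        chars = []
        for c, is_match in zip(r, flags):
            if is_match:
                chars.append(c.upper())  # match residue, or '-' for a delete
            elif c != "-":
                chars.append(c.lower())  # residue in an insert column
            # gaps in insert columns are omitted, which A3M permits
        out.append("".join(chars))
    return out


# Made-up alignment for illustration only.
rows = ["MK-LV", "MKALV", "M-A-V"]
print(fasta_to_a3m(rows, "first"))  # ['MKLV', 'MKaLV', 'M-a-V']
print(fasta_to_a3m(rows, 30))       # ['MklV', 'MkalV', 'MaV']
```

With the "first" rule the third column becomes an insert column because the first sequence has a gap there; with a 30% threshold every column containing a gap in one of the three rows (33.3% gaps) is treated as an insert column.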

Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m

February 2019 hhfilter 3.0~beta3+dfsg