hhfilter - filter an alignment by maximum sequence identity of
match states and minimum coverage
hhfilter -i infile -o outfile [options]
HHfilter 3.0.0 (15-03-2015)
Filter an alignment by maximum pairwise sequence identity, minimum
coverage, minimum sequence identity, or score per column to the first
(seed) sequence.
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser
Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast
iterative protein sequence searching by HMM-HMM alignment. Nat. Methods
9:173-175 (2011).
- -i <file>
- read input file in A3M/A2M or FASTA format
- -o <file>
- write to output file in A3M format
- -a <file>
- append to output file in A3M format
- -v <int>
- verbose mode: 0: no screen output, 1: only warnings, 2: verbose
- -id [0,100]
- maximum pairwise sequence identity (%) (def=90)
- -diff [0,inf[
- filter MSA by selecting most diverse set of sequences, keeping at least
this many seqs in each MSA block of length 50 (def=0)
- -cov [0,100]
- minimum coverage with query (%) (def=0)
- -qid [0,100]
- minimum sequence identity with query (%) (def=0)
- -qsc [0,100]
- minimum score per column with query (def=-20.0)
- -neff [1,inf]
- target diversity of alignment (default=off)
- -M a2m
- use A2M/A3M (default): upper case = Match; lower case = Insert; '-' =
Delete; '.' = gaps aligned to inserts (may be omitted)
- -M first
- use FASTA: columns with residue in 1st sequence are match states
- -M [0,100]
- use FASTA: columns with fewer than X% gaps are match states
Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m
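
The two FASTA match-state rules described under -M can be illustrated with a small Python sketch. This is not hhfilter's implementation: the function names match_columns and match_columns_first are hypothetical, gaps are counted per sequence without any sequence weighting (the real tool may weight sequences), and both '-' and '.' are assumed to be gap characters.

```python
def match_columns(seqs, max_gap_percent):
    """Illustrates '-M [0,100]': return the indices of columns whose
    gap percentage is strictly below max_gap_percent.

    Assumption: every sequence counts equally (no sequence weighting).
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    cols = []
    for j in range(length):
        # Count sequences that have a gap character in column j.
        gaps = sum(1 for s in seqs if s[j] in "-.")
        if 100.0 * gaps / len(seqs) < max_gap_percent:
            cols.append(j)
    return cols


def match_columns_first(seqs):
    """Illustrates '-M first': columns with a residue in the first
    sequence are match states."""
    return [j for j, c in enumerate(seqs[0]) if c not in "-."]


if __name__ == "__main__":
    # Toy alignment made up for illustration only.
    msa = ["AC-G", "A--G", "ACTG"]
    print(match_columns(msa, 50))    # columns with fewer than 50% gaps
    print(match_columns_first(msa))  # columns with a residue in sequence 1
```

In this toy alignment the third column is a gap in two of the three sequences, so a 50% threshold treats it as an insert column, while the other columns remain match states.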