hhfilter - filter an alignment by maximum sequence identity of
match states and minimum coverage
hhfilter -i infile -o outfile [options]
HHfilter 3.3.0
Filter an alignment by maximum pairwise sequence identity, minimum coverage,
minimum sequence identity, or score per column to the first (seed) sequence.
(c) The HH-suite development team
Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger S J, and Söding J (2019)
HH-suite3 for fast remote homology detection and deep protein annotation.
BMC Bioinformatics, doi:10.1186/s12859-019-3019-7
- -i <file>
- read input file in A3M/A2M or FASTA format
- -o <file>
- write to output file in A3M format
- -a <file>
- append to output file in A3M format
- -v <int>
- verbose mode: 0: no screen output, 1: only warnings, 2: verbose
- -id [0,100]
- maximum pairwise sequence identity (%) (def=90)
- -diff [0,inf[
- filter MSA by selecting most diverse set of sequences, keeping at least
this many seqs in each MSA block of length 50 (def=0)
- -cov [0,100]
- minimum coverage with query (%) (def=0)
- -qid [0,100]
- minimum sequence identity with query (%) (def=0)
- -qsc [0,100]
- minimum score per column with query (def=-20.0)
- -neff [1,inf]
- target diversity of alignment (default=off)
- -M a2m
- use A2M/A3M (default): upper case = Match; lower case = Insert; '-' =
Delete; '.' = gaps aligned to inserts (may be omitted)
- -M first
- use FASTA: columns with residue in 1st sequence are match states
- -M [0,100]
- use FASTA: columns with fewer than X% gaps are match states
- -maxseq <int>
- max number of input rows (def=65535)
- -maxres <int>
- max number of HMM columns (def=20001)
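The three -M conventions listed above can be illustrated with a minimal Python sketch. This is not HH-suite code: the function names are hypothetical, and the gap-percentage rule here counts every sequence equally, which is an assumption (the tool itself may weight sequences).

```python
from typing import List

GAP_CHARS = "-."


def a3m_match_columns(seq: str) -> str:
    """-M a2m: keep only match-column characters.

    Upper case = match, '-' = delete (both occupy a match column);
    lower case = insert and '.' = gap aligned to an insert (both dropped).
    """
    return "".join(ch for ch in seq if ch.isupper() or ch == "-")


def match_columns_first(rows: List[str]) -> List[bool]:
    """-M first: a column is a match state if the 1st sequence has a residue."""
    return [ch not in GAP_CHARS for ch in rows[0]]


def match_columns_by_gaps(rows: List[str], max_gap_percent: float) -> List[bool]:
    """-M X: a column is a match state if fewer than X% of its rows are gaps.

    Assumption: unweighted counting over equal-length FASTA rows.
    """
    flags = []
    for col in range(len(rows[0])):
        gaps = sum(1 for row in rows if row[col] in GAP_CHARS)
        flags.append(100.0 * gaps / len(rows) < max_gap_percent)
    return flags


rows = ["A--G", "AC-G", "ACTG"]           # toy aligned FASTA rows
print(a3m_match_columns("MKt-Lv.A"))      # MK-LA
print(match_columns_first(rows))          # [True, False, False, True]
print(match_columns_by_gaps(rows, 50))    # [True, True, False, True]
```

In the toy alignment, the second column is a match state under "-M 50" (one gap in three rows) but not under "-M first", because the first sequence has a gap there.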
Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m
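For scripted use, the same call can be assembled from Python. The following is a minimal sketch, assuming hhfilter is installed and on PATH; the helper names build_hhfilter_command and run_hhfilter are hypothetical, and only flags documented above (-i, -o, -id, -cov) are used.

```python
import shutil
import subprocess
from typing import List


def build_hhfilter_command(infile: str, outfile: str,
                           max_id: int = 90, min_cov: int = 0) -> List[str]:
    """Build an hhfilter argument list; defaults mirror the documented ones."""
    return ["hhfilter", "-i", infile, "-o", outfile,
            "-id", str(max_id), "-cov", str(min_cov)]


def run_hhfilter(cmd: List[str]) -> None:
    """Run the command, failing early if the executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("hhfilter not found on PATH")
    subprocess.run(cmd, check=True)


# Same settings as the example above; this only prints the command line.
cmd = build_hhfilter_command("d1mvfd_.a2m", "d1mvfd_.fil.a2m", max_id=50)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names; call run_hhfilter(cmd) to execute the filter.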