NAME

hhfilter - filter an alignment by maximum sequence identity of match states and minimum coverage

SYNOPSIS

hhfilter -i infile -o outfile [options]

DESCRIPTION

HHfilter 3.0.0 (15-03-2015)

Filter an alignment by maximum pairwise sequence identity, minimum coverage, minimum sequence identity, or score per column to the first (seed) sequence.

(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).

-i <file>: read input file in A3M/A2M or FASTA format
-o <file>: write to output file in A3M format
-a <file>: append to output file in A3M format

OPTIONS

-v <int>: verbose mode: 0: no screen output, 1: only warnings, 2: verbose
-id [0,100]: maximum pairwise sequence identity (%) (def=90)
-diff [0,inf[: filter MSA by selecting the most diverse set of sequences, keeping at least this many seqs in each MSA block of length 50 (def=0)
-cov [0,100]: minimum coverage with query (%) (def=0)
-qid [0,100]: minimum sequence identity with query (%) (def=0)
-qsc [0,100]: minimum score per column with query (def=-20.0)
-neff [1,inf]: target diversity of alignment (default=off)
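
To make the quantities behind -cov, -qid and -id more concrete, here is a minimal Python sketch of how coverage and sequence identity over match states could be computed for two A3M rows. It is an illustration under stated assumptions, not hhfilter's actual implementation: the function names are hypothetical, and the exact denominators hhfilter uses may differ.

```python
def match_columns(row: str) -> str:
    """Drop insert states (lower case letters and '.') from an A3M/A2M row,
    leaving only match-state columns (upper case residues and '-')."""
    return "".join(c for c in row if not (c.islower() or c == "."))


def coverage_percent(query: str, seq: str) -> float:
    """Percent of query match residues that are aligned to a residue in seq.
    Assumption: coverage is measured relative to the query's residues."""
    q, s = match_columns(query), match_columns(seq)
    query_cols = [i for i, c in enumerate(q) if c != "-"]
    covered = sum(1 for i in query_cols if s[i] != "-")
    return 100.0 * covered / len(query_cols)


def identity_percent(a: str, b: str) -> float:
    """Percent identical residues over match columns where both rows
    carry a residue. Assumption: gap columns are ignored entirely."""
    x, y = match_columns(a), match_columns(b)
    both = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not both:
        return 0.0
    return 100.0 * sum(p == q for p, q in both) / len(both)


query = "ACDEFGHIKL"
seq = "ACDQ--HIwKL"   # 'w' is an insert; '--' are deletions
print(coverage_percent(query, seq))   # share of query residues covered
print(identity_percent(query, seq))   # identity over jointly aligned columns
```

Under these assumptions, a sequence would pass `-cov 50 -qid 30` if both printed values meet the respective thresholds.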

Input alignment format:

-M a2m: use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
-M first: use FASTA: columns with residue in 1st sequence are match states
-M [0,100]: use FASTA: columns with fewer than X% gaps are match states
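
The two FASTA rules above can be sketched in Python as follows. This is a hedged illustration of the column-assignment rule as described in this page, with hypothetical function names; it assumes equal-length aligned rows that use '-' as the only gap character, and it is not taken from the hhfilter source.

```python
def match_state_columns(rows, mode):
    """Return the indices of alignment columns treated as match states.

    rows: equal-length aligned FASTA sequences, '-' marks a gap.
    mode: "first" (residue in 1st sequence) or a number X in [0,100]
          (columns with fewer than X% gaps).
    """
    ncols = len(rows[0])
    if mode == "first":
        return [j for j in range(ncols) if rows[0][j] != "-"]
    threshold = float(mode)
    cols = []
    for j in range(ncols):
        gaps = sum(1 for r in rows if r[j] == "-")
        if 100.0 * gaps / len(rows) < threshold:
            cols.append(j)
    return cols


def to_a3m_row(row, match_cols):
    """Rewrite one FASTA row in A3M style: match columns keep upper case
    residues or '-', residues in other columns become lower case inserts,
    and gaps in non-match columns are dropped."""
    keep = set(match_cols)
    out = []
    for j, c in enumerate(row):
        if j in keep:
            out.append(c.upper())
        elif c != "-":
            out.append(c.lower())
    return "".join(out)


rows = ["AC-DE", "ACG-E", "A-GDE", "--GDE"]
cols = match_state_columns(rows, 50)
print(cols)
print([to_a3m_row(r, cols) for r in rows])
```

With `mode="first"` the match columns follow the first sequence only, whereas a percentage threshold lets the whole alignment decide which columns count.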

Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m
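
When hhfilter is called from a script, the same example could be wrapped as below. This is only a sketch: the helper names are hypothetical, it uses just the -i, -o, -id and -cov options documented above, and it assumes an `hhfilter` executable is available on PATH.

```python
import shutil
import subprocess


def hhfilter_command(infile, outfile, max_id=90, min_cov=0):
    """Build the argument list for an hhfilter call (hypothetical helper).
    Defaults mirror the documented defaults of -id and -cov."""
    return [
        "hhfilter",
        "-i", infile,
        "-o", outfile,
        "-id", str(max_id),
        "-cov", str(min_cov),
    ]


def run_hhfilter(infile, outfile, **kwargs):
    """Run hhfilter and raise if it is missing or exits with an error."""
    if shutil.which("hhfilter") is None:
        raise FileNotFoundError("hhfilter not found on PATH")
    subprocess.run(hhfilter_command(infile, outfile, **kwargs), check=True)


# Equivalent of the example above, printed rather than executed:
print(" ".join(hhfilter_command("d1mvfd_.a2m", "d1mvfd_.fil.a2m", max_id=50)))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.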

February 2019

hhfilter 3.0~beta3+dfsg