PFSCAN(1) General Commands Manual PFSCAN(1)

pfscan - scan a protein or DNA sequence with a profile library

pfscan [ -abflLrsuxyz ] [ seq-file | - ]
[ profile-library-file | - ] [L=#] [W=#]

pfscan compares a protein or nucleic acid sequence against a profile library. The result is an unsorted list of profile-sequence matches written to standard output. The output format, and the information it contains, is selected with the options -a, -l, -L, -r, -s, -u, -x, -y and -z. seq-file contains a sequence in EMBL/SWISS-PROT format (the default) or in Pearson/Fasta format (selected with option -f). profile-library-file contains a library of profiles in PROSITE format. pfscan can be used as a filter: if - is given in place of one of the input filenames, that input is read from standard input.
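The filter usage described above can be driven from a script. The following is a minimal sketch, assuming pfscan is installed and on the PATH; the helper names build_pfscan_argv and scan_text are hypothetical, and only flags and parameters named in this page are used. Rejecting - for both inputs is this sketch's reading of "one of the input filenames", not documented behavior.

```python
import subprocess
from typing import List, Optional


def build_pfscan_argv(seq_file: str, profile_file: str, flags: str = "",
                      level: Optional[int] = None,
                      width: Optional[int] = None) -> List[str]:
    """Assemble a pfscan command line in the order given by the synopsis."""
    if seq_file == "-" and profile_file == "-":
        # Assumption: only one of the two inputs can come from standard input.
        raise ValueError("only one input may be '-'")
    argv = ["pfscan"]
    if flags:
        argv.append("-" + flags)      # e.g. "by" becomes "-by"
    argv += [seq_file, profile_file]
    if level is not None:
        argv.append(f"L={level}")     # cut-off level parameter
    if width is not None:
        argv.append(f"W={width}")     # output width parameter
    return argv


def scan_text(sequence_text: str, profile_file: str, flags: str = "") -> str:
    """Feed a sequence on standard input and return pfscan's standard output."""
    argv = build_pfscan_argv("-", profile_file, flags)
    done = subprocess.run(argv, input=sequence_text, capture_output=True,
                          text=True, check=True)
    return done.stdout
```

Building the argument list separately from running the command keeps the construction step checkable on machines where pfscan is not installed.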

-a  Report optimal alignment scores for all profiles regardless of the cut-off value. This option also forces DISJOINT=UNIQUE.
-b  Search the complementary strand of the DNA sequence as well.
-f  Input sequence is in Pearson/Fasta format.
-l  Indicate the highest cut-off level exceeded by the match score in the output list.
-L  Indicate by character string the highest cut-off level exceeded by the match score in the output list. The generalized profile format includes a text string field that names a cut-off level. The -L option causes the program to display the first two characters of this text string (usually something like "!", "?", "??", etc.) at the beginning of each match description.
-r  Use raw scores rather than normalized scores for match selection. Normalized scores will not be listed in the output.
-s  List the sequences of the matched regions as well. The output will be a Pearson/Fasta-formatted sequence library.
-u  Force DISJOINT=UNIQUE.
-x  List profile-sequence alignments in pftools PSA format.
-y  Display alignments between the profile and the matched sequence regions in a human-friendly format.
-z  Indicate the starting and ending positions of the matched profile range. The ending position is given as a negative offset from the end of the profile. Thus the range [1, -1] denotes the entire profile.
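The matched profile range reported with -z uses a negative offset for its ending position. A minimal sketch of converting such a range to absolute 1-based positions follows, assuming the profile length is known from elsewhere; the function name profile_range_to_absolute is hypothetical, and the arithmetic is derived only from the statement that [1, -1] covers the entire profile.

```python
from typing import Tuple


def profile_range_to_absolute(start: int, end_offset: int,
                              profile_length: int) -> Tuple[int, int]:
    """Convert a (start, negative end offset) pair to 1-based positions."""
    if end_offset >= 0:
        raise ValueError("the ending position must be a negative offset")
    # An offset of -1 is the last profile position, -2 the one before it.
    end = profile_length + 1 + end_offset
    if not 1 <= start <= end:
        raise ValueError("range falls outside the profile")
    return start, end
```

For a profile of length 100, this maps the range [1, -1] to positions 1 through 100.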

L=#  Cut-off level to be used for match selection. If level L is not specified in the profile, the next higher (if L is negative) or next lower (if L is positive) level specified is used instead.
W=#  Output width. Output lines are truncated after W characters. Default: W=132.
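The fallback rule for the L parameter and the truncation applied by W can be sketched as follows. This is an illustration of the rules as worded above, not pfscan's own code; the function names are hypothetical. Returning None when no suitable level exists, or when an undefined level 0 is requested, is an assumption of this sketch, since those cases are not covered here.

```python
from typing import Iterable, Optional


def resolve_cutoff_level(requested: int,
                         defined: Iterable[int]) -> Optional[int]:
    """Pick the cut-off level to use when the requested one may be missing."""
    levels = sorted(set(defined))
    if requested in levels:
        return requested
    if requested < 0:
        # Negative request: fall back to the next higher defined level.
        higher = [lv for lv in levels if lv > requested]
        return min(higher) if higher else None
    if requested > 0:
        # Positive request: fall back to the next lower defined level.
        lower = [lv for lv in levels if lv < requested]
        return max(lower) if lower else None
    return None  # Assumption: an undefined level 0 has no stated fallback.


def truncate_line(line: str, width: int = 132) -> str:
    """Cut an output line after `width` characters (default W=132)."""
    return line[:width]
```

With levels -1, 0 and 1 defined, a request for L=2 resolves to level 1 and a request for L=-2 resolves to level -1.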

(1)
pfscan -s GTPA_HUMAN prosite13.prf

Scans the human GAP protein for matches to profiles in PROSITE release 13. GTPA_HUMAN contains the SWISS-PROT entry P20936|GTPA_HUMAN. prosite13.prf contains all profile entries of PROSITE release 13. The output is a Pearson/Fasta-formatted sequence library containing all sequence regions of the input sequence matching a profile in the profile library.
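Because the output of example (1) is a Pearson/Fasta-formatted sequence library, it can be split into records with a generic Fasta reader. The sketch below assumes only the general Fasta convention that a record starts with a ">" header line; the layout of pfscan's header lines is not described here, so headers are returned unparsed. The function name iter_fasta is hypothetical.

```python
from typing import Iterator, List, Optional, Tuple


def iter_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from Fasta-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]          # keep the header text as-is
            chunks = []
        elif header is not None and line:
            chunks.append(line)        # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)
```

Each yielded pair corresponds to one matched region; any further interpretation of the header text is left to the caller.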

(2)
pfscan -by CVPBR322 ecp.prf L=2

Scans both strands of plasmid pBR322 for high-scoring (level 2) E. coli promoter matches. CVPBR322 contains EMBL entry J01749|CVPBR322. ecp.prf contains a profile for E. coli promoters. The output includes profile-sequence alignments in a human-friendly format.
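Searching both strands means the complementary strand is examined in addition to the sequence as given. As a conceptual illustration only, the complementary strand of a DNA sequence is conventionally written as its reverse complement; how pfscan handles this internally is not described here. The function name reverse_complement is hypothetical, and characters other than A, C, G and T are left unchanged.

```python
# Translation table for the four standard bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement each base, then reverse to read the other strand 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]
```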

Philipp Bucher
Philipp.Bucher@isrec.unil.ch

June 1999 pftools 2.2