PatMaN - search for approximate patterns in DNA libraries
patman [ option | file ... ]
PatMaN searches for (small) patterns in (huge) DNA
    databases, allowing for some mismatches and optionally gaps.
    Patterns and databases are read from one or more
    fasta(5) files listed as non-option arguments, depending on whether
    the -D or -P option last preceded them, and matched against
    each other. The output of PatMaN is a table containing one line for
    each match, consisting of tab-separated fields:
  - name of database sequence,
- name of pattern,
- position of first matched base in database sequence (the first base of
      a sequence has position 1),
- position of last matched base in database sequence,
- strand (+ for literal match, - for reverse complement),
- edit distance (number of mismatches plus number of gaps).
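
For downstream scripts, the six fields listed above can be read into a typed record. The following Python sketch is not part of PatMaN; the names Match and parse_match are hypothetical, and the sample line in the usage note is made up for illustration.

```python
from typing import NamedTuple


class Match(NamedTuple):
    database: str  # name of database sequence
    pattern: str   # name of pattern
    start: int     # position of first matched base (1-based)
    end: int       # position of last matched base
    strand: str    # "+" for literal match, "-" for reverse complement
    edits: int     # number of mismatches plus number of gaps


def parse_match(line: str) -> Match:
    """Split one tab-separated output line into typed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    database, pattern, start, end, strand, edits = fields
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand {strand!r}")
    return Match(database, pattern, int(start), int(end), strand, int(edits))
```

For example, parse_match("chr1\tpat7\t101\t122\t-\t1\n") yields a record whose start is 101 and whose strand is "-".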
    
  
  - -V, --version
- Print version number and exit.
    
  
- -e num, --edits num
- Allow up to num mismatches and/or gaps per match.
    
  
- -g num, --gaps num
- Allow up to num gaps per match. Note that gaps count as mismatches,
      too, so the -e option should always be set at least as high as the
      -g option. Allowing many gaps can incur a considerable
      computational cost.
    
  
- -D, --databases
- Treat the following files as database. Databases must be in
      fasta(5) format. Multiple database files, including
      "-" for standard input, are allowed and are read in turn.
    
  
- -P, --patterns
- Treat the following files as patterns. Pattern files must be in
      fasta(5) format. Multiple pattern files, including
      "-" for standard input, are allowed and are all read before
      touching the databases.
    
  
- -o file, --output file
- Redirect output to file. The file name "-" causes output
      to be written to stdout, which is also the default.
    
  
- -a, --ambicodes
- Activate the interpretation of ambiguity codes in patterns. This results
      in the expansion of any pattern with ambiguity codes into multiple
      patterns which can match independently. Compare Unknown Nucleotides
      below.
    
  
- -s, --singlestrand
- Deactivate matching of reverse complements. Normally, PatMaN will
      try to match patterns both literally and after reverse-complementing them.
      With this option set, only literal (forward-strand) matches are considered.
    
  
- -p num, --prefetch num
- Causes num pointers to be prefetched in advance. This feature can
      improve performance, if PatMaN has been compiled for a processor
      architecture that supports prefetching. The optimum value for your
      particular setup has to be determined empirically, but the default should
      be reasonably good.
    
  
- -l len, --min-length len
- Only consider patterns with a length of at least len. Use this if
      your pattern collection contains short sequences that you don't
      want lots of possible matches reported for.
    
  
- -x num, --chop3 num
- Cut off num bases from the 3' end of each pattern. Use this
      for patterns with damaged, edited, etc. 3' ends that should be
      ignored. The chopped bases are neither matched nor included in the
      reported match regions.
    
  
- -X num, --chop5 num
- Cut off num bases from the 5' end of each pattern. Use this
      for patterns with damaged, edited, etc. 5' ends that should be
      ignored. The chopped bases are neither matched nor included in the
      reported match regions.
    
  
- -A, --adenine-hack
- Allow adenine to be ignored in patterns. This is essentially equivalent to
      not counting gaps in the database, as long as it was an A that was
      gapped. Using -A can be computationally extremely expensive, both
      in terms of memory and time consumed.
    
  
- -q, --quiet
- Suppress warnings (about unrecognized characters in input sequences or
      missing input files). Even without -q, at most one such warning is
      given per run.
    
  
- -v, --verbose
- Print additional progress information to stderr.
    
  
- -d flags, --debug flags
- Set debugging flags to flags. Flags may be the bitwise
      OR of any of the following values, each of which causes some output
      to appear on stderr. Some of the values may only work if
      PatMaN has been compiled in debug mode. The default value is 1.
    
  
- 1
- Print warnings. Equivalent to not setting -q.
    
  
- 2
- Print progress information. Equivalent to setting -v.
    
  
- 4
- Dump the suffix trie of the patterns. Only available in debug
      build.
    
  
- 8
- Count number of visited nodes and print that number in each iteration.
      Only available in debug build.
    
  
- 16
- Print total number of nodes fetched from memory after completing all
      databases.
    
  
- 32
- Output database sequence while it is being matched.
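
Because the debug values above are powers of two, they combine by bitwise OR into a single number. The Python sketch below shows the arithmetic. The flag names and the helper debug_argument are hypothetical, and passing the combined value to -d as a decimal integer is an assumption.

```python
from enum import IntFlag


class DebugFlag(IntFlag):
    # Names are illustrative; only the numeric values come from the list above.
    WARNINGS = 1        # print warnings (same as not setting -q)
    PROGRESS = 2        # print progress information (same as -v)
    DUMP_TRIE = 4       # dump the suffix trie (debug build only)
    COUNT_NODES = 8     # count visited nodes per iteration (debug build only)
    TOTAL_FETCHED = 16  # print total number of nodes fetched
    ECHO_DATABASE = 32  # output database sequence while matching


def debug_argument(flags: DebugFlag) -> list[str]:
    """Build the argument pair for -d, assuming a decimal value is accepted."""
    return ["-d", str(int(flags))]
```

With these definitions, debug_argument(DebugFlag.WARNINGS | DebugFlag.PROGRESS) gives ["-d", "3"], which by the descriptions above should behave like -v without -q.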
    
  
Non-option arguments (bare filenames) are either treated as
    database or pattern files, depending on whether the -D
    or -P option was the last that occurred before the filename. If
    neither -D nor -P was given, file names are treated as
    pattern files. If no database was given, it is instead read
    from standard input. Standard input can be explicitly given as either a
    database or a pattern file by using the filename
    "-". A warning is given if standard input is selected implicitly
    as database; an error message is given if no pattern files
    have been named at all.
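
The rule for bare filenames can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not PatMaN's actual argument parser: the function classify_files is hypothetical and, for brevity, it only understands -D, -P and bare filenames.

```python
def classify_files(args):
    """Sort bare filenames into databases and patterns.

    A file belongs to whichever of -D / -P was seen last; before either
    option appears, files are treated as pattern files.
    """
    mode = "pattern"
    databases, patterns = [], []
    for arg in args:
        if arg in ("-D", "--databases"):
            mode = "database"
        elif arg in ("-P", "--patterns"):
            mode = "pattern"
        elif mode == "database":
            databases.append(arg)
        else:
            patterns.append(arg)
    if not patterns:
        # PatMaN reports an error in this case.
        raise ValueError("no pattern files named")
    if not databases:
        # Implicit standard input; PatMaN warns about this.
        databases = ["-"]
    return databases, patterns
```

For instance, the arguments -D genome.fa -P a.fa b.fa yield one database and two pattern files, while the single argument a.fa yields standard input as the database and a.fa as the only pattern file. The file names are placeholders.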
Allowing gaps often causes overlapping matches of single
    patterns at almost the same position. PatMaN makes no attempt
    to filter these redundant matches. Also note that allowing many gaps, and
    especially allowing an arbitrary number of gaps through the -A hack,
    can slow down PatMaN considerably and cause it to produce enormous
    amounts of output. The use of some sort of post-processor to filter these
    is highly recommended.
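
A post-processor of the kind recommended above could be as simple as the following Python sketch. It is not part of PatMaN. The function name filter_overlaps and the clustering rule (among matches of the same pattern that overlap on the same strand of the same database sequence, keep the one with the lowest edit distance) are assumptions chosen for illustration.

```python
from itertools import groupby


def filter_overlaps(matches):
    """Collapse overlapping matches, keeping the best one per cluster.

    matches: iterable of (database, pattern, start, end, strand, edits)
    tuples, in the field order of PatMaN's output table.
    """
    def group_key(m):
        return (m[0], m[1], m[4])  # database, pattern, strand

    ordered = sorted(matches, key=lambda m: (group_key(m), m[2], m[3]))
    kept = []
    for _, group in groupby(ordered, key=group_key):
        best = None
        cluster_end = 0
        for m in group:
            if best is not None and m[2] <= cluster_end:
                # Overlaps the current cluster: extend it, keep fewer edits.
                cluster_end = max(cluster_end, m[3])
                if m[5] < best[5]:
                    best = m
            else:
                # Start a new cluster and flush the previous one.
                if best is not None:
                    kept.append(best)
                best = m
                cluster_end = m[3]
        if best is not None:
            kept.append(best)
    return kept
```

Ties in edit distance are resolved in favour of the leftmost match. Other policies, such as keeping the longest match, are equally reasonable.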
Unknown nucleotides are most often encoded by the letter N.
    If the --ambicodes option is not given, Ns in patterns are
    interpreted as unknown nucleotides and can never match without penalty. If
    --ambicodes is given, Ns in patterns are expanded just like
    the other ambiguity codes, and effectively work as wildcards. Unknown
    nucleotides can still be encoded by an X and will never match
    anything. The database is treated differently in that anything other than
    A, C, G, T and U, including ambiguity
    codes, is treated as unknown and can never match without penalty.
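
To make the expansion performed under --ambicodes concrete, the Python sketch below expands a pattern into every plain sequence it stands for. It uses the standard IUPAC nucleotide codes, which PatMaN is assumed (not confirmed by this page) to follow. The function name expand_ambiguity is hypothetical, and PatMaN's internal representation may differ.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_ambiguity(pattern):
    """Return all concrete sequences encoded by an ambiguous pattern.

    Characters without an entry in the table (such as X) are kept as they
    are, so they remain unknown nucleotides that match nothing.
    """
    choices = [IUPAC.get(base, base) for base in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

A pattern such as ARN expands into eight sequences (two choices for R times four for N), which shows why patterns with many ambiguity codes can multiply the work considerably.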
/etc/popt
The system-wide configuration file for popt(3).
  
PatMaN identifies itself as "patman" to popt.
~/.popt
Per-user configuration file for popt(3).
Kay Pruefer <pruefer@eva.mpg.de>
  
  Udo Stenzel <udo_stenzel@eva.mpg.de>