fst-infl, fst-infl2, fst-infl3 - morphological analysers
fst-infl [ options ] file [ input-file [ output-file ] ]
fst-infl2 [ options ] file [ input-file [ output-file ] ]
fst-infl3 [ options ] file [ input-file [ output-file ] ]
- -t file
- Read an alternative transducer from file and use it if the main
transducer fails to find an analysis. By iterating this option, a cascade
of transducers may be tried to find an analysis.
- -b
- Print surface and analysis symbols. (fst-infl2 only)
- -n
- Print multi-character symbols without the enclosing angle brackets.
(fst-infl only)
- -d
- Disambiguate the analyses symbolically by returning only those with a
minimal number of morphemes. This option requires that morpheme
boundaries are marked with the tag <X>. If no <X> tag is found
in the analysis string, the program instead counts (roughly) the
multi-character symbols consisting entirely of upper-case characters and
uses this count for disambiguation. This fallback heuristic was developed
for the German SMOR morphology. (fst-infl2 and fst-infl3 only)
- -e n
- If no regular analysis is found, do robust matching and print analyses
with up to n edit errors. The set of edit operations currently
includes replacement, insertion and deletion. Each operation currently
has a fixed error weight of 1. (fst-infl2 only)
- -% f
- Disambiguates the analyses statistically and prints the most likely
analyses with at least f % of the total probability mass of the analyses.
The transducer weights are read from a file obtained by appending
.prob to the name of the transducer file. The weight files are
created with fst-train. (fst-infl2 only)
- -p
- Print the probability of each analysis. (fst-infl2 only)
- -c
- Use this option if the transducer was compiled on a computer with a
different endianness. For example, a transducer which was compiled on a
Sparc computer needs this option in order to be used on a Pentium.
(fst-infl2 only)
- -q
- Suppress status messages.
- -h
- Print usage information.
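The two disambiguation options, -d and -% f, can be illustrated with a small Python sketch. It is not the code used by fst-infl2 or fst-infl3; it only mirrors the descriptions above. Two points are assumptions: the upper-case heuristic is approximated by counting symbols that consist solely of upper-case letters, and "at least f % of the total probability mass" is read as a cumulative threshold over the analyses sorted by probability. The function names and the example strings in the usage note are made up for illustration.

```python
import re

# Matches multi-character symbols such as <X>, <NN> or <Pl>.
SYMBOL = re.compile(r"<([^<>]+)>")


def morpheme_score(analysis):
    """Return the count used to rank an analysis (lower is better)."""
    symbols = SYMBOL.findall(analysis)
    boundaries = symbols.count("X")
    if boundaries:
        return boundaries
    # Fallback (an approximation of the heuristic described for -d):
    # count symbols made up only of upper-case letters.
    return sum(1 for s in symbols if s.isalpha() and s.isupper())


def disambiguate_symbolic(analyses):
    """Sketch of -d: keep only the analyses with the minimal score."""
    if not analyses:
        return []
    scores = [morpheme_score(a) for a in analyses]
    best = min(scores)
    return [a for a, s in zip(analyses, scores) if s == best]


def disambiguate_statistical(scored, f):
    """Sketch of -% f for a list of (analysis, probability) pairs.

    Assumption: analyses are taken in order of decreasing probability
    until they cover at least f percent of the total probability mass.
    """
    total = sum(p for _, p in scored)
    if total <= 0:
        return []
    kept, mass = [], 0.0
    for analysis, p in sorted(scored, key=lambda pair: pair[1], reverse=True):
        kept.append((analysis, p))
        mass += p
        if mass * 100 >= f * total:
            break
    return kept
```

For example, disambiguate_symbolic(["ab<X>cd<X><NN>", "abcd<X><NN>"]) keeps only the second string, because it contains one <X> boundary instead of two.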
fst-infl is a morphological analyser. The first argument is
the name of a file which was generated by fst-compiler. The second
argument is the name of the input file. The third argument is the output
file. If the third argument is missing, output is directed to stdout.
If the second argument is missing as well, input is read from
stdin.
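The argument conventions just described can be sketched as a small Python wrapper. This is only an illustration under stated assumptions: the helper names build_command and analyse are hypothetical, fst-infl is assumed to be installed and on the PATH, and the transducer file name in the usage note is a placeholder. The only option used is -q, which is documented above.

```python
import subprocess


def build_command(program, transducer, input_file=None, output_file=None,
                  options=()):
    """Build an argument list: program [options] file [input [output]]."""
    if output_file is not None and input_file is None:
        # The output file is the third argument, so it needs a second one.
        raise ValueError("an output file requires an input file")
    cmd = [program, *options, transducer]
    if input_file is not None:
        cmd.append(input_file)
        if output_file is not None:
            cmd.append(output_file)
    return cmd


def analyse(words, transducer, program="fst-infl"):
    """Send one word per line on stdin and return the text written to stdout.

    With both file arguments omitted, the analyser reads stdin and
    writes stdout, as described above.
    """
    result = subprocess.run(
        build_command(program, transducer, options=["-q"]),
        input="\n".join(words) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

A call such as analyse(["Haus"], "morph.a") would run fst-infl -q morph.a with the word on stdin; morph.a stands for any file generated by fst-compiler.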
fst-infl2 is similar to fst-infl but needs a
transducer in compact format (see the man pages for fst-compiler and
fst-compact). fst-infl2 is implemented differently from fst-infl and is
usually much faster.
fst-infl3 is also similar to fst-infl but needs a
transducer in lowmem format (see the man pages for fst-compiler and
fst-lowmem). fst-infl3 accesses the transducer on disc rather than
reading it into memory. It starts very fast and needs very little memory,
but is slower than fst-infl2.
fst-infl reads the transducer which is stored in the
argument file. Then it reads the input file line by line. Each line is
analysed with the transducer and all resulting analyses are printed (see
also the man pages for fst-mor).
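The line-by-line processing, together with the fallback cascade of the -t option, can be sketched in Python as follows. The transducers are modelled as plain functions from a word to a list of analysis strings; this is a hypothetical stand-in, not the interface of the actual programs, and the name analyse_lines is made up for illustration.

```python
def analyse_lines(lines, main, alternatives=()):
    """Yield (word, analyses) for each input line.

    main and each entry of alternatives are callables that map a word
    to a list of analyses (hypothetical stand-ins for loaded transducers).
    The alternatives are tried in order, and only if no analysis has
    been found so far, mirroring the cascade described for -t.
    """
    for raw in lines:
        word = raw.rstrip("\n")
        analyses = main(word)
        for alternative in alternatives:
            if analyses:
                break
            analyses = alternative(word)
        yield word, analyses
```

Because analyse_lines is a generator, it can be fed an open file or sys.stdin directly and produces results one line at a time.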
No bugs are known so far.
Helmut Schmid, Institute for Computational Linguistics, University
of Stuttgart. Email: schmid@ims.uni-stuttgart.de. This software is available
under the GNU Public License.