TRIETOOL(1)                General Commands Manual                TRIETOOL(1)
trietool - trie manipulation tool
trietool [ options ] trie command arg ...
trietool is the command-line tool for manipulating double-array trie data. It can be used to query, add and remove words in a trie.
The trie argument specifies the name of the trie to manipulate. A trie is stored in a file with a `.tri' extension. To create a new trie, however, one must first prepare a file with an `.abm' extension that describes the Unicode ranges making up the trie's alphabet set. The ABM defines a set of vectors that map Unicode characters into a continuous range of integers, and the mapped integers are used as the internal alphabet of the trie. Because the mapped range is always continuous, this mapping can improve space allocation within the trie data even when the character set itself is non-continuous.
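The mapping idea can be illustrated with a short Python sketch. It is not libdatrie's actual implementation: the function name `build_alphabet_map` is hypothetical, and numbering the mapped values from 0 is an assumption made only for illustration. It shows how two disjoint Unicode ranges collapse into one gap-free range of integers.

```python
def build_alphabet_map(ranges):
    """Map every code point in the inclusive (start, end) ranges
    to consecutive integers, in ascending code point order.

    Hypothetical helper for illustration; numbering from 0 is an
    assumption, not necessarily what libdatrie does internally.
    """
    mapping = {}
    for start, end in sorted(ranges):
        for code_point in range(start, end + 1):
            # Overlapping ranges must not allocate a second index.
            if code_point not in mapping:
                mapping[code_point] = len(mapping)
    return mapping


# A-Z and a-z are separated by six unrelated ASCII characters,
# yet the mapped values form one continuous run of 52 integers.
alpha = build_alphabet_map([(0x0041, 0x005A), (0x0061, 0x007A)])
print(alpha[ord("A")], alpha[ord("Z")], alpha[ord("a")], alpha[ord("z")])
# prints: 0 25 26 51
```

With this mapping, a node in the trie only needs room for 52 possible transitions rather than for every code point between `A' and `z'.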
The ABM file is a plain text file, with each line listing a range of 32-bit Unicode code points to be added to the alphabet set, in the format:
    [0xSSSS,0xTTTT]
where `0xSSSS' and `0xTTTT' are hexadecimal values of starting and ending character code for the range, respectively.
For example, for a dictionary that contains only English words without any punctuation, one may prepare `trie.abm' as:
    [0x0041,0x005a]
    [0x0061,0x007a]
The first line lists the ASCII codes for A-Z, and the second for a-z.
No more than 255 alphabet characters are allowed in a trie.
The created `.tri' file incorporates the ABM data, so the `.abm' file is needed only when the trie is first created. After that it is ignored.
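As a way to sanity-check an ABM file before handing it to trietool, the following Python sketch parses the range lines and counts the resulting alphabet. It assumes each non-empty line has the form `[0xSSSS,0xTTTT]'; the bracket syntax, the names `parse_abm', `alphabet_size' and `check_alphabet', and the tolerance for blank lines are all assumptions of this sketch rather than documented behaviour of trietool.

```python
import re

# Assumed line syntax: [0xSSSS,0xTTTT] with hexadecimal bounds.
RANGE_LINE = re.compile(r"^\[\s*(0x[0-9a-fA-F]+)\s*,\s*(0x[0-9a-fA-F]+)\s*\]$")
MAX_ALPHABET = 255  # limit stated in this manual page


def parse_abm(text):
    """Return the list of inclusive (start, end) ranges found in text."""
    ranges = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue  # skipping blank lines is a choice of this sketch
        match = RANGE_LINE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: not a range: {line!r}")
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        if start > end:
            raise ValueError(f"line {lineno}: start exceeds end")
        ranges.append((start, end))
    return ranges


def alphabet_size(ranges):
    """Count distinct code points covered by the ranges."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return len(covered)


def check_alphabet(ranges):
    """Raise ValueError if the alphabet exceeds the 255-character limit."""
    size = alphabet_size(ranges)
    if size > MAX_ALPHABET:
        raise ValueError(f"alphabet has {size} characters, limit is {MAX_ALPHABET}")
    return size


english = parse_abm("[0x0041,0x005a]\n[0x0061,0x007a]\n")
print(check_alphabet(english))  # prints: 52
```

Counting distinct code points rather than summing range lengths keeps overlapping ranges from being counted twice.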
Available commands are:
This program follows the usual GNU command line syntax, with long options starting with two dashes (`--'). A summary of options is included below.
libdatrie was written by Theppitak Karoonboonyanan.
This manual page was written by Theppitak Karoonboonyanan <theppitak@gmail.com>.
DECEMBER 2008