TRIEHASH(1) triehash TRIEHASH(1)

triehash - Generate a perfect hash function derived from a trie.

triehash [option] [input file]

triehash takes a list of words from the input file and generates a function and an enumeration that describe the words.

The file consists of multiple lines of the form:

    [label ~ ] word [= value]

This maps word to value, and generates an enumeration with entries of the form:

    label = value

If label is undefined, the word is used as the label, with each minus character replaced by an underscore. If value is undefined, it is counted upwards from the last value.
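The labelling and counting rules above can be sketched in C. This is a minimal illustration, not triehash's own code: the struct entry type and both function names are hypothetical, the starting value is passed in by the caller because the text above does not state it, and the sketch assumes that the minus replacement applies only to labels derived from the word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one input line; not part of triehash. */
struct entry {
    const char *label;  /* NULL when the line has no "label ~" part */
    const char *word;
    int has_value;      /* nonzero when the line has an "= value" part */
    long value;
};

/* Derive the enumerator name: the explicit label if there is one,
 * otherwise the word with every '-' replaced by '_' (assumption: explicit
 * labels are copied unchanged). Output is truncated to fit outsz. */
void entry_label(const struct entry *e, char *out, size_t outsz)
{
    const char *src = e->label ? e->label : e->word;
    size_t i;

    assert(outsz > 0);
    for (i = 0; src[i] != '\0' && i + 1 < outsz; i++)
        out[i] = (!e->label && src[i] == '-') ? '_' : src[i];
    out[i] = '\0';
}

/* Assign values in file order: an explicit value is kept, a missing one
 * continues counting from the previous entry. 'first' is the value given
 * to a leading entry without "= value" (chosen by the caller here). */
void assign_values(struct entry *entries, size_t n, long first)
{
    long next = first;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!entries[i].has_value)
            entries[i].value = next;
        next = entries[i].value + 1;
    }
}
```

With the entries Very-Long-Word, Label ~ Word = 5, and Other, and a starting value of 1, this sketch yields Very_Long_Word = 1, Label = 5, and Other = 6.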

There may also be one line of the form:

    [label ~ ] = value

This line defines the value to be used for non-existing keys. Note that, as with normal entries, it also changes the default value for other keys. So if you place

    = 0

at the beginning of the file, unknown strings map to 0, and the other strings map to values starting with 1. If label is not specified, the default is Unknown.
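To make the mapping concrete, the following hand-written C sketch shows an enumeration and a trie-style lookup for an input file containing the lines "= 0", "hello", and "help". It is not actual triehash output: the names DemoKey and demo_lookup are hypothetical, and the generated code may be structured quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Enumeration following the rules above for the input
 *     = 0
 *     hello
 *     help
 * The type name DemoKey is hypothetical. */
enum DemoKey {
    Unknown = 0,  /* from "= 0", using the default label */
    hello = 1,    /* counted upwards from the previous value */
    help = 2
};

/* Hand-written walk over the trie  h-e-l-(l-o | p).
 * Strings that leave the trie, or end at a non-word node, map to Unknown. */
enum DemoKey demo_lookup(const char *s, size_t len)
{
    assert(s != NULL);
    if (len < 4 || s[0] != 'h' || s[1] != 'e' || s[2] != 'l')
        return Unknown;
    switch (s[3]) {
    case 'l':
        return (len == 5 && s[4] == 'o') ? hello : Unknown;
    case 'p':
        return (len == 4) ? help : Unknown;
    default:
        return Unknown;
    }
}
```

Because the shared prefix "hel" is checked once, both words are distinguished by a single branch on the fourth character.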

Generate code in the given file.
Generate a header in the given file, containing a declaration of the hash function and an enumeration.
The name of the enumeration.
The name of the function.
The prefix to use for labels.
Uppercase label names when normalizing them.
Put the function and enum into a namespace (C++).
Put the function and enum into a class (C++).
Generate an enum class instead of an enum (C++).
Use name for a counter that is set to the latest entry in the enumeration + 1. This can be useful for defining array sizes.
Ignore case for words.
Generate code reading multiple bytes at once. The value is a string of powers of two to enable. The default value is 320, meaning that 8-byte, 4-byte, and single-byte reads are enabled. Specify 0 to disable multi-byte reads completely, or add 2 if you also want to allow 2-byte reads. 2-byte reads are disabled by default because they negatively affect performance on older Intel architectures.

This generates code for both multi-byte and single-byte reads, but enables the multi-byte reads only on GNU C compatible compilers, as the following extensions are used:

We must be able to generate integers that are aligned to a single byte using:

    typedef uint64_t __attribute__((aligned (1))) triehash_uu64;
    
The macros __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ must be defined.

We forcibly disable multi-byte reads on platforms where the macro __ARM_ARCH is defined and __ARM_FEATURE_UNALIGNED is not defined, as there is a measurable overhead from emulating the unaligned reads on ARM.
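The conditions above can be combined into one preprocessor guard. The following C sketch compares eight bytes with a single unaligned load when the guard holds and falls back to memcmp otherwise. It is an illustration under stated assumptions, not triehash's generated code: the names demo_uu64, DEMO_MULTI_BYTE, DEMO_PACK8, and demo_is_triehash are hypothetical, and any byte order other than little endian is treated as big endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Enable the multi-byte path only with the GNU C extensions listed above,
 * and never on ARM without unaligned access support. */
#if defined(__GNUC__) && defined(__BYTE_ORDER__) && \
    defined(__ORDER_LITTLE_ENDIAN__) && \
    (!defined(__ARM_ARCH) || defined(__ARM_FEATURE_UNALIGNED))
#define DEMO_MULTI_BYTE 1
#else
#define DEMO_MULTI_BYTE 0
#endif

#if DEMO_MULTI_BYTE
/* 64-bit integer with 1-byte alignment, readable from any address. */
typedef uint64_t __attribute__((aligned (1))) demo_uu64;

/* Pack eight characters so the constant matches their layout in memory.
 * Assumption: anything that is not little endian is big endian. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define DEMO_PACK8(a, b, c, d, e, f, g, h) \
    ((uint64_t)(a)       | (uint64_t)(b) << 8  | (uint64_t)(c) << 16 | \
     (uint64_t)(d) << 24 | (uint64_t)(e) << 32 | (uint64_t)(f) << 40 | \
     (uint64_t)(g) << 48 | (uint64_t)(h) << 56)
#else
#define DEMO_PACK8(a, b, c, d, e, f, g, h) \
    ((uint64_t)(h)       | (uint64_t)(g) << 8  | (uint64_t)(f) << 16 | \
     (uint64_t)(e) << 24 | (uint64_t)(d) << 32 | (uint64_t)(c) << 40 | \
     (uint64_t)(b) << 48 | (uint64_t)(a) << 56)
#endif
#endif

/* Return 1 if the len bytes at s are exactly "triehash", else 0. */
int demo_is_triehash(const char *s, size_t len)
{
    assert(s != NULL);
    if (len != 8)
        return 0;
#if DEMO_MULTI_BYTE
    /* One 8-byte load and compare instead of eight single-byte compares. */
    return *(const demo_uu64 *)s ==
           DEMO_PACK8('t', 'r', 'i', 'e', 'h', 'a', 's', 'h');
#else
    return memcmp(s, "triehash", 8) == 0;
#endif
}
```

The byte order macros are needed because the packed constant has to match the order in which the bytes of the string appear in memory.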

Generate a file in the specified language. Currently known are 'C' and 'tree', the latter generating a tree.
Add the header to the include statements of the header file. The value must be surrounded by quotes or angle brackets for C code. May be specified multiple times.
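The counter option described in the list above (the name set to the latest entry in the enumeration + 1) is useful for sizing arrays. The following C sketch shows the idea with a hand-written enumeration. All names are hypothetical, and placing the counter inside the enumeration is an assumption of this sketch rather than a statement about the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written enumeration; DemoColorCount plays the role of the counter
 * and equals the latest entry + 1. */
enum DemoColor {
    Red = 0,
    Green = 1,
    Blue = 2,
    DemoColorCount = 3
};

/* The counter sizes the table, so it has exactly one slot per entry. */
static const char *const demo_names[DemoColorCount] = {
    "red", "green", "blue"
};

/* Look up the display name of a color; c must be a real entry. */
const char *demo_color_name(enum DemoColor c)
{
    assert((unsigned)c < (unsigned)DemoColorCount);
    return demo_names[c];
}
```

If an entry is later added to the word list, the counter grows with it and the table size follows without further changes.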

triehash is available under the MIT/Expat license; see the source code for more information.

Julian Andres Klode <jak@jak-linux.org>

2020-03-08 triehash v0.3