MTBL_SORTER(3)

mtbl_sorter - sort a sequence of unordered key-value pairs

#include <mtbl.h>

Sorter objects:

struct mtbl_sorter *
mtbl_sorter_init(const struct mtbl_sorter_options *sopt);

void
mtbl_sorter_destroy(struct mtbl_sorter **s);

mtbl_res
mtbl_sorter_add(struct mtbl_sorter *s,
    const uint8_t *key, size_t len_key,
    const uint8_t *val, size_t len_val);

mtbl_res
mtbl_sorter_write(struct mtbl_sorter *s, struct mtbl_writer *w);

struct mtbl_iter *
mtbl_sorter_iter(struct mtbl_sorter *s);

Sorter options:

struct mtbl_sorter_options *
mtbl_sorter_options_init(void);

void
mtbl_sorter_options_destroy(struct mtbl_sorter_options **sopt);

void
mtbl_sorter_options_set_merge_func(
    struct mtbl_sorter_options *sopt,
    mtbl_merge_func fp,
    void *clos);

void
mtbl_sorter_options_set_temp_dir(
    struct mtbl_sorter_options *sopt,
    const char *temp_dir);

void
mtbl_sorter_options_set_max_memory(
    struct mtbl_sorter_options *sopt,
    size_t max_memory);

The mtbl_sorter interface accepts a sequence of key-value pairs with keys in arbitrary order and provides these entries in sorted order. The sorted entries may be consumed via the mtbl_iter interface using the mtbl_sorter_iter() function, or they may be written to an mtbl_writer object using the mtbl_sorter_write() function. The mtbl_sorter implementation buffers entries in memory up to a configurable limit, then sorts them and writes them to disk in chunks. When the caller has finished adding entries and requests the sorted output, entries from these sorted chunks are read back and merged. (Thus, mtbl_sorter(3) is an "external sorting" implementation.)
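The buffer, sort, and merge strategy described above can be illustrated with a small self-contained sketch. It is not the mtbl_sorter implementation: the toy_* names are hypothetical, the fixed CHUNK_CAP entry count stands in for the configurable memory limit, and in-memory arrays stand in for the sorted chunks that the real sorter writes to disk.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK_CAP  3   /* hypothetical stand-in for the memory limit */
#define MAX_CHUNKS 8   /* hypothetical cap, keeps the sketch allocation-free */

struct chunk {
    const char *keys[CHUNK_CAP];
    size_t n;    /* number of keys stored in this chunk */
    size_t pos;  /* read cursor used while merging */
};

struct toy_sorter {
    struct chunk chunks[MAX_CHUNKS];
    size_t n_chunks;
};

static int cmp_keys(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Buffer one key; a chunk that fills up is sorted and closed. */
static int toy_add(struct toy_sorter *s, const char *key)
{
    if (s->n_chunks == 0 || s->chunks[s->n_chunks - 1].n == CHUNK_CAP) {
        if (s->n_chunks == MAX_CHUNKS)
            return -1;
        memset(&s->chunks[s->n_chunks], 0, sizeof(struct chunk));
        s->n_chunks++;
    }
    struct chunk *c = &s->chunks[s->n_chunks - 1];
    c->keys[c->n++] = key;
    if (c->n == CHUNK_CAP)
        qsort(c->keys, c->n, sizeof(c->keys[0]), cmp_keys);
    return 0;
}

/* Sort the final, possibly partial, chunk before output is requested. */
static void toy_finish(struct toy_sorter *s)
{
    if (s->n_chunks > 0) {
        struct chunk *c = &s->chunks[s->n_chunks - 1];
        qsort(c->keys, c->n, sizeof(c->keys[0]), cmp_keys);
    }
}

/* Merge step: return the smallest unread key across all chunks, or NULL. */
static const char *toy_next(struct toy_sorter *s)
{
    struct chunk *best = NULL;
    for (size_t i = 0; i < s->n_chunks; i++) {
        struct chunk *c = &s->chunks[i];
        if (c->pos == c->n)
            continue;
        if (best == NULL || strcmp(c->keys[c->pos], best->keys[best->pos]) < 0)
            best = c;
    }
    return best != NULL ? best->keys[best->pos++] : NULL;
}
```

Calling toy_add() for each key, then toy_finish(), then toy_next() until it returns NULL yields the keys in sorted order. Duplicate keys are not handled here; in mtbl_sorter they are resolved by the merge function.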

Because the MTBL format does not allow duplicate keys, the caller must provide a function which will accept a key and two conflicting values for that key and return a replacement value. This function may be called multiple times for the same key if the same key is inserted more than twice. See mtbl_merger(3) for further details about the merge function.
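As an illustration, a merge function that resolves a conflict by concatenating the two values might look like the following sketch. The name merge_concat is hypothetical. The parameter list is assumed to match the mtbl_merge_func typedef, and the merged buffer is assumed to be heap-allocated by the callback and released by the library; mtbl_merger(3) is the authoritative reference for both points.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical merge callback: the replacement value is val0 followed
 * by val1. The signature is assumed to match mtbl_merge_func.
 */
static void
merge_concat(void *clos,
    const uint8_t *key, size_t len_key,
    const uint8_t *val0, size_t len_val0,
    const uint8_t *val1, size_t len_val1,
    uint8_t **merged_val, size_t *len_merged_val)
{
    (void)clos;
    (void)key;
    (void)len_key;

    size_t len = len_val0 + len_val1;
    /* Allocate at least one byte so an empty result is still non-NULL. */
    uint8_t *buf = malloc(len > 0 ? len : 1);
    if (buf == NULL) {
        /* A NULL merged value signals that no merge was possible. */
        *merged_val = NULL;
        *len_merged_val = 0;
        return;
    }
    if (len_val0 > 0)
        memcpy(buf, val0, len_val0);
    if (len_val1 > 0)
        memcpy(buf + len_val0, val1, len_val1);

    *merged_val = buf;
    *len_merged_val = len;
}
```

Such a function would be registered with mtbl_sorter_options_set_merge_func(sopt, merge_concat, NULL). Because the function may be invoked repeatedly for one key, a key inserted three times could end up with all three values concatenated.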

mtbl_sorter objects are created with the mtbl_sorter_init() function, which requires a non-NULL sopt argument that has already been configured with a merge function fp.

mtbl_sorter_add() copies key-value pairs from the caller into the mtbl_sorter object. Keys are specified as a pointer to a buffer, key, and the length of that buffer, len_key. Values are specified as a pointer to a buffer, val, and the length of that buffer, len_val.

Once the caller has finished adding entries to the mtbl_sorter object, either mtbl_sorter_write() or mtbl_sorter_iter() should be called in order to consume the sorted output. It is a runtime error to call mtbl_sorter_add() on an mtbl_sorter object after iteration has begun. Once the sorted output has been consumed, it is also a runtime error to call any function other than mtbl_sorter_destroy() on the depleted mtbl_sorter object.

temp_dir

Specifies the temporary directory to use. Defaults to /var/tmp.

max_memory

Specifies the maximum amount of memory to use for in-memory sorting, in bytes. Defaults to 1 Gigabyte. This specifies a limit on the total number of bytes allocated for key-value entries and does not include any allocation overhead.

merge_func

See mtbl_merger(3). An mtbl_merger object is used internally for the external sort.

If the merge function callback is unable to provide a merged value (that is, it fails to return a non-NULL value in its merged_val argument), the sort process will be aborted, and mtbl_sorter_write() or mtbl_iter_next() will return mtbl_res_failure.
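The failure path can be sketched with a strict merge function that accepts duplicate keys only when their values are identical and otherwise leaves merged_val set to NULL. The name merge_require_equal is hypothetical, and the signature and buffer ownership are assumed to be as described in mtbl_merger(3).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical strict merge callback: identical values merge to a copy
 * of themselves; differing values are treated as an unresolvable conflict.
 */
static void
merge_require_equal(void *clos,
    const uint8_t *key, size_t len_key,
    const uint8_t *val0, size_t len_val0,
    const uint8_t *val1, size_t len_val1,
    uint8_t **merged_val, size_t *len_merged_val)
{
    (void)clos;
    (void)key;
    (void)len_key;

    /* Start from the failure signal: no merged value. */
    *merged_val = NULL;
    *len_merged_val = 0;

    if (len_val0 != len_val1)
        return;
    if (len_val0 > 0 && memcmp(val0, val1, len_val0) != 0)
        return;

    /* Values match: hand back a heap-allocated copy. */
    uint8_t *buf = malloc(len_val0 > 0 ? len_val0 : 1);
    if (buf == NULL)
        return;
    if (len_val0 > 0)
        memcpy(buf, val0, len_val0);

    *merged_val = buf;
    *len_merged_val = len_val0;
}
```

With a callback like this, inserting the same key with two different values would cause the sort to abort as described above, so callers should check the result of mtbl_sorter_write() or mtbl_iter_next().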

mtbl_sorter_write() returns mtbl_res_success if the sorted output was successfully written, and mtbl_res_failure otherwise.

01/31/2014