cdb(3) | Library Functions Manual | cdb(3)
cdb - Constant DataBase library
#include <cdb.h>
cc ... -lcdb
cdb is a library for creating and accessing Constant DataBase files. A cdb file stores (key,value) pairs and is used to quickly find a value for a given key. Cdb files are create-once files: once created, a file cannot be updated, only recreated from scratch, which is why the database is called constant. The cdb file format is optimized for quick access and is described in the cdb(5) manpage. This manual page corresponds to version 0.78 of the tinycdb package.
The library defines two non-interlaced interfaces: one for querying existing cdb file data (read-only mode) and one for creating such a file (almost write-only). Strictly speaking, each mode also allows a very limited set of the opposite operations (e.g. in query mode, it is possible to update a key's value).
All routines in this library are thread-safe, as no global data is used except the errno variable for error indication.
cdb datafiles may be moved between systems safely, since the format does not depend on architecture.
There are two query modes available. The first uses a structure that represents a cdb database, much like the FILE structure in the stdio library, and the second works with a plain filedescriptor. The first mode is more sophisticated and flexible, and usually somewhat faster. It uses mmap(2) internally. It may also look more "natural" or object-oriented than the second one.
The following routine works with either mode:
unsigned cdb_unpack(buf)
const unsigned char buf[4];
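The sketch below illustrates the kind of conversion cdb_unpack() performs, assuming the cdb on-disk convention of 32-bit little-endian numbers. It is an illustrative re-implementation, not the library source, and the name sketch_unpack is hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for cdb_unpack(): decode a 32-bit little-endian
 * number from 4 bytes. Shifting byte by byte keeps the result the same
 * on big-endian and little-endian hosts. */
uint32_t sketch_unpack(const unsigned char buf[4])
{
  return (uint32_t)buf[0]
       | ((uint32_t)buf[1] << 8)
       | ((uint32_t)buf[2] << 16)
       | ((uint32_t)buf[3] << 24);
}
```

Because the bytes are combined arithmetically rather than read through a cast, the same file decodes identically on any architecture.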
All query operations in the first mode deal with a common data structure, struct cdb, associated with an open file descriptor. This structure is opaque to the application.
The following routines exist for accessing a cdb database:
int cdb_init(cdbp, fd)
struct cdb *cdbp;
int fd;
void cdb_free(cdbp)
struct cdb *cdbp;
int cdb_fileno(cdbp)
const struct cdb *cdbp;
int cdb_read(cdbp, buf, len, pos)
int cdb_readdata(cdbp, buf, len, pos)
int cdb_readkey(cdbp, buf, len, pos)
const struct cdb *cdbp;
void *buf;
unsigned len;
unsigned pos;
const void *cdb_get(cdbp, len, pos)
const void *cdb_getdata(cdbp)
const void *cdb_getkey(cdbp)
const struct cdb *cdbp;
unsigned len;
unsigned pos;
int cdb_find(cdbp, key, klen)
unsigned cdb_datapos(cdbp)
unsigned cdb_datalen(cdbp)
unsigned cdb_keypos(cdbp)
unsigned cdb_keylen(cdbp)
struct cdb *cdbp;
const void *key;
unsigned klen;
int cdb_findinit(cdbfp, cdbp, key, klen)
int cdb_findnext(cdbfp)
struct cdb_find *cdbfp;
const struct cdb *cdbp;
const void *key;
unsigned klen;
void cdb_seqinit(cptr, cdbp)
int cdb_seqnext(cptr, cdbp)
unsigned *cptr;
struct cdb *cdbp;
In the second query mode, one needs to open a cdb file using one of the standard system calls (such as open(2)) to obtain a filedescriptor, and then pass that filedescriptor to the cdb routines. The following routines query a cdb database using only a filedescriptor:
int cdb_seek(fd, key, klen, dlenp)
int fd;
const void *key;
unsigned klen;
unsigned *dlenp;
int cdb_bread(fd, buf, len)
int fd;
void *buf;
int len;
Note that the value of any given key may be updated in place with another value of the same size, by writing to the file at the position found by cdb_find() or cdb_seek(). However, one should be very careful when doing so, since the write operation may not succeed (e.g. in case of power failure), leaving corrupted data. When a database is (re)created, one can guarantee that no incorrect data will be written to it, but not with an in-place update. Note also that it is not possible to update any key or to change the length of a value.
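A minimal sketch of such an in-place update follows. The helper overwrite_value() is hypothetical and not part of the library; it assumes pos and oldlen were obtained beforehand, for instance from cdb_datapos() and cdb_datalen() after a successful cdb_find(), and it uses only POSIX calls. It refuses any change of length, but it cannot protect against a crash in the middle of the write.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the oldlen-byte value stored at byte
 * offset pos in the file at path with newval. Returns 0 on success,
 * -1 with errno set on failure. */
int overwrite_value(const char *path, unsigned pos, unsigned oldlen,
                    const void *newval, unsigned newlen)
{
  const char *p = newval;
  unsigned done = 0;
  int fd, saved;

  if (newlen != oldlen) {          /* the value length can never change */
    errno = EINVAL;
    return -1;
  }
  fd = open(path, O_WRONLY);
  if (fd < 0)
    return -1;
  while (done < newlen) {          /* pwrite() may write fewer bytes */
    ssize_t n = pwrite(fd, p + done, newlen - done,
                       (off_t)pos + (off_t)done);
    if (n <= 0) {
      saved = (n == 0) ? EIO : errno;
      close(fd);
      errno = saved;
      return -1;
    }
    done += (unsigned)n;
  }
  if (fsync(fd) < 0) {             /* push the new bytes to disk */
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
  }
  return close(fd);
}
```

Recreating the database remains the safer choice whenever the update does not have to happen in place.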
A cdb database file should usually be created in two steps: first, a temporary file is created and written to disk; second, that temporary file is renamed to its permanent place. The Unix rename(2) call is an atomic operation: it removes the destination file, if any, and renames the other file in one step. This guarantees that readers will not see an incomplete database. To prevent multiple simultaneous updates, locking may also be used.
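The two-step procedure can be sketched with plain POSIX calls. The helper publish_file() and its parameters are hypothetical; in a real program the write loop would be replaced by the cdb_make_start(), cdb_make_add() and cdb_make_finish() sequence operating on the temporary file's descriptor. Both paths are assumed to be on the same filesystem, since rename(2) is atomic only in that case.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>      /* rename() */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to tmppath, flush them to disk,
 * then atomically move the file to finalpath.
 * Returns 0 on success, -1 on failure (the temporary file is removed). */
int publish_file(const char *tmppath, const char *finalpath,
                 const void *data, size_t len)
{
  const char *p = data;
  int fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0644);

  if (fd < 0)
    return -1;
  while (len > 0) {                /* step 1: fill the temporary file */
    ssize_t n = write(fd, p, len);
    if (n <= 0)
      goto fail;
    p += n;
    len -= (size_t)n;
  }
  if (fsync(fd) < 0)               /* data must be on disk before rename */
    goto fail;
  if (close(fd) < 0) {
    unlink(tmppath);
    return -1;
  }
  if (rename(tmppath, finalpath) < 0) {   /* step 2: atomic replacement */
    unlink(tmppath);
    return -1;
  }
  return 0;

fail:
  close(fd);
  unlink(tmppath);
  return -1;
}
```

A reader that opens finalpath at any moment sees either the old complete file or the new complete file, never a partially written one.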
All routines used to create a cdb database work with a struct cdb_make object that is opaque to the application. The application may assume that struct cdb_make has at least the same member(s) as published in struct cdb above.
int cdb_make_start(cdbmp, fd)
struct cdb_make *cdbmp;
int fd;
int cdb_make_add(cdbmp, key, klen, val, vlen)
struct cdb_make *cdbmp;
const void *key, *val;
unsigned klen, vlen;
int cdb_make_finish(cdbmp)
struct cdb_make *cdbmp;
int cdb_make_exists(cdbmp, key, klen)
struct cdb_make *cdbmp;
const void *key;
unsigned klen;
int cdb_make_find(cdbmp, key, klen, mode)
struct cdb_make *cdbmp;
const void *key;
unsigned klen;
int mode;
If no matching keys were found, cdb_make_find() returns 0. If at least one record was found or removed, it returns a positive value. On error, it returns a negative value and sets errno appropriately. After cdb_make_find() has returned a negative value, it is not possible to continue constructing the database.
cdb_make_exists() is the same as calling cdb_make_find() with mode set to CDB_FIND.
int cdb_make_put(cdbmp, key, klen, val, vlen, mode)
struct cdb_make *cdbmp;
const void *key, *val;
unsigned klen, vlen;
int mode;
If any error occurs during the operation, cdb_make_put() returns a negative integer and sets the global variable errno to indicate the reason for the failure. On success with no duplicates found, it returns 0. If any duplicates were found or removed (which, in CDB_PUT_INSERT mode, indicates that the new record was not added), it returns a positive value. After cdb_make_put() has returned a negative value, it is not possible to continue the database construction process.
As with cdb_make_exists() and cdb_make_find(), using this routine with any mode other than CDB_PUT_ADD can significantly slow down the database creation process, especially when mode is CDB_PUT_REPLACE0.
void cdb_pack(num, buf)
unsigned num;
unsigned char buf[4];
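The following sketch shows the kind of conversion cdb_pack() performs, assuming the cdb on-disk convention of 32-bit little-endian numbers. It is an illustrative re-implementation, not the library source, and the name sketch_pack is hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for cdb_pack(): store num into buf as a 32-bit
 * little-endian number, least significant byte first, independent of
 * the host byte order. */
void sketch_pack(uint32_t num, unsigned char buf[4])
{
  buf[0] = (unsigned char)(num & 0xff);
  buf[1] = (unsigned char)((num >> 8) & 0xff);
  buf[2] = (unsigned char)((num >> 16) & 0xff);
  buf[3] = (unsigned char)((num >> 24) & 0xff);
}
```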
unsigned cdb_hash(buf, len)
const void *buf;
unsigned len;
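For illustration, the hash function of the cdb file format can be sketched as below, assuming cdb_hash() computes the same value as the hash used by the original cdb format (start at 5381, then for each byte multiply by 33 and XOR the byte in). The name sketch_hash is hypothetical and the code is not taken from the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cdb_hash(): 32-bit hash of len bytes at buf.
 * (h << 5) + h is h * 33; arithmetic wraps modulo 2^32. */
uint32_t sketch_hash(const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint32_t h = 5381;

  while (len--)
    h = ((h << 5) + h) ^ *p++;
  return h;
}
```

An empty key hashes to the initial value 5381, which is a quick sanity check for any reimplementation.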
The cdb library may set errno to the following values on error:
Note: in all examples below, error checking is not shown for brevity.
int fd;
struct cdb cdb;
char *key, *data;
unsigned keylen, datalen;
/* opening the database */
fd = open(filename, O_RDONLY);
cdb_init(&cdb, fd);
/* initialize key and keylen here */
/* single-record search. */
if (cdb_find(&cdb, key, keylen) > 0) {
  datalen = cdb_datalen(&cdb);
  data = malloc(datalen + 1);
  cdb_read(&cdb, data, datalen, cdb_datapos(&cdb));
  data[datalen] = '\0';
  printf("key=%s data=%s\n", key, data);
  free(data);
}
else
  printf("key=%s not found\n", key);
/* multiple record search */
struct cdb_find cdbf;
int n;
cdb_findinit(&cdbf, &cdb, key, keylen);
n = 0;
while(cdb_findnext(&cdbf) > 0) {
  datalen = cdb_datalen(&cdb);
  data = malloc(datalen + 1);
  cdb_read(&cdb, data, datalen, cdb_datapos(&cdb));
  data[datalen] = '\0';
  printf("key=%s data=%s\n", key, data);
  free(data);
  ++n;
}
printf("key=%s %d records found\n", key, n);
/* sequential database access */
unsigned pos;
/* n is reused from the multiple record search above */
cdb_seqinit(&pos, &cdb);
n = 0;
while(cdb_seqnext(&pos, &cdb) > 0) {
  keylen = cdb_keylen(&cdb);
  key = malloc(keylen + 1);
  cdb_read(&cdb, key, keylen, cdb_keypos(&cdb));
  key[keylen] = '\0';
  datalen = cdb_datalen(&cdb);
  data = malloc(datalen + 1);
  cdb_read(&cdb, data, datalen, cdb_datapos(&cdb));
  data[datalen] = '\0';
  ++n;
  printf("record %d: key=%s data=%s\n", n, key, data);
  free(data); free(key);
}
printf("total records found: %d\n", n);
/* close the database */
cdb_free(&cdb);
close(fd);
/* simplistic query mode */
fd = open(filename, O_RDONLY);
if (cdb_seek(fd, key, keylen, &datalen) > 0) {
  data = malloc(datalen + 1);
  cdb_bread(fd, data, datalen);
  data[datalen] = '\0';
  printf("key=%s data=%s\n", key, data);
  free(data);
}
else
  printf("key=%s not found\n", key);
close(fd);
int fd;
struct cdb_make cdbm;
char *key, *data;
unsigned keylen, datalen;
/* initialize the database */
fd = open(filename, O_RDWR|O_CREAT|O_TRUNC, 0644);
cdb_make_start(&cdbm, fd);
while(have_more_data()) {
  /* initialize key and data */
  if (cdb_make_exists(&cdbm, key, keylen) == 0)
    cdb_make_add(&cdbm, key, keylen, data, datalen);
  /* or use cdb_make_put() with appropriate flags */
}
/* finalize and close the database */
cdb_make_finish(&cdbm);
close(fd);
The tinycdb package was written by Michael Tokarev <mjt@corpit.ru>. It is based on ideas from, and shares its file format with, the original cdb library by Dan Bernstein.
Public domain.
Jun 2006