dyndb - a
database library for multiple readers and one writer
A dyndb is a simple one-writer/multiple-reader disk-based
data structure. This library offers a low-level interface
to it. Multiple processes may use the same database,
provided that they coordinate access with some kind of
locking.
See Limitations for the restrictions that apply.
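The library leaves the choice of locking to the application. As a minimal sketch, assuming POSIX advisory locks via fcntl(2) are acceptable on your system, the single writer could take an exclusive lock and each reader a shared lock on the whole database file. The names sketch_lock and sketch_unlock are hypothetical and not part of dyndb:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not part of dyndb: block until the whole file
 * behind fd is locked. Pass F_RDLCK for a reader (shared lock) or
 * F_WRLCK for the single writer (exclusive lock). */
int sketch_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                      /* 0 means "up to end of file" */
    return fcntl(fd, F_SETLKW, &fl);   /* -1 on error */
}

/* Hypothetical helper: release a lock taken by sketch_lock(). */
int sketch_unlock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl);
}
```

These locks are advisory: they only help if every process that touches the database uses them.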
Overview
dyndb_create_name or
dyndb_create_fd create a
database.
dyndb_init initializes the
management information on an existing database.
dyndb_enter,
dyndb_enterstart,
dyndb_enteraddkey,
dyndb_enteradddata and
dyndb_enterfinish enter new
information.
dyndb_lookup, or
dyndb_lookupstart together with
dyndb_lookupnext, search
for information by key.
dyndb_walkstart and
dyndb_walknext walk
through all records of the database.
dyndb_fwalk provides an
alternative interface.
dyndb_delete deletes the record
found by the latest call to dyndb_lookupnext.
Data Types
The dyndb data structure
This structure
describes the internal state of the library. Please treat it
as an opaque data type. Exception: you may read, but not
modify, the fd member.
uint32
is a 32bit unsigned
integer. On many systems you can use
#include <inttypes.h>
#define uint32 uint32_t
to get a definition. Do not use unsigned int
or unsigned long, unless you do not need portability
of your source.
int32
is a 32bit signed
integer. On many systems you can use
#include <inttypes.h>
#define int32 int32_t
to get a definition. Do not use int or
long, unless you do not need portability of your
source.
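A replacement for the two type definitions might look like the following sketch. It assumes a C99 system that provides uint32_t and int32_t; the two typedefs with the sketch_ prefix are hypothetical compile-time guards and not part of dyndb:

```c
#include <inttypes.h>
#include <limits.h>

#define uint32 uint32_t
#define int32  int32_t

/* Compile-time guards: the array size becomes -1, and compilation
 * fails, if either type is not exactly 32 bits wide. */
typedef char sketch_uint32_is_32_bits[sizeof(uint32) * CHAR_BIT == 32 ? 1 : -1];
typedef char sketch_int32_is_32_bits[sizeof(int32) * CHAR_BIT == 32 ? 1 : -1];
```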
Integration
To use the library
in your program you have to include dyndb.h in
your C sources (C++ is not supported; it may or may not
work), and link a few functions to your binaries. The source
file names of the library functions match dyndb_*.c
(a shell wildcard, not a regular expression).
dyndb.h
includes uint32.h and int32.h, which are
expected to define uint32 and int32. The two files included
in the distribution, which you might want to replace if they
don't fit into your framework, work this way:
They include typesize.h, which in turn includes
typesize2.h, a file created at compile time, if
`HAVE_CONFIG_H' is not defined. Otherwise, if
`HAVE_CONFIG_H' is defined, you have to either define a
few constants or include config.h before you include
dyndb.h.
Back when I used GNU autoconf, I used the
following magic:
AC_CHECK_SIZEOF(unsigned short,2)
AC_CHECK_SIZEOF(short,2)
AC_CHECK_SIZEOF(int,4)
AC_CHECK_SIZEOF(unsigned int,4)
AC_CHECK_SIZEOF(long,4)
AC_CHECK_SIZEOF(unsigned long,4)
AC_CHECK_SIZEOF(long long,0)
AC_CHECK_SIZEOF(unsigned long long,0)
If `HAVE_CONFIG_H' is not defined then
typesize2.h has to define the following C macros,
possibly to other values:
#define SIZEOF_SHORT 2
#define SIZEOF_UNSIGNED_SHORT 2
#define SIZEOF_INT 4
#define SIZEOF_UNSIGNED_INT 4
#define SIZEOF_LONG 4
#define SIZEOF_UNSIGNED_LONG 4
#define SIZEOF_LONG_LONG 8
#define SIZEOF_UNSIGNED_LONG_LONG 8
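How such macros can be turned into a 32 bit type is sketched below. This is an assumption about the general technique, not a copy of the distributed uint32.h. The macro values are those of a typical LP64 system, and the name sketch_uint32 is hypothetical:

```c
/* Assumed values for a typical LP64 system. Normally typesize2.h or
 * config.h provides these. */
#define SIZEOF_UNSIGNED_SHORT 2
#define SIZEOF_UNSIGNED_INT 4
#define SIZEOF_UNSIGNED_LONG 8

/* Pick the first unsigned type that is 4 bytes wide. */
#if SIZEOF_UNSIGNED_INT == 4
typedef unsigned int sketch_uint32;
#elif SIZEOF_UNSIGNED_LONG == 4
typedef unsigned long sketch_uint32;
#elif SIZEOF_UNSIGNED_SHORT == 4
typedef unsigned short sketch_uint32;
#else
#error "no 32 bit unsigned type found"
#endif

/* Guard against macros that do not match the compiler: the array
 * size becomes -1 if the chosen type is not 4 bytes wide. */
typedef char sketch_uint32_size_check[sizeof(sketch_uint32) == 4 ? 1 : -1];
```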
Functions
dyndb_create_fd
- int
dyndb_create_fd(struct dyndb *, int fd);
This function creates a database on file descriptor
fd. fd should be a file of zero bytes
length.
This function returns -1 in case of error or fd
otherwise.
dyndb_create_name
- int
dyndb_create_name(struct dyndb *, const char *fname,
int mode);
dyndb_create_name creates a dyndb database at
the place fname. Any existing file fname will
be overwritten. mode describes the access rights of
the file.
This function returns -1 in case of error or fd
otherwise.
dyndb_datalen
- uint
dyndb_datalen(struct dyndb *);
This macro returns the length of the data found by the
last call to
dyndb_lookupnext or
dyndb_walknext. Its results
are undefined if no data was found.
dyndb_datapos
- uint32 dyndb_datapos(struct dyndb *);
This macro returns the position of the data found by
the last call to
dyndb_lookupnext or
dyndb_walknext. Its results
are undefined if no data was found.
dyndb_delete
- int dyndb_delete(struct dyndb *dy);
dyndb_delete deletes the record found by the
latest call to
dyndb_lookupnext. It returns
-1 in case of error and 0 if successful.
You must only call dyndb_delete if the call to
dyndb_lookupnext was successful.
dyndb_enter
- int dyndb_enter(struct dyndb *dy,
- const char *key, uint32 keylen,
- const char *data, uint32 datalen);
This function enters a record into the database
dy. It is implemented by calls to
dyndb_enterstart,
dyndb_enteraddkey,
dyndb_enteradddata and
dyndb_enterfinish. You
have to ensure that no other process enters data into
dy during the call to dyndb_enter.
dyndb_enterstart
- int
dyndb_enterstart(struct dyndb *dy)
dyndb_enterstart prepares dy for a new
record. It returns -1 in case of error and 0 otherwise.
You have to ensure that no other process enters data into
dy between the calling of dyndb_enterstart and
the end of the dyndb_enterfinish call.
dyndb_enteraddkey
- int
dyndb_enteraddkey(struct dyndb *dy,
- const char *key,uint32 keylen);
dyndb_enteraddkey adds key, which may be only
part of the full key (you may split a key across multiple
calls to dyndb_enteraddkey), to dy. It
returns -1 in case of error and 0 otherwise. You must
call dyndb_enteraddkey only after a successful call
to dyndb_enterstart. You must follow
dyndb_enteraddkey with either one or more calls to
dyndb_enteradddata or a call to
dyndb_enterfinish.
dyndb_enteradddata
- int
dyndb_enteradddata(struct dyndb *dy,
- const char *data,uint32 datalen);
dyndb_enteradddata adds data of
datalen bytes to the record being added into
dy. It returns -1 in case of error and 0 otherwise.
You must call dyndb_enteradddata only after the
key has been entered. You may call dyndb_enteradddata
multiple times. You must call dyndb_enterfinish to
actually add the record to the database index.
dyndb_enterfinish
- int
dyndb_enterfinish(struct dyndb *dy);
dyndb_enterfinish adds the record created by
calls to dyndb_enterstart, dyndb_enteraddkey
and dyndb_enteradddata to the index structures of
dy. It returns -1 in case of error and 0 otherwise.
dyndb_fromdisk
- uint32 dyndb_fromdisk(unsigned char
buf[4]);
dyndb_fromdisk converts 4 bytes of the little
endian character array buf into an unsigned 32 bit
integer.
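The conversion can be sketched as follows. This illustrates little endian decoding in general and is not the library's source; sketch_fromdisk is a hypothetical name and uint32_t stands in for uint32:

```c
#include <stdint.h>

/* Hypothetical stand-in for dyndb_fromdisk: buf[0] holds the least
 * significant byte, buf[3] the most significant one. */
uint32_t sketch_fromdisk(const unsigned char buf[4])
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Because the bytes are combined with shifts, the result does not depend on the byte order of the host.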
dyndb_fwalk
- int
dyndb_fwalk(struct dyndb *dy,
- int (*callback)(uint32 pos, void *key, uint32
keylen,
- void *data, uint32 datalen));
dyndb_fwalk provides one way to read all
elements of a table. It calls the callback function
callback for each element. If callback returns a value
other than 0 then dyndb_fwalk stops the search and returns
that value (you should use -1 to signal an error, since
that is what dyndb_fwalk itself returns in case of an
error). dyndb_fwalk returns 0 if callback has been
called for all elements of the database and -1 in case of
an error.
dyndb_fwalk finds every record in the database at
least once. It finds every record exactly once if no one else
is writing to the database at that moment, and may find
records more than once if a table split happens during the
walk through the table.
An alternative way to read all
elements is the dyndb_walk
interface. The main difference between the two is that
dyndb_walk is slower, has a cleaner API and is
likely to provide more useful results in case another
process writes to the database during the walk, while
dyndb_fwalk is faster.
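A callback for dyndb_fwalk might look like the following sketch. The callback signature carries no user pointer, so the sketch keeps its state in file scope variables. All names with the sketch_ prefix are hypothetical, and the typedef is only a stand-in for the uint32 normally supplied through dyndb.h:

```c
#include <stdint.h>

typedef uint32_t uint32;   /* stand-in; drop this when including dyndb.h */

uint32 sketch_count = 0;   /* records seen so far */
uint32 sketch_limit = 3;   /* stop the walk after this many records */

/* Counts records and asks the walker to stop once the limit is
 * reached by returning a value other than 0. */
int sketch_count_callback(uint32 pos, void *key, uint32 keylen,
                          void *data, uint32 datalen)
{
    (void)pos; (void)key; (void)keylen; (void)data; (void)datalen;
    sketch_count++;
    return sketch_count >= sketch_limit ? 1 : 0;
}
```

It would be passed as the second argument of dyndb_fwalk. Since a record may be reported more than once while a writer is active, the count is only an upper bound in that case.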
dyndb_hash
- int
dyndb_hash(const unsigned char *buf, uint32 len);
Low level function, not to be used outside the
library. Implemented in terms of
dyndb_hashstart and
dyndb_hashadd.
dyndb_hashstart
- int
dyndb_hashstart(void)
Low level function, not to be used outside the
library. Returns the hash seed.
dyndb_hashadd
- int
dyndb_hashadd(uint32 h, const unsigned char
*buf,uint32 len);
Low level function, not to be used outside the
library. Adds len bytes starting at buf to the
hash h and returns the hash value.
dyndb_keylen
- uint dyndb_keylen(struct dyndb *);
This macro returns the length of the key found by the
last call to dyndb_walknext.
Its results are undefined if no record was found.
dyndb_keypos
- uint32 dyndb_keypos(struct dyndb *);
This macro returns the position of the key found by
the last call to
dyndb_lookupnext or
dyndb_walknext. Its results
are undefined if no record was found.
dyndb_init
- int dyndb_init(struct dyndb *db, int fd);
dyndb_init initializes the db structure
for operations on fd. fd has to be a
valid file descriptor, opened for access matching whatever
you want to do with it. dyndb_init returns 0 in
case of success or -1 in case of error.
dyndb_init does not lock db. If this is needed
then the application has to do this before the call to
dyndb_init.
You may call dyndb_init on the
same dy more than once. You have to close fd
yourself.
dyndb_lookupstart
- void
dyndb_lookupstart(struct dyndb *dy);
dyndb_lookupstart prepares dy for a new
search. All information about the search is kept in memory.
dyndb_lookupnext
- int
dyndb_lookupnext(struct dyndb *dy,
- const char *key, uint32 keylen);
dyndb_lookupnext searches, starting from the
last lookup position (as set by
dyndb_lookupstart or
dyndb_lookupnext), for a key of length keylen in the
database dy. It returns 1 if key has been
found, 0 if key has not been found, and -1 in case of
an error. The data is dyndb_datalen(dy) bytes
long. It may be read with
dyndb_read either directly after
the dyndb_lookupnext call or later, after the file
descriptor has been moved to the position returned by
dyndb_datapos(dy) using
the dyndb_seek function.
This
example retrieves all records with a given key from the DB
on the standard input and prints them, each followed by a
newline, to the standard output:
#include <string.h>   /* strlen */
#include <unistd.h>   /* write, _exit */
#include "dyndb.h"
void error(void) { _exit(1); }
int main(int argc, char **argv)
{
    struct dyndb dy;
    if (argc<2) error();
    if (-1==dyndb_init(&dy,0)) error();
    dyndb_lookupstart(&dy);
    while (1) {
        int r;
        int32 l;
        r=dyndb_lookupnext(&dy,argv[1],strlen(argv[1]));
        if (-1==r)
            error();
        if (!r)
            break; /* no further records */
        /* read and print data in chunks of at most sizeof(buf) bytes */
        for (l=dyndb_datalen(&dy); l ; ) {
            char buf[128];
            int32 got;
            got=dyndb_read(dy.fd,buf, l > sizeof(buf) ? sizeof(buf) : l);
            if (-1==got) error();
            write(1,buf,got);
            l-=got;
        }
        write(1,"\n",1);
    }
    return 0;
}
dyndb_lookup
- int dyndb_lookup(struct dyndb *dy,
- const char *key, uint32 keylen);
This is implemented as a call to dyndb_lookupstart and
a call to dyndb_lookupnext.
dyndb_read
- int32
dyndb_read(int fd,char *buf,uint32 len);
This function copies len bytes from file
descriptor fd to buf. It returns -1 in case of
error or len otherwise. errno is set to EIO if the end
of the file is reached before len bytes
have been read.
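The described behaviour corresponds to a "read exactly len bytes" loop around read(2). The following is a sketch of that technique, not the library's source. sketch_read_full is a hypothetical name, and retrying after EINTR is an assumption that the text above does not state:

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for dyndb_read: returns len after exactly len
 * bytes have been read, or -1 on error. A premature end of file is
 * reported as EIO. */
int32_t sketch_read_full(int fd, char *buf, uint32_t len)
{
    uint32_t done = 0;

    while (done < len) {
        ssize_t r = read(fd, buf + done, len - done);
        if (r == -1) {
            if (errno == EINTR)
                continue;       /* assumption: retry when interrupted */
            return -1;
        }
        if (r == 0) {           /* end of file before len bytes arrived */
            errno = EIO;
            return -1;
        }
        done += (uint32_t)r;
    }
    return (int32_t)len;        /* len is assumed to fit into 31 bits */
}
```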
dyndb_seek
- int32
dyndb_seek(int fd,uint32 pos);
dyndb_seek moves the file pointer fd to
position pos. It returns -1 in case of error or pos
otherwise.
dyndb_seekend
- int32
dyndb_seekend(int fd);
dyndb_seekend advances the file pointer fd
to the end of the file, returning -1 in case of error or the
new position.
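Both seek functions can be pictured as thin wrappers around lseek(2). The sketch below is an assumption about their behaviour, not the library's source; the sketch_ names are hypothetical, and the range checks reflect the 31 bit size limit listed under Limitations:

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for dyndb_seek: absolute positioning. */
int32_t sketch_seek(int fd, uint32_t pos)
{
    if (pos > 0x7fffffffu)
        return -1;                          /* beyond the 31 bit limit */
    if (lseek(fd, (off_t)pos, SEEK_SET) == (off_t)-1)
        return -1;
    return (int32_t)pos;
}

/* Hypothetical stand-in for dyndb_seekend: move to the end of the
 * file and report the new position. */
int32_t sketch_seekend(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);

    if (end == (off_t)-1 || end > 0x7fffffff)
        return -1;
    return (int32_t)end;
}
```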
dyndb_todisk
- void
dyndb_todisk(unsigned char *buf, uint32 num);
This function converts the unsigned 32 bit integer
num to the little endian character array buf.
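The conversion can be sketched as follows. This illustrates little endian encoding in general and is not the library's source; sketch_todisk is a hypothetical name and uint32_t stands in for uint32:

```c
#include <stdint.h>

/* Hypothetical stand-in for dyndb_todisk: stores the least significant
 * byte of num first. buf must have room for 4 bytes. */
void sketch_todisk(unsigned char *buf, uint32_t num)
{
    buf[0] = (unsigned char)(num & 0xff);
    buf[1] = (unsigned char)((num >> 8) & 0xff);
    buf[2] = (unsigned char)((num >> 16) & 0xff);
    buf[3] = (unsigned char)((num >> 24) & 0xff);
}
```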
dyndb_walkstart
- int
dyndb_walkstart(struct dyndb *dy);
A call to this function resets the start of dy
so that a later call to
dyndb_walknext will find the
first record in the database. The function returns -1 in
case of error and zero if successful.
dyndb_walknext
- int dyndb_walknext(struct dyndb *dy);
dyndb_walknext provides a way to read all
elements of a table. It returns -1 on error, 0 if no further
elements exist and 1 if an element is found. If an
element is found then key and data may be read, in that
order, from dy->fd.
dyndb_keylen returns the length
of the key, and dyndb_datalen
returns the length of the data. In case you need to rewind
the file descriptor to the old position then
dyndb_keypos returns the
position of the key and
dyndb_datapos returns the
position of the data (key position + key length).
dyndb_walknext finds every record in the database at
least once. It finds every record exactly once if no one else
is writing to the database at that moment, and may find
records more than once if a table split happens during the
walk through the table.
This example dumps all keys to
the standard output:
struct dyndb dy;
if (-1==dyndb_init(&dy,0)) error();
if (-1==dyndb_walkstart(&dy)) error();
while (1) {
    int r;
    uint32 l;
    r=dyndb_walknext(&dy);
    if (-1==r)
        error();
    if (!r)
        break; /* no further records */
    /* read and print the key in chunks of at most sizeof(buf) bytes */
    for (l=dyndb_keylen(&dy); l ; ) {
        char buf[128];
        int32 got;
        got=dyndb_read(dy.fd,buf, l > sizeof(buf) ? sizeof(buf) : l);
        if (-1==got)
            error();
        write(1,buf,got);
        l-=got;
    }
    write(1,"\n",1);
    /* could do the same for the data, using dyndb_datalen() */
}
dyndb_write
- int dyndb_write(int fd, const char *buf,
uint32 len);
This function writes len bytes from buf
to file descriptor fd, restarting if interrupted. It
returns -1 in case of error and len otherwise.
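The restart behaviour can be sketched as a loop around write(2). This is an illustration of the technique, not the library's source. sketch_write_all is a hypothetical name, and continuing after a partial write is an assumption beyond what the text above states:

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for dyndb_write: returns len after all bytes
 * have been written, or -1 on error. */
int sketch_write_all(int fd, const char *buf, uint32_t len)
{
    uint32_t done = 0;

    while (done < len) {
        ssize_t w = write(fd, buf + done, len - done);
        if (w == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: restart */
            return -1;
        }
        done += (uint32_t)w;    /* assumption: continue after a partial write */
    }
    return (int)len;            /* len is assumed to fit into 31 bits */
}
```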
Limitations
- 1. the database size is limited to 31 bits (2
gigabytes).
- 2. key and data length are limited to 31
bits, too.
- 3. hash, key length, data length and pointers
are stored in little endian byte order.
- 4. although
deletions of elements are possible there is no way to
reclaim the disk space of the records: deleted data is
unavailable, not freed.
- 5. dyndb is limited to one
writing process at any time. There may be many readers at
any time.
- 6. although it's fine to share a file
descriptor between threads, each file descriptor must not
be used for more than one operation at any given time. That
means: be careful in multithreaded applications or use
dup(2).
- 7. Be careful when using buffered
I/O libraries like <stdio.h>, or avoid them.
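An application may want to check the 31 bit length limits from items 1 and 2 before it starts entering a record. A minimal sketch, with the hypothetical names SKETCH_MAX_31BIT and sketch_lengths_ok, could look like this. It only checks the two lengths and not the space left in the database:

```c
#include <stdint.h>

#define SKETCH_MAX_31BIT 0x7fffffffu

/* Hypothetical helper, not part of dyndb: returns 1 if both lengths
 * fit into 31 bits and 0 otherwise. */
int sketch_lengths_ok(uint32_t keylen, uint32_t datalen)
{
    return keylen <= SKETCH_MAX_31BIT && datalen <= SKETCH_MAX_31BIT;
}
```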
Author
Uwe Ohse, uwe@ohse.de
See also
dyndb_intro(7),
cdb(3).