Wayne's Little Data Structures and Algorithms Library
Note: this page is also available in Romanian,
Thanks to Maxim Petrenko.
The algorithms in libwayne are by no means original. Many of them
are taken verbatim from textbooks on data structures and algorithms,
and I simply translated them into C. They include efficient and correct
routines for priority queues, event-driven simulations, queues, stacks,
binary trees, sets of integers, graphs (the node-edge kind), some
combinatorics routines, ODE integration routines, a simple statistics
package, and a matrix-vector library. Many of the routines (heap, stack,
queue, bintree) can work with arbitrary objects, not just integers.
Comparisons are done with pointers to comparison functions, similar to
ANSI C's standard qsort. This library is not meant to
be complete; I write the routines as I need them, but only high-quality
code goes into libwayne.
One thing that many people ask me is ``why didn't you use C++?''
Without going into a long tirade, suffice it to say that, although I am
not a C++ expert (actually, the only stuff I haven't learned in intimate
detail is templates), I know
enough C++ to realize that it is not the be-all, end-all of programming
languages. In fact, after several years of C++ being around, it is
already beginning a slow fading into history, with Java being its
successor --- and not a very good one, at that. At the risk of sounding
like the 40-50 year olds out there who still insist that FORTRAN is
a good enough language for everything, I'll be a 30-something who
insists that, until something better comes along, C is still a good
all-purpose language in which to write heavy, data-structure intensive
programs. I believe it was Dennis Ritchie who said something like,
"C is rarely the
best language for a given task, but it's often the second-best,"
the implication being that it's better to learn one language that is
second-best for everything, than to learn a new language for every
programming task. (One could say the same of English.)
I started libwayne when I realized that I was constantly
re-writing little bits of code that did important things that should
be in the C standard, but are not. For example, how many times have
you written code like this:
if((p = malloc(n)) == NULL) /* or some other fatal error condition */
{
fprintf(stderr, "error: %s\n", err_msg);
exit(1);
}
I got sick of it. Furthermore, I often wanted to know more about why
my program failed. So I wrote Fatal. Here's its prototype:
void Fatal(char *fmt, ...); /* generates an assertion failure */
It uses varargs so you can pass it an arbitrary list of output
arguments just like printf,
but it generates an assertion failure so that if you run it under a debugger,
you can look at the program nicely as it dies.
It turned out to be only the first function I wrote for libwayne,
and it was put into a file called "misc.c" which I started including in
most of the code I wrote. Another early member of the library was
Malloc, which calls Fatal if the standard malloc
fails.
Eventually "misc.c" started getting pretty big, with macros for
MIN, MAX, ABS, SQR, etc, so I created misc.h and
compiled misc.c into an object module. That was about 1993.
About that time I started to realize that, at least in C, we need
a way to pass "objects" around in a reasonably transparent way, but
sometimes we want to treat pointers as integers. This makes some
people's teeth sweat (my own included), so I invented the voint
datatype, which is (you guessed it) a union of (void*) and
(int).
Then I started adding more complex algorithms to libwayne,
whenever I needed them. Each and every piece of libwayne
was written because I needed it, but only things that I took careful
time to do well went into libwayne. Any algorithm that
needs to compare objects needs a comparison function like the one
used by the ANSI standard qsort routine.
Some of the sub-libraries in libwayne
Discrete (integer) Algorithms
- Heap: heap implementation of a priority queue. The
array grows as necessary using realloc, so you don't need
to know the maximum size a priori, although you need to give it
an initial guess. The standard priority queue operations are supported:
create, destroy, insert, getnext, size, and each takes time no
longer than O(log n). In addition, I support
(slow) deletion, and once you have insert and getnext,
it's trivial to
write a heap sort. Here's what its header file looks like.
#include "misc.h" /* for voint */
/* this is returned if any errors occur */
#define HEAPERROR ABSTRACT_ERROR
typedef struct _heaptype
{
int HEAPSIZE;
pCmpFcn HeapCmp; /* pointer to comparison function */
voint *heap;
} HEAP;
HEAP *HeapAlloc(int maxNumber, pCmpFcn);
void HeapReset(HEAP*); /* make heap empty */
int HeapSize(HEAP*); /* how many things currently in heap? */
void HeapFree(HEAP*); /* free the entire heap */
voint HeapPeek(HEAP*); /* what's the top element? */
voint HeapNext(HEAP*); /* pop and return top element */
voint HeapInsert(HEAP*, voint);
/*
** delete is slow O(n); if more than one match, only one is deleted, but
** no guarantees on which gets deleted.
*/
voint HeapDelete(HEAP*, voint);
int HeapSort(voint *list, pCmpFcn);
int HeapCheck(HEAP*); /* sanity check, for debugging */
void HeapPrint(HEAP*); /* dump an entire heap. You must define a HeapTypeP
rint function */
void HeapTypePrint();
- Event: Routines for event-driven simulation. Of course,
they use my heap routines. Operations supported:
InitEventList, InsertEvent, NextEvent. Events are function
pointers that get called when the appropriate event is activated.
I found it very useful for prototyping even complicated event-driven
simulations. (Although, for serious research,
I "upgraded" to a more sophisticated event-driven simulation package.)
- Queue and stack.h: circular buffer (first-in-first-out)
and stack (last-in-first-out).
- Set: bit-string implemenation sets of integers. Operations
supported: SetAlloc, SetFree, SetEmpty (delete all members),
SetCopy, SetAdd, SetDelete, SetIn (query membership), SetUnion,
SetIntersection, SetCardinality. All these routines are very
fast, since almost every operation is just a few bit manipualitons.
It's also memory efficient, taking up N/8 bytes of storage, where N
is the maximum integer you want to store in the set.
- Graph: algorithms for manipulating graphs (node and
edge graphs, not mathematical functions). I wrote these
for my Ramsey Graph searching.
Operations supported: GraphAlloc, GraphFree, GraphDeleteAllEdges,
GraphConnect (two nodes), GraphDisconnect (two nodes), GraphAreConnected
(query connection between two nodes), GraphComplement
(the entire graph), GraphUnion (two graphs), GraphInduced (induce a
subgraph on a set of nodes), GraphsIsomorphic (reasonably clever but
not genius algorithm for graph isomorphism), GraphClique, GraphIndepSet.
The last two search for cliques and independent sets on
an arbitrary graph using the brute-force exponential time algorithm.
- Combin: some basic combinatorial algorithms, used by
the clique-searching routines in graph. Operations supported:
generate all permutations or combinations; count them without
generating them; generate Nth permutation or combination quickly.
- sorts: various sorting routines with the exact same interface
is ANSI qsort: QuickSort, Insertion sort, Heapsort, CombSort,
MergeSort, FredSort, PointerSort. The latter is good for sorting large
objects. It creates a list of pointers, sorts the pointers, and then permutes
all the big objects in O(n) time. For 16384 elements each about 1K in size,
it's about 10 times faster than qsort. FredSort is Fredrickson's Sort,
which is an O(n log n) time, near-space-optimal sorting routine, based
on a talk I saw in Toronto in 1998.
- BinTree: algorithms for a binary tree: BinTreeAlloc,
BinTreeInsert, BinTreeLookup, BinTreeLookupKey, BinTreeFree.
- Oalloc: allocate chunks of memory that can not be freed
individually, but can all be freed in
one call. I wrote this for use in a graphical renderer/animator,
which needs to allocate literally millions of small chunks of memory
(small = 8-20 bytes), and when the picture is finished being generated,
all
the memory associated with its generation can be thrown away in one
step, in preparation for the next image. Oalloc means "Once Alloc".
Floating Point Algorithms
I wrote these for my thesis
research.
- sdriv2: a C language interface to the very good FORTRAN ODE
integrator sdriv2,
from the book Numerical Methods and Software by
Kahaner,
Molar and Nash, Prentice Hall 1988. I also have ddriv2, the
double-precision version.
- lsode: a C interface to another very good FORTRAN ODE
integrator called LSODE. Also here for more details.
- leapfrog: an sdriv2 and LSODE-like interface to a simple
leapfrog integrator.
- matvec: routines for matrix and vector manipulations.
- stats: routines for simple hand-calculator-like statistics,
like mean, variance, geometric mean, skewness. I used these in
my network research.
If you have read this far and are interested in libwayne, then
please contact me via e-mail.
libwayne is availble for free. Originally, I wasn't going to
release this code because I thought it wasn't "ready" for release.
But, I've been using it and modifying it for years, and some of my
friends have also used bits of it. I
believe that David Neto
is using my heap routines in his Travelling Saleman Problem research.
In any case, after reading
The Cathedral
and the Bazaar, I've decided to release it. If you'd just like to get
it quickly, you can download a tar'd, gzip'd archive of the entire tree
here. It's about 1.5MB. (Most of that is
actually high-quality FORTRAN numerical stuff that I didn't write,
but for which I've written C front-ends to.) If you find any bugs,
or make any improvements, please
contact me via e-mail.
libwayne is dual-licensed: the default is the Library GNU General Public
License (LGPL), but you can contact me if you are interested in arranging another
mutually agreeable licensing scheme.
If you came to this page looking for my quad precision routines,
don't bother: I found a
much
better implementation.
Access count (updated once a day) since 1 Jan 1997: 79206