Family: srt
Authors: Stephen Uitti, Purdue University Computing Center
Mail: explode at ksb dot npcguild.org
Version: 2.1
Bugs: Well documented.

Introduction

Sometimes you just want to remember a list of strings, but you don't want to repeat any of them. This code makes that really easy.

Configuration

None.

Synopsis

#include <stdio.h>
#include "srtunq.h"

Description

Srtunq is used to extract unique items from a possibly long list, where items are likely to be replicated numerously. The list of unique items must be small enough to fit in memory, but the number of repetitions is possibly high.

The caller has control over the database through the use of a SRTTABLE variable. The subroutines provide for data entry and retrieval, memory allocation and deallocation.

Provides

typedef struct ... SRTTABLE;
Users will define one variable of this type to hold the root of each sorted list they wish to keep. Most routines in this library require the address of such a variable as their first argument.
typedef struct ... SRTENTRY;
Strings are kept in a structure of this type internally. The user should not depend on the internal details of this type.
void
srtinit(tbl)
SRTTABLE *tbl;
This subroutine must be called to initialize the database tag tbl before any data are entered or retrieved from that tree. It assumes that the tag has not been used to store a tree, and therefore does not attempt to free any such data.
char *
srtin(tbl, string, compare)
SRTTABLE *tbl;
char *string;
int (*compare)();
The existing data tree is searched for the string, if it is found then a pointer to that string is returned. Otherwise, space is allocated for the string and pointer structure via malloc(3). The string is copied to this new space which is linked into the tree and a pointer to the new string is returned. If space cannot be obtained, the operation is aborted and NULL is returned (the data structure remains consistent, but the string is not added). The strings are compared and sorted with the subroutine pointed to by compare. This subroutine takes two string pointers as arguments. It returns zero if the strings are the same, less than zero if the first string should precede the second, and greater than zero if the second string should precede the first. Use strcmp(3) if simple lexicographical ordering is desired. It is confusing at best if different compare functions are used when inserting strings into a given tree.
char *
srtmem(tbl, string, compare)
SRTTABLE *tbl;
char *string;
int (*compare)();
Return the database entry for string if it is already a member of the table, else NULL.
int
srtdel(tbl, string, compare)
SRTTABLE *tbl;
char *string;
int (*compare)();
The existing data tree is searched for the string, if it is found then it is deleted and a nonzero value is returned. Otherwise, 0 is returned.
void
srtgti(tbl);
SRTTABLE *tbl;
This subroutine initializes the database tag pointed to by tbl so that a tree traversal can be made via srtgets.
char *
srtgets(tbl);
SRTTABLE *tbl;
This subroutine extracts the next string from the data structure. The strings are returned in the order specified by the compare function when they were inserted with srtin. When the list is exhausted, NULL is returned.
int
srtapply(tbl, func)
SRTTABLE *tbl;
int (*func)(char *, void *);
This subroutine applies the func to each string in the tree (order determined by compare when they were inserted) until the func returns non-zero, or there are no more strings. If the func returned non-zero then that value is returned, otherwise 0 is returned. The (void *) will always we a NULL pointer.
int
srtapply2(tbl, func, pv)
SRTTABLE *tbl;
int (*func)(char *, void *);
void *pv;
This subroutine applies the func to each string in the tree (order determined by compare when they were inserted) until the func returns non-zero, or there are no more strings. If the func returned non-zero then that value is returned, otherwise 0 is returned. The (void *) is common to all calls the func, this is usually used to return a value by reference, keep some state, or may be used to get to a recursive instance of srtapply2.
void
srtfree(tbl)
SRTTABLE *tbl;
This subroutine deletes a database, and re-initializes the database tag. It assumes that the database tag was initialized at one time via srtinit (other routines will probably also have been called). The space formerly occupied by string data and pointer structures is deallocated via free(3).
void
srtdtree(tbl, ent)
SRTTABLE *tbl;
SRTENTRY *ent;
This subroutine recursively deletes a database subtree. The space formerly occupied by the string data and pointer structures is deallocated via free(3). This routine is most likely only of use internally.

EXAMPLE

#include <stdio.h>
main()
{
	extern int strcmp();
	SRTTABLE tree;
	char buf[80], *p;
	int i;

	/* init the tree */
	srtinit(&tree);

	/* add some strings */
	while (NULL != fgets(buf, 80, stdin))  /* want the \n terminator */
		if (NULL == (p = srtin(&tree, buf, strcmp)))
			printf("out of memory!\n");

	/* init tree for srtgets */
	srtgti(&tree);

	/* print out the strings with srtapply and printf -- can't have
	 * a legal printf % escape in the strings! */
	(void) srtapply(&tree, printf);

	/* use srtgets to print the strings out this time -- keep count */
	for (i = 0; NULL != (p = srtgets(&tree)); ++i)
		printf("string %2d is: %s\n", i, p);
	printf("there were %d strings\n", i);

	/* free the database */
	srtfree(&tree);
}

Diagnostics

There are no messages printed by these routines. Catchable errors are returned as NULL. Compiled in errors such as the use of strings that are not null terminated tend to result in core files.

See Also

malloc(3), free(3), qsort(3), strcmp(3)

To Do List

The structure names and typedefs seem confused. The typedefs make more sense.
$Id: srtunq.html,v 6.8 2012/03/21 16:15:05 ksb Exp $