SourceFiles.org - Use the Source, Luke
Home | Register | News | Forums | Guide | MyLinks | Bookmark

Related Sites

Latest News
  General News
  Reviews
  Press Releases
  Software
  Hardware
  Security
  Tutorials
  Off Topic


Back to files

+------------------------------------------------+ | Suffix tree implementation library version 1.2 | +------------------------------------------------+

Fabien Menemenlis 2001-07-19
nihilist@dead-inside.org

This archive contains an implementation of a fast indexing library based on suffix trees. It features the basic functions you can expect from this kind of library: storing, retrieving, deleting and dumping/loading the database. All work is done on disk except for a custom cache used to save a few read/writes on nodes and thus requires fast disks but little memory. A performance of about 500 inserts per second has been achieved on an array of 3 * 18 GB 10000 rpm SCSI disks. Indexing time is also linear and lookup is extremely fast (requires only n + 2 disk access at most when looking up a word of n letters).

It has been developped on FreeBSD/gcc but should be fairly portable.

The source code "testsfx.c" show an example of how to use the library both for inserting, retrieving, and deleting data. There aren't many functions and comments should be enough to give you an idea of how to use the library. (read the header of the source file)

You should edit sfxdisk.h to suit your needs: you can change the alphabet size and the offset type. It should be OK to use "long long" 64 bits ints instead of long, in fact I tested it succesfully but haven't gone to the point of filling more than 2 GB of data (needless to say you need a 64 bits filesystem).

Two "tools" come with the library (new with version 1.2): dumpsfx and loadsfx. dumpsfx is used to dump the database: dumpsfx <suffix tree file> [-s separator] if you want to output the result as readable text or dumpsfx <file.sfx> -h
to output it for reloading with loadsfx. dumpsfx outputs on stdout and loadsfx reads from stdin. loadsfx <suffix tree file to create> < dumped_file

Please note that deleting does "free" any space for future inserts, dumping and reloading the database is necessary to retrieve freed space. This could change in the future.
There are still a few french variables names, sorry about that, I'll translate them for the next release.

There's still room for improvement... like partial key lookup. Contribute if you can!
Any feedback will be greatly appreciated, mail me at nihilist@dead-inside.org

The library is provided under the GPL license. See COPYING file for more info.


Other Sites

Discussion Groups
  Beginners
  Distributions
  Networking / Security
  Software
  PDAs

About | FAQ | Privacy | Awards | Contact
Comments to the webmaster are welcome.
Copyright 2006 Sourcefiles.org All rights reserved.