Multifast 2.0.0

A Total Solution for
Mass String Search & Substitution

Multifast 2.0.0's Library

Build and Link

The Multifast library can be built easily on a typical Linux distribution that has the standard build tools installed. To build the library, first download the multifast-v2.0.0.tar.gz file from the Download Page. Then run the following on the command line:

$ tar -zxvf multifast-v2.0.0.tar.gz
$ cd multifast-v2.0.0/ahocorasick
$ make

The static library file, libahocorasick.a, will be created in the current directory. You can link it into other projects with gcc's library options: -lahocorasick and -L<path>. You must also declare the include path for the header file, ahocorasick.h, with the -I option. See the Makefile in the examples/example0 folder for an example. You can build and run the example with the following commands:

$ cd ../examples/example0/
$ make
$ ./example0

API

The API defines the following data types:

AC_TRIE_t
Defines the main structure that stores all data related to the trie (automaton) and the given patterns.
AC_STATUS_t
Defines the status returned by the ac_trie_add() function.
AC_PATTERN_t
Defines the pattern data structure that must be fed to the trie by the ac_trie_add() function.
AC_TEXT_t
Defines the input text type.
AC_MATCH_t
Defines the match data structure which contains matched patterns' data and is used to report a match occurrence.
AC_MATCH_CALBACK_f
A function of type int (*)(AC_MATCH_t *, void *) which must be defined by the user. The library's search internals call it whenever a match is found.
MF_REPLACE_MODE_t
Defines the available options for the replacement mode. See example2.
MF_REPLACE_CALBACK_f
A function of type void (*)(AC_TEXT_t *, void *) which must be defined by the user. The trie's replace internals call it whenever the replacement buffer is full or the multifast_rep_flush() function is called.

The full definitions of the data types can be found in the actypes.h and ahocorasick.h source files.

The library defines the following functions:

AC_TRIE_t *ac_trie_create (void);
AC_STATUS_t ac_trie_add (AC_TRIE_t *thiz, AC_PATTERN_t *patt, int copy);
void ac_trie_finalize (AC_TRIE_t *thiz);
void ac_trie_release (AC_TRIE_t *thiz);
void ac_trie_display (AC_TRIE_t *thiz);

int  ac_trie_search (AC_TRIE_t *thiz, AC_TEXT_t *text, int keep, 
        AC_MATCH_CALBACK_f callback, void *param);

void ac_trie_settext (AC_TRIE_t *thiz, AC_TEXT_t *text, int keep);
AC_MATCH_t ac_trie_findnext (AC_TRIE_t *thiz);

int  multifast_replace (AC_TRIE_t *thiz, AC_TEXT_t *text, 
        MF_REPLACE_MODE_t mode, MF_REPLACE_CALBACK_f callback, void *param);
void multifast_rep_flush (AC_TRIE_t *thiz, int keep);

A typical usage of the library involves the following steps:

  1. Initialize the trie: call the ac_trie_create() function, which builds and initializes the trie.
  2. Add patterns to the trie: call the ac_trie_add() function once for each pattern.
  3. Finalize the trie: call the ac_trie_finalize() function after adding the last pattern.
  4. Search and/or replace: described below.
  5. Release the trie: call the ac_trie_release() function.

Search

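The library provides two different interfaces for search: the callback-based ac_trie_search() function, and the ac_trie_settext() / ac_trie_findnext() pair, which sets the input text once and then retrieves matches one call at a time.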

Note: In this version the _search interface is a bit faster than the _settext/_findnext interface.

Replace

To substitute patterns with their replacements, use the multifast_replace() function. It takes a function of type MF_REPLACE_CALBACK_f, which must be defined by the user. The library's internals call this function whenever the trie's replacement buffer is full or multifast_rep_flush() is called. See example2 for more details.

Examples

The package comes with a handful of illustrative examples, such as examples/example0 and example2, that can be used as templates for your own code.

It is recommended to study the examples before using the library.