Multifast 2.0.0's Library
Build and Link
The Multifast library can be built easily on a typical Linux distribution that has the standard build tools installed. To build the library, first download the multifast-v2.0.0.tar.gz file from the Download Page, then run the following commands:
$ tar -zxvf multifast-v2.0.0.tar.gz
$ cd multifast-v2.0.0/ahocorasick
$ make
The static library file, libahocorasick.a, is created in the current directory. To use it in another project, pass gcc the library options -lahocorasick and -L<path>. You must also add the include path for the header file, ahocorasick.h, with the -I option. See the Makefile in the examples/example0 folder for an example. You can build and run that example with the following commands:
$ cd ../examples/example0/
$ make
$ ./example0
API
The API defines the following data types:
- AC_TRIE_t: The main structure, which stores all data related to the trie (automaton) and the given patterns.
- AC_STATUS_t: The status returned by the ac_trie_add() function.
- AC_PATTERN_t: The pattern data structure that must be fed to the trie by the ac_trie_add() function.
- AC_TEXT_t: The input text type.
- AC_MATCH_t: The match data structure, which holds the data of the matched patterns and is used to report a match occurrence.
- AC_MATCH_CALBACK_f: A function of type int (*)(AC_MATCH_t *, void *) that the user must define; the library's search routine calls it whenever a match is found.
- MF_REPLACE_MODE_t: The available options for the replacement mode. See example2.
- MF_REPLACE_CALBACK_f: A function of type void (*)(AC_TEXT_t *, void *) that the user must define; the library's replace routine calls it with the buffered output text whenever the replacement buffer is full or the multifast_rep_flush() function is called. Check the typedef in actypes.h for the exact signature.
The complete definitions of these data types can be found in the actypes.h and ahocorasick.h source files.
The library defines the following functions:
AC_TRIE_t *ac_trie_create (void);
AC_STATUS_t ac_trie_add (AC_TRIE_t *thiz, AC_PATTERN_t *patt, int copy);
void ac_trie_finalize (AC_TRIE_t *thiz);
void ac_trie_release (AC_TRIE_t *thiz);
void ac_trie_display (AC_TRIE_t *thiz);
int ac_trie_search (AC_TRIE_t *thiz, AC_TEXT_t *text, int keep, AC_MATCH_CALBACK_f callback, void *param);
void ac_trie_settext (AC_TRIE_t *thiz, AC_TEXT_t *text, int keep);
AC_MATCH_t ac_trie_findnext (AC_TRIE_t *thiz);
int multifast_replace (AC_TRIE_t *thiz, AC_TEXT_t *text, MF_REPLACE_MODE_t mode, MF_REPLACE_CALBACK_f callback, void *param);
void multifast_rep_flush (AC_TRIE_t *thiz, int keep);
A typical usage of the library involves the following steps:
- Initialize the trie: call ac_trie_create(), which builds and initializes the trie.
- Add patterns to the trie: call ac_trie_add() once for each pattern.
- Finalize the trie: call ac_trie_finalize() after adding the last pattern.
- Search and/or replace: described below.
- Release the trie: call ac_trie_release().
Search
The library provides two different interfaces for search:
- The _settext/_findnext interface: This method is implemented by the ac_trie_settext() and ac_trie_findnext() function pair. See example0 for more details.
- The _search interface: This method is implemented by ac_trie_search() and a callback function of type AC_MATCH_CALBACK_f. See example1 for more details.
Note: In this version the _search interface is a bit faster than the _settext/_findnext interface.
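The main difference between the two interfaces is who drives the loop. The sketch below is not Multifast code: it uses a naive single-pattern scanner and hypothetical demo_* names, only to contrast a pull-style pair (the caller keeps asking for the next match, as with _settext/_findnext) with a push-style call (the scanner invokes a user callback that receives a void * parameter, as with _search). The rule that a non-zero callback return stops the scan is an assumption of this sketch, not something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in type; none of these names belong to Multifast. */
typedef struct {
    const char *text;     /* text being scanned */
    const char *pattern;  /* one non-empty pattern (the library handles many) */
    size_t      next;     /* offset where the next scan starts */
} demo_scanner_t;

/* Callback shape loosely modeled on int (*)(match, void *param). */
typedef int (*demo_match_cb)(size_t end_pos, void *param);

/* Pull style, step 1: attach a text to the scanner. */
void demo_settext(demo_scanner_t *s, const char *text)
{
    assert(s != NULL && text != NULL);
    s->text = text;
    s->next = 0;
}

/* Pull style, step 2: return the end offset of the next match, or 0 if
 * there is none (0 is unambiguous because the pattern is non-empty). */
size_t demo_findnext(demo_scanner_t *s)
{
    const char *hit;

    assert(s->pattern != NULL && s->pattern[0] != '\0');
    hit = strstr(s->text + s->next, s->pattern);
    if (hit == NULL)
        return 0;
    s->next = (size_t)(hit - s->text) + 1;   /* allows overlapping matches */
    return (size_t)(hit - s->text) + strlen(s->pattern);
}

/* Push style: scan the whole text and call cb once per match.
 * Assumption: a non-zero return from cb stops the scan early.
 * Returns the number of times cb was called. */
int demo_search(demo_scanner_t *s, const char *text,
                demo_match_cb cb, void *param)
{
    int calls = 0;
    size_t end;

    demo_settext(s, text);
    while ((end = demo_findnext(s)) != 0) {
        calls++;
        if (cb(end, param) != 0)
            break;
    }
    return calls;
}

/* Example callback: counts matches through the user parameter. */
int demo_count_cb(size_t end_pos, void *param)
{
    (void)end_pos;
    (*(int *)param)++;
    return 0;   /* keep scanning */
}
```

With the pattern "ab" and the text "abcab", successive demo_findnext() calls return 2, 5, and then 0, while demo_search() with demo_count_cb invokes the callback twice. For the real calling conventions, see example0 and example1.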
Replace
To substitute patterns with their replacements, use the multifast_replace() function. It takes a user-defined callback of type MF_REPLACE_CALBACK_f, which the library calls whenever the trie's replacement buffer is full or multifast_rep_flush() is called. See example2 for more details.
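The "callback when the buffer is full or flushed" behavior can be pictured with a small stand-alone sketch. It is not the library's implementation: the demo_* names, the 8-byte buffer size, and the callback shape are all assumptions chosen for illustration. Output is appended to a fixed-size buffer, and the user callback receives the buffered bytes each time the buffer fills up or an explicit flush is requested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 8   /* deliberately tiny so the buffer fills quickly */

/* Hypothetical callback: receives a chunk of output plus a user pointer. */
typedef void (*demo_out_cb)(const char *chunk, size_t len, void *param);

typedef struct {
    char        buf[DEMO_BUF_SIZE];
    size_t      used;    /* bytes currently buffered */
    demo_out_cb cb;      /* user callback that consumes flushed chunks */
    void       *param;   /* passed back to cb unchanged */
} demo_outbuf_t;

/* Hand any buffered bytes to the callback and empty the buffer. */
void demo_flush(demo_outbuf_t *ob)
{
    if (ob->used > 0) {
        ob->cb(ob->buf, ob->used, ob->param);
        ob->used = 0;
    }
}

/* Append bytes; flush automatically each time the buffer becomes full. */
void demo_append(demo_outbuf_t *ob, const char *data, size_t len)
{
    assert(ob != NULL && ob->cb != NULL);
    while (len > 0) {
        size_t room = DEMO_BUF_SIZE - ob->used;
        size_t n = len < room ? len : room;

        memcpy(ob->buf + ob->used, data, n);
        ob->used += n;
        data += n;
        len -= n;
        if (ob->used == DEMO_BUF_SIZE)
            demo_flush(ob);
    }
}

/* Example sink and callback: concatenate the chunks into one string. */
typedef struct {
    char   text[64];
    size_t len;
    int    calls;   /* how many times the callback ran */
} demo_sink_t;

void demo_collect(const char *chunk, size_t len, void *param)
{
    demo_sink_t *sink = param;

    assert(sink->len + len < sizeof sink->text);
    memcpy(sink->text + sink->len, chunk, len);
    sink->len += len;
    sink->text[sink->len] = '\0';
    sink->calls++;
}
```

Appending "hello " and then "world" triggers one automatic callback carrying "hello wo"; the remaining "rld" reaches the callback only after an explicit demo_flush(). This is why a final flush call matters at the end of the input. For the real calling sequence, see example2.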
Examples
The package comes with a handful of illustrative examples that can be used as templates for your own code. Here is the list of these examples and what each one covers:
- example0: Describes how to use the _settext/_findnext method.
- example1: Describes how to use the _search method.
- example2: Describes how to use the replacement functionality of the library.
- example3: Describes how to write a C++ wrapper for the library.
- example4: Describes some more advanced uses of the library.
We recommend studying the examples before using the library.