c-treeACE Full-Text Search

Fast, efficient access to textual data.

Today's applications process vast amounts of textual data. Full-Text Search (FTS) provides fast, efficient access to character-type data elements, so c-tree applications with large volumes of text can now add high-performance text search.

Just as you define a traditional b-tree index over a c-tree data file, you can now define a Full-Text Index (FTI) by specifying which character-type fields to include in it. The FTI is maintained on disk as an additional set of files.

FTS support is enabled on a file-by-file basis, the same as a typical c-tree index. Once an FTI is defined for a file, it is maintained in real time along with the file's b-tree c-tree indices.

Accessing data records through a full-text search index is simple. Using the new API search functions, you provide the word or phrase (text) you are searching for. All records whose FTI-indexed fields contain this text are returned. How the application uses this result is up to the developer.


[Figure: Inverted index]

The first step in using Full-Text Search is creating a Full-Text Index (FTI). Functions are available through several different APIs for creating the index, adding a field to the index, setting the default index, and managing the FTI handle. Other functions set options and control additional parameters.

The FTS indices are maintained as records are added to the table. Internally, a tokenizer divides the text into "tokens" (which are roughly equivalent to a list of categorized words).
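The tokenizing step can be illustrated with a minimal sketch. The tokenizer that c-tree uses internally is not described here, so the rules below (split on non-alphanumeric characters, fold to lowercase, no categorization) and the names tokenize and MAX_TOKEN_LEN are assumptions made for illustration only.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKEN_LEN 32  /* hypothetical limit chosen for this sketch */

/* Splits text into lowercase alphanumeric tokens.
 * Writes at most max_tokens tokens and returns how many were written.
 * Tokens longer than MAX_TOKEN_LEN - 1 characters are truncated. */
static int tokenize(const char *text, char tokens[][MAX_TOKEN_LEN], int max_tokens)
{
    int count = 0;
    size_t len = 0;

    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c)) {
            /* Still inside a token: store the character if there is room. */
            if (count < max_tokens && len < MAX_TOKEN_LEN - 1)
                tokens[count][len++] = (char)tolower(c);
        } else {
            if (len > 0) {            /* a token just ended */
                tokens[count][len] = '\0';
                count++;
                len = 0;
            }
            if (c == '\0')            /* end of input */
                break;
        }
    }
    return count;
}
```

Called on the text "FairCom database, v12!" with room for eight tokens, this sketch yields the three tokens faircom, database, and v12.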

When a full-text search is performed, the function is passed one or more words. Each word is compared to the tokens in the FTS Index. The function returns the records that contain those words in their indexed fields.
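The lookup described above is the classic inverted-index technique: each token maps to the list of records that contain it. The following is a simplified in-memory sketch of that idea, not the c-tree implementation. The types and functions (InvertedIndex, index_record, search_word) and the fixed size limits are hypothetical, and the sketch handles a single search word only.

```c
#include <ctype.h>
#include <string.h>

#define MAX_TERMS    64   /* hypothetical limits for this sketch */
#define MAX_TERM_LEN 32
#define MAX_POSTINGS 16

typedef struct {
    char term[MAX_TERM_LEN];
    int  records[MAX_POSTINGS];  /* IDs of records containing the term */
    int  count;
} Posting;

typedef struct {
    Posting terms[MAX_TERMS];
    int     count;
} InvertedIndex;                 /* must be zero-initialized before use */

/* Returns the slot of term in the index, or -1 if it is not present. */
static int find_term(const InvertedIndex *idx, const char *term)
{
    for (int i = 0; i < idx->count; i++)
        if (strcmp(idx->terms[i].term, term) == 0)
            return i;
    return -1;
}

/* Records that record_id contains term (term is shorter than MAX_TERM_LEN). */
static void add_posting(InvertedIndex *idx, const char *term, int record_id)
{
    int t = find_term(idx, term);
    if (t < 0) {
        if (idx->count == MAX_TERMS)
            return;                          /* index full: term is dropped */
        t = idx->count++;
        strcpy(idx->terms[t].term, term);
        idx->terms[t].count = 0;
    }
    Posting *p = &idx->terms[t];
    /* Skip repeats of a word within the record currently being indexed. */
    if (p->count > 0 && p->records[p->count - 1] == record_id)
        return;
    if (p->count < MAX_POSTINGS)
        p->records[p->count++] = record_id;
}

/* Tokenizes text (lowercase alphanumeric runs) and indexes each token. */
static void index_record(InvertedIndex *idx, int record_id, const char *text)
{
    char term[MAX_TERM_LEN];
    size_t len = 0;

    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c)) {
            if (len < MAX_TERM_LEN - 1)
                term[len++] = (char)tolower(c);
        } else {
            if (len > 0) {
                term[len] = '\0';
                add_posting(idx, term, record_id);
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
}

/* Copies up to max_out matching record IDs into out; returns the count. */
static int search_word(const InvertedIndex *idx, const char *word,
                       int *out, int max_out)
{
    char term[MAX_TERM_LEN];
    size_t len = 0;

    for (; word[len] != '\0' && len < MAX_TERM_LEN - 1; len++)
        term[len] = (char)tolower((unsigned char)word[len]);
    term[len] = '\0';

    int t = find_term(idx, term);
    if (t < 0)
        return 0;

    int n = 0;
    for (int i = 0; i < idx->terms[t].count && n < max_out; i++)
        out[n++] = idx->terms[t].records[i];
    return n;
}
```

A caller would zero-initialize an InvertedIndex, call index_record once per record, and then call search_word with the word to find. A multi-word search could intersect the record lists returned for each word.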

Additional information can be returned, such as the proximity of multiple tokens. For example, the search terms "FairCom" and "database" are in closer proximity in the text "FairCom database" than in "FairCom announces new enhancements to its highly acclaimed database." Statistics about relevance can also be returned.
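One way to make the proximity comparison concrete is to count how many token positions separate the two terms. How c-tree actually measures proximity is not specified here, so the function below (min_token_distance, a hypothetical name) is only a sketch under that assumption. It expects the two terms in lowercase and returns -1 when either term is absent.

```c
#include <ctype.h>
#include <string.h>

#define MAX_TOKEN_LEN 32  /* hypothetical limit chosen for this sketch */

/* Returns the smallest distance, in token positions, between any
 * occurrence of term a and any occurrence of term b in text,
 * or -1 if either term does not occur. Terms must be lowercase. */
static int min_token_distance(const char *text, const char *a, const char *b)
{
    char tok[MAX_TOKEN_LEN];
    size_t len = 0;
    int pos = 0;                    /* position of the token being read */
    int last_a = -1, last_b = -1;   /* most recent positions of a and b */
    int best = -1;

    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c)) {
            if (len < MAX_TOKEN_LEN - 1)
                tok[len++] = (char)tolower(c);
        } else {
            if (len > 0) {          /* a token just ended */
                tok[len] = '\0';
                if (strcmp(tok, a) == 0) last_a = pos;
                if (strcmp(tok, b) == 0) last_b = pos;
                if (last_a >= 0 && last_b >= 0) {
                    int d = last_a > last_b ? last_a - last_b
                                            : last_b - last_a;
                    if (best < 0 || d < best)
                        best = d;
                }
                pos++;
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return best;
}
```

Under this measure, the terms "faircom" and "database" are 1 position apart in "FairCom database" and 8 positions apart in the longer sentence above, which matches the intuition that the first text is the closer match.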

Full-Text Search is presently available in the following APIs:

  1. c-treeDB C
  2. c-treeDB C++
  3. c-treeDB Java
  4. c-treeDB .NET

Full-Text Search is supported in client/server only.

Functions are available to create a full-text index on the specified fields of a single table, even an existing table. A Full-Text Index can be defined on any text field (as specified in the data file's schema).

As an example, to add full-text search support to a table at the c-treeDB C level, call ctdbAddFTI. This function returns a CTHANDLE that is passed to the other FTS functions, for example to add a field to the list of fields covered by the Full-Text Index.

A full-text index can be added to an existing table by calling ctdbAddFTI, setting the various properties, and then calling ctdbAlterTable.
