Product Documentation

Full-Text Search

Previous Topic

Next Topic

Full-Text Query Syntax

In This Chapter

FTS: Quotes, Operators, Examples

FTS: Query Syntax

Easier Full-Text Search (FTS) MATCH Operator Syntax

Previous Topic

Next Topic

FTS: Quotes, Operators, Examples

Optional Quotation Marks

The use of single quotation marks to delimit words in an FTS query string is optional. Words in a query string may be delimited in either of two ways:

  • They may be surrounded by single quotation marks.

    or

  • They may be delimited by a space character (0x32).

A phrase must be surrounded by single quotation marks.

Operators

The following terms are treated as operators when they are not surrounded by single quotes:

and
near
near/x (where x is a numeric value, as explained later under NEAR Queries)
or (treated as an operator but returns an error because we do not currently support this operator)

Examples

The following are examples of valid queries:

To search for documents containing the three words "can", "be", and "found":

"can be found"

To search for documents containing the word "can" near the word "found":

"can NEAR found"

Previous Topic

Next Topic

FTS: Query Syntax

Very First Token Search

The caret (^) character indicates to search from the beginning of the column. Only one caret is allowed in an FTS query.

SQL:

All words from the “document” column from the “docs” table for which "linux" is the first token:

SELECT * FROM docs WHERE document MATCH '^linux';

c-treeDB (pseudo code):

ftSearchOn("document:^linux");

Phrase Queries

A phrase query retrieves all documents that contain a nominated set of terms or term prefixes in a specified order with no intervening tokens. Phrase queries are specified by enclosing a space-separated sequence of terms or term prefixes in double quotes ("). For example:

The following example is a query for all documents that contain the phrase "linux applications":

SELECT * FROM docs WHERE document MATCH '"linux applications"';

c-treeDB (pseudo code):

ftSearchOn("document:'linux applications'");

Token NOT present Queries

It is possible to exclude from filter results, rows that contain a keyword by using a “minus” before the keyword.

For example: find all documents which contain the keyword “linux” but exclude those that also include the keyword “applications”

SELECT * FROM docs WHERE document MATCH '"linux” -“applications"';

Multiple keywords for exclusion can also be used. For example, find rows where the “document” column contains the “linux” keyword but not “applications” or “database”

SELECT * FROM docs WHERE document MATCH '"linux” -“applications" -”database”';

NEAR Queries

A NEAR query returns documents that contain two or more nominated tokens within a specified proximity of each other. NEAR is a binary operator: it accepts two and only two arguments in the form: A NEAR B.

A NEAR query is specified by putting the keyword NEAR between two phrase, term, or prefix queries. By default, NEAR searches for 10 or fewer intervening tokens between the two arguments. To specify a proximity other than the default, an operator of the form NEAR/<N> may be used, where <N> is the maximum number of intervening terms allowed.

SQL:

Use the following query to search for a document that contains the tokens "faircom" and "database" with not more than 6 intervening tokens. This matches the only document in table docs. Note that the order in which the tokens appear in the document does not have to be the same as the order in which they appear in the query.

SELECT * FROM docs WHERE document MATCH 'database NEAR/6 faircom';

c-treeDB (pseudo code):

ftSearchOn("document:database NEAR/6 faircom");

Multiple NEAR Operators

More than one NEAR operator may appear in a single query. In this case each pair of terms or phrases separated by a NEAR operator must appear within the specified proximity of each other in the document. Using the same table and data as in the block of examples above:

The following query selects documents that contain an instance of the term "FairCom" separated by two or fewer terms from an instance of the term "acid", which is separated by two or fewer terms from an instance of the term "relational".

SELECT * FROM docs WHERE document MATCH 'faircom NEAR/2 acid NEAR/2 relational';

This option will require one index lookup for every term in the phrase, saving the doclist and then an intersection of those search results based on the specified distance.

c-treeDB (pseudo code):

ftSearchOn("document:faircom NEAR/2 acid NEAR/2 relational");

Precedence

NEAR takes precedence over AND. For example:

A AND B NEAR C

instructs FTS to search for records that contain token A and that also contain token B in close proximity ("NEAR") to token C.

Full-Text Search Query on Hyphenated Words

The FairCom full-text search supports querying on hyphenated words. Hyphenated words are converted to phrases. For example, searching for the word "self-contained" searches for the phrase 'self contained'.

What is commonly considered a "word" may be ignored either because the tokenizer is written to ignore specific words (e.g., words with fewer than a specified number of characters) or because the sequence of characters is on the stop list of words specifically ignored. In the following example, querying the word "give-and-take" would search for the phrase 'give and take', however the word "and" may be ignored.

If the hyphenated word is in a phrase, it is converted to individual words in that phrase.

Example FTS Queries

The following are some examples of valid queries:

To search for documents containing the three words "can", "be", and "found":

'can be found'

To search for documents containing the word "can" near the word "found":

'can' NEAR 'found'

In the next examples, "near" is used as a search term rather than an operator, which means that it requires single quotes.

To search for the words "cans", "near", and "trash":

'cans' 'near' 'trash'

To search for "near" and "and" near "cans":

'cans' NEAR/2 'near' NEAR/2 'and' NEAR/2 'trash'

To search for usage of "NEAR/2" in a document:

'near/2'

Previous Topic

Next Topic

Easier Full-Text Search (FTS) MATCH Operator Syntax

To perform a phrase match, the phrase must be enclosed in single quotation marks per FTS query syntax defined at the c-treeDB level. In SQL, that mandates two single quotes since the entire query is surrounded by a single quote (because it is a literal). Therefore, quotes need to be escaped in the standard way (that is double them).

(This implies the entire query is a phrase match since the final query is made by 3 single quotes, the phrase and 3 single quotes.)

For convenience, the FairCom DB SQL syntax now allows wrapping phrases within the FTS query literal with double quotes (that do not need to be escaped).

Both the old and the new methods (double quote or double single quotes) of specifying phrase search can be used for SQL FTS queries.

TOCIndex