FairCom ISAM for C

Extended Key Segment Definition

This section describes how to define an extended key segment.

The FairCom DB implementation of extended key segments allows a single extended key segment definition to be used by more than one actual key segment. For example, an application may make one call to PutXtdKeySegmentDef() (discussed below) that applies to all of the extended segments used in the application. To support this sharing, some members of the definition can optionally have their values determined at run time for each key segment.

Specify an extended key segment definition using the ctKSEGDEF structure presented in ctport.h:

#define ctKSEGDLEN 32 /* length of desc string */

typedef struct keysegdef {
    LONG kseg_stat;             /* status (internal use) */
    LONG kseg_vrsn;             /* version info */
    LONG kseg_ssiz;             /* source size */
    LONG kseg_type;             /* segment type */
    LONG kseg_styp;             /* source type */
    LONG kseg_comp;             /* comparison options */
    LONG kseg_rsv1;             /* future use */
    LONG kseg_rsv2;             /* future use */
    TEXT kseg_desc[ctKSEGDLEN]; /* text specification, e.g., locale string */
} ctKSEGDEF;

The FairCom DB module ctport.h contains defines for all of the constants used to create an extended key segment definition; their names begin with ctKSEG. As extended key segments are currently implemented, the kseg_stat and kseg_vrsn members are filled in as needed by the extended key segment implementation itself. The kseg_ssiz member specifies the number of bytes of source data used to derive the actual key segment. Instead of a specific numeric value, kseg_ssiz may also be assigned either of these two symbolic values:

  • ctKSEG_SSIZ_COMPUTED

    The information about the underlying data field will be used to compute how much source data is available.

    For fields without length specifiers (such as CT_STRING or CT_UNICODE), an appropriate version of strlen() is used to determine how much data is available. This can be very inefficient if the field may hold very long strings, because the entire string is scanned even though only a small portion of the variable-length field is likely to contribute to the key segment. An alternative is to specify a fixed source size. If the variable-length data is shorter than this size, it is still handled correctly.

  • ctKSEG_SSIZ_PROVIDED

    The call to create the key segment will provide the particular length of source data available. This option is typically used when an explicit call is made to TransformXtdSegment() (discussed below).
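
The trade-off between a computed and a fixed source size can be illustrated with a small sketch. The helper below is hypothetical and is not part of the FairCom DB API; it only mimics the idea that, with a fixed source size, at most that many bytes of a null-terminated field are examined, while shorter data is used at its actual length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a FairCom DB function): returns how many bytes of
 * a null-terminated field would feed the key segment when the definition
 * uses a fixed source size. The scan stops after fixed_ssiz bytes, so a very
 * long string is never walked to its end. */
size_t bounded_source_len(const char *field, size_t fixed_ssiz)
{
    size_t n = 0;

    assert(field != NULL);
    while (n < fixed_ssiz && field[n] != '\0')
        n++;
    return n;
}
```

With a fixed size of 8, a 3-byte string contributes 3 bytes, while a string of several thousand bytes contributes 8 bytes and costs only 8 comparisons to measure. A plain strlen() over the same long string would scan every byte, which is the inefficiency noted for ctKSEG_SSIZ_COMPUTED.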

For an ICU Unicode definition, the remaining structure members are specified as follows:

  • kseg_type

    Set to ctKSEG_TYPE_UNICODE.

  • kseg_styp

    Specify the type of source data as follows:

    • ctKSEG_STYP_UTF8
    • ctKSEG_STYP_UTF16
    • ctKSEG_STYP_PROVIDED

    ctKSEG_STYP_PROVIDED means that the type of source data will either be specified in a direct call to TransformXtdSegment() or it will be determined at run-time during ISAM key value construction. (Key value construction consists of one or both of assembling the key value from its component segments and performing transformations to generate a binary sort key.)

    ctKSEG_STYP_PROVIDED can be used during ISAM key value construction (for example, with AddRecord()) provided:

    • A schema (DODA) exists for the associated data file.
    • The key segment mode is either VARSEG or SCHSEG.

    In this case, if the data type is one of the conventional FairCom DB string types (e.g., CT_STRING), the source data type is UTF-8; if a Unicode string type is found (that is, CT_UNICODE), then the source data type is UTF-16. However, if the underlying data type does not fall into either of these categories, the data is treated as UTF-16, and used as is.

    Otherwise (that is, if there is no DODA or the segment mode is neither VARSEG nor SCHSEG), error SSEG_ERR (704) is generated at the time of key value construction.

  • kseg_desc

    Contains the ICU locale as an ordinary, null-terminated ASCII string. The format specified by ICU is “xx”, “xx_YY”, or “xx_YY_Variant”, where “xx” is the language as specified by ISO 639 (for example, “fr” for French); “YY” is the country as specified by ISO 3166 (for example, “CA” in “fr_CA” for French as used in Canada); and the “Variant” portion represents system-dependent options.

    Note: When ICU uses a locale to access collation rules, it attempts to get rules for the closest match to the locale specified in kseg_desc. By default, there is no restriction on how close that match must be to be acceptable. You can restrict the use of alternative locales by including either ctKSEG_COMPU_FALLBACK_NOTOK or ctKSEG_COMPU_SYSDEFAULT_NOTOK in the kseg_comp bitmap, discussed below. After a successful call to PutXtdKeySegmentDef(), the GetXtdKeySegmentDef() function can be used to determine the actual ICU locale used during collation.

  • kseg_comp

    This member of the structure permits the full range of ICU collation options to be specified through a bitmap. The details of these options are beyond the scope of this documentation. However, the symbolic constants used to form the bitmap are presented in ICU Collation Option Overview.
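
As a rough illustration of how the members described above fit together, the following sketch fills in a definition for an ICU Unicode segment. It is written to compile on its own, so it relies on stand-ins that are assumptions rather than FairCom DB names: demoKSEGDEF mirrors the layout of ctKSEGDEF, the DEMO_ constants carry placeholder values (the real values are the ctKSEG defines in ctport.h), and demo_init_unicode_segdef() is a hypothetical helper. Zeroing kseg_stat, kseg_vrsn, and the reserved members before use is also an assumption, not a documented requirement.

```c
#include <assert.h>
#include <string.h>

#define DEMO_KSEGDLEN 32 /* mirrors ctKSEGDLEN */

/* Stand-in for ctKSEGDEF so the sketch is self-contained. A real
 * application would include ctport.h and use ctKSEGDEF directly. */
typedef struct {
    long kseg_stat;                /* status (internal use) */
    long kseg_vrsn;                /* version info */
    long kseg_ssiz;                /* source size */
    long kseg_type;                /* segment type */
    long kseg_styp;                /* source type */
    long kseg_comp;                /* comparison options */
    long kseg_rsv1;                /* future use */
    long kseg_rsv2;                /* future use */
    char kseg_desc[DEMO_KSEGDLEN]; /* locale string */
} demoKSEGDEF;

/* Placeholder values only. The real constants are ctKSEG_SSIZ_COMPUTED,
 * ctKSEG_TYPE_UNICODE, and ctKSEG_STYP_PROVIDED from ctport.h. */
enum {
    DEMO_SSIZ_COMPUTED = -1,
    DEMO_TYPE_UNICODE  = 1,
    DEMO_STYP_PROVIDED = 3
};

/* Hypothetical helper: fills def for an ICU Unicode segment whose source
 * size and source type are both resolved at run time. Returns 0 on success,
 * or -1 if the locale does not fit in kseg_desc with its terminator. */
int demo_init_unicode_segdef(demoKSEGDEF *def, const char *locale, long comp)
{
    size_t len;

    assert(def != NULL && locale != NULL);
    len = strlen(locale);
    if (len >= DEMO_KSEGDLEN)
        return -1;

    memset(def, 0, sizeof *def); /* assumption: unused members start at 0 */
    def->kseg_ssiz = DEMO_SSIZ_COMPUTED;
    def->kseg_type = DEMO_TYPE_UNICODE;
    def->kseg_styp = DEMO_STYP_PROVIDED;
    def->kseg_comp = comp;       /* bitmap of collation options */
    memcpy(def->kseg_desc, locale, len + 1);
    return 0;
}
```

Calling demo_init_unicode_segdef(&def, "fr_CA", 0) would prepare a definition for French as used in Canada with no collation option bits set. In a real program, the equivalent ctKSEGDEF would then be passed to PutXtdKeySegmentDef(), and a single definition of this kind could serve several key segments because its source size and source type are left to be determined per segment.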
