FairCom DB API for C

Creating Key Segments based on Unicode Fields

Unicode key segments pose a challenge for two reasons:

Unlike all previously implemented key segments, the number of bytes stored in the key and the number of bytes of source data used to construct the key are not the same.
Deriving the binary sort key (segment) stored in the index from the source data is not a simple transformation.
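
Both points can be illustrated with a minimal sketch that uses only the standard C library, not the FairCom DB API or ICU. The standard function strxfrm() turns a string into a locale-dependent sort key whose length is reported by the library rather than taken from the source string, and comparing two such keys byte by byte gives the same ordering as a collation-aware comparison of the originals. The helper names make_sort_key(), compare_by_key(), and sign() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a sort key for src under the current
   LC_COLLATE locale. Returns a malloc'd, NUL-terminated key that the
   caller must free, or NULL if allocation fails. *key_len receives the
   key length in bytes, excluding the terminator. */
static char *make_sort_key(const char *src, size_t *key_len)
{
    assert(src != NULL && key_len != NULL);

    /* With a size of 0, strxfrm() only reports the length the key needs. */
    size_t need = strxfrm(NULL, src, 0);

    char *key = malloc(need + 1);
    if (key == NULL)
        return NULL;

    strxfrm(key, src, need + 1);
    *key_len = need;
    return key;
}

/* Reduce any integer to -1, 0, or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: order two strings by comparing their sort keys.
   Returns -1, 0, or 1; returns 0 if a key could not be allocated. */
static int compare_by_key(const char *a, const char *b)
{
    size_t len_a, len_b;
    char *key_a = make_sort_key(a, &len_a);
    char *key_b = make_sort_key(b, &len_b);
    int result = 0;

    if (key_a != NULL && key_b != NULL)
        result = sign(strcmp(key_a, key_b));

    free(key_a);
    free(key_b);
    return result;
}
```

The key length comes from strxfrm() and need not equal strlen() of the source. In the default "C" locale the two usually coincide, while other locales typically produce longer keys. ICU offers an analogous facility through ucol_getSortKey(), which operates on UTF-16 text and a collator object.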

To address both of these challenges, c-tree Plus incorporated "extended key segments." The concept of an extended key segment can be applied to virtually any non-standard key segment. Our first implementation is for Unicode keys.

Because of the complexity of the Unicode collation algorithm, and because of the breadth of language and country support envisaged by Unicode, FairCom has chosen to implement Unicode key segments using the International Components for Unicode (ICU) open-source development project. The ICU implementation of Unicode support is available on a wide variety of platforms, but not every platform. The ICU web site can be accessed at:

IBM International Components for Unicode (ICU)