Product Documentation

FairCom ISAM for C

Storing UTF-8 Data

A UTF-8 encoded string is composed of ordinary ASCII characters (with code values between 0 and 127) and multi-byte characters (which have the highest-order bit set in each byte). Because it is just a sequence of bytes, it can be stored normally in a FairCom DB record when a DODA is not present. As with any other data type, it is up to the application to interpret the record.
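The byte-level distinction described above can be sketched in a few lines of C. This is a minimal illustration, not FairCom code: the helper names utf8_is_ascii_byte and utf8_codepoint_count are hypothetical, and the counting routine assumes the input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if the byte is plain ASCII,
   that is, its high-order bit is clear (value 0 through 127). */
int utf8_is_ascii_byte(unsigned char b)
{
    return (b & 0x80) == 0;
}

/* Hypothetical helper: counts the characters (code points) in a
   null-terminated UTF-8 string. Continuation bytes have the bit
   pattern 10xxxxxx, so every byte that does not match that pattern
   starts a new character. Assumes well-formed input. */
size_t utf8_codepoint_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

The point for storage is that the byte length and the character count can differ, so a field should be sized for the byte length of the encoded string rather than for its character count.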

With a DODA present, store UTF-8 encoded strings in any FairCom DB standard string type, such as CT_STRING, CT_4STRING, etc. Since FairCom DB only interprets the contents of a field when the field is part of a key value, storing a UTF-8 string in an “ordinary” FairCom DB string-type field works, provided one of the following holds:

  • Indexing is not required.
  • There is a mechanism to permit FairCom DB’s key assembly routine to properly interpret the string field. FairCom DB’s extended key segment support addresses this case.

FairCom DB provides conversion routines between UTF-8 and UTF-16. The input strings are assumed to be null-terminated. All output buffer sizes are specified in bytes. The conversion routines return NO_ERROR (0) on success, VBSZ_ERR (153) if the output buffer is too small, or BMOD_ERR (446) if there is a problem with the input string.

  • ctu8TOu16() converts an ASCII or UTF-8 encoded string to a UTF-16 Unicode string.

NINT ctu8TOu16(pTEXT u8str,pWCHAR u16str,VRLEN u16byt)

  • ctu16TOu8() converts a UTF-16 Unicode string to a UTF-8 encoded string.

NINT ctu16TOu8(pWCHAR u16str,pTEXT u8str,VRLEN u8byt)
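Because the output buffer size is passed in bytes, it helps to compute a worst-case size before calling either routine. The following is a minimal sketch that does not call the FairCom library itself: the helper names u16_bytes_for_u8, u8_bytes_for_u16, and conv_status_text are hypothetical, uint16_t stands in for the 16-bit character type, and the numeric return codes are taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: worst-case UTF-16 output size in bytes for a
   null-terminated UTF-8 input. Each UTF-8 byte yields at most one
   16-bit code unit; one extra unit is reserved for the terminator. */
size_t u16_bytes_for_u8(const char *u8str)
{
    assert(u8str != NULL);
    return (strlen(u8str) + 1) * 2;
}

/* Hypothetical helper: worst-case UTF-8 output size in bytes for a
   null-terminated UTF-16 input. Each 16-bit code unit yields at most
   three UTF-8 bytes; one extra byte is reserved for the terminator. */
size_t u8_bytes_for_u16(const uint16_t *u16str)
{
    size_t units = 0;

    assert(u16str != NULL);
    while (u16str[units] != 0)
        ++units;
    return units * 3 + 1;
}

/* Hypothetical helper: maps the documented return codes to text.
   The numeric values are those listed for NO_ERROR, VBSZ_ERR, and
   BMOD_ERR in the description above. */
const char *conv_status_text(int rc)
{
    switch (rc) {
    case 0:   return "success";
    case 153: return "output buffer too small";
    case 446: return "problem with input string";
    default:  return "unexpected return code";
    }
}
```

In an application linked against FairCom DB, the value from u16_bytes_for_u8() would be used to allocate the output buffer and then passed as the u16byt argument of ctu8TOu16(); likewise the value from u8_bytes_for_u16() would be passed as the u8byt argument of ctu16TOu8(). With a buffer of that size, a VBSZ_ERR return should not occur for well-formed input.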

Contact FairCom if you require routines to handle UTF-32 conversion.