FairCom RTG COBOL User Guide

UTF-8 Data Stored in FairCom RTG

UTF-8 (Unicode Transformation Format, 8-bit) is a character encoding standard used for electronic storage and transmission of character data, and it is the most common character encoding for this purpose. UTF-8 represents each character using one to four bytes. The first 128 UTF-8 characters match the first 128 ASCII characters byte for byte, so existing ASCII text is also valid UTF-8. UTF-8 can encode every Unicode code point, which covers all characters in common use today and leaves capacity for over a million distinct code points. It is a "variable-width" encoding: code points with lower values, which include the most commonly used characters, are represented with fewer bytes than higher code points.
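The variable-width behavior described above can be checked with a few lines of Python. This is only an illustrative sketch that uses the standard library; it does not involve FairCom RTG, and the helper name `utf8_width` is made up for this example.

```python
def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))

# One sample character from each UTF-8 width class (escapes keep the source ASCII-only).
samples = {
    "A": "Latin capital A (ASCII)",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}

for ch, description in samples.items():
    print(f"U+{ord(ch):04X} {description}: {utf8_width(ch)} byte(s)")

# Pure ASCII text produces identical bytes under both encodings.
ascii_text = "FairCom RTG"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
```

The four samples take one, two, three, and four bytes respectively, and `same_bytes` is true because ASCII text is already valid UTF-8.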

UTF-8 data can be stored in FairCom RTG tables, and standard RTG COBOL tables handle it without any changes. However, querying this data with SQL, the JSON DB API, or dbNotify with FairCom MQ requires a Unicode build of the RTG server, which performs the additional character interpretation of the bytes stored in the record. Applications that handle data containing extended ASCII character codes (byte values above 127), such as international characters, require a FairCom RTG Unicode build to query this data with SQL.

Contact your FairCom account manager should you require UTF-8 data storage for your RTG application.

Configuration

FairCom RTG XDD definition files can specify exactly which fields require UTF-8 interpretation. Add the strCharset attribute to the <field> tag of each such field, for example:

<field name="NAME_UTF" indexed="true" size="25" type="Alphanum" digits="25" scale="0" strCharset="UTF-8"/>

Once the RTG table is sqlized or otherwise added to the RTG Server “catalog” (dictionary), data retrieved through SQL queries, JSON DB, and dbNotify notifications is returned with proper UTF-8 encoding.
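Before sqlizing a table, it can help to confirm which fields actually carry the strCharset attribute. The sketch below uses Python's standard XML parser, not a FairCom tool. The `<fields>` wrapper element and the second field (`CODE`) are hypothetical and exist only to make the fragment parseable; a real XDD file has its own surrounding structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the NAME_UTF <field> line comes from the text above.
# The <fields> wrapper and the CODE field are assumptions for illustration.
XDD_FRAGMENT = """
<fields>
  <field name="NAME_UTF" indexed="true" size="25" type="Alphanum" digits="25" scale="0" strCharset="UTF-8"/>
  <field name="CODE" size="4" type="Alphanum" digits="4" scale="0"/>
</fields>
"""

def utf8_field_names(xml_text: str) -> list:
    """Return the names of all <field> elements marked strCharset="UTF-8"."""
    root = ET.fromstring(xml_text.strip())
    return [
        field.get("name")
        for field in root.iter("field")
        if field.get("strCharset") == "UTF-8"
    ]

marked = utf8_field_names(XDD_FRAGMENT)
print(marked)
```

Fields without the attribute, such as the hypothetical `CODE` field here, are not reported.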

Limitations, Interactions and Compatibility

  • Currently, FairCom RTG does not allow specific collation sequences to be defined.
  • UTF-8 encoded data can require more storage space because some characters are encoded as multiple bytes. A string of X characters can therefore require more than X bytes of storage.
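The storage point above can be illustrated with a short Python sketch. It assumes the field size is counted in bytes, as in the 25-byte example field shown earlier, and it is not FairCom code. The helper names `utf8_length` and `truncate_to_bytes` are hypothetical.

```python
def utf8_length(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

def truncate_to_bytes(text: str, max_bytes: int) -> str:
    """Shorten text so its UTF-8 encoding fits in max_bytes.

    The cut never splits a multi-byte character: a partial sequence left
    at the end of the slice is dropped by errors="ignore".
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

FIELD_SIZE = 25  # assumed byte size, matching size="25" in the example field

name = "Jos\u00e9"                # 4 characters, 5 bytes in UTF-8
long_name = "\u00e9" * 13         # 13 characters, 26 bytes in UTF-8

fits = utf8_length(name) <= FIELD_SIZE
trimmed = truncate_to_bytes(long_name, FIELD_SIZE)
print(len(name), utf8_length(name), fits)
print(len(trimmed), utf8_length(trimmed))
```

Here a 13-character string does not fit in 25 bytes, so the sketch keeps 12 characters (24 bytes) and leaves one byte unused rather than storing half a character.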
