c-treeACE V11.0 Update Guide

Unicode default charset for SQL CHAR and VARCHAR changed from US-ASCII to ISO-8859-1

Prior to V11, the c-treeACE Unicode implementation required US-ASCII (7-bit) characters in CHAR and VARCHAR fields, which can be too strict for customers who use an 8-bit charset. For this reason, the default charset has been changed to ISO-8859-1 (also known as Latin-1). According to ICU:

ISO-8859-1 is relatively unproblematic — if its limited character repertoire is sufficient — because it is converted trivially (1:1) to Unicode, avoiding conversion table problems for its small set of characters. (By contrast, proper conversion from US-ASCII requires a check for illegal byte values 0x80..0xff, which is an unnecessary complication for modern systems with 8-bit bytes. ISO-8859-1 is nearly as ubiquitous for modern systems as US-ASCII was for 7-bit systems.)
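
The difference ICU describes can be illustrated with a minimal Python sketch. It is not c-treeACE code; the function names latin1_to_unicode and ascii_to_unicode are hypothetical and exist only for this illustration. The ISO-8859-1 path maps each byte straight to the code point with the same value, while the US-ASCII path must first reject bytes in the range 0x80..0xff.

```python
def latin1_to_unicode(data: bytes) -> str:
    """Convert ISO-8859-1 bytes to a Unicode string.

    Hypothetical helper: every byte value 0x00..0xFF maps 1:1 to the
    Unicode code point with the same number, so no table and no
    validity check are needed.
    """
    return "".join(chr(b) for b in data)


def ascii_to_unicode(data: bytes) -> str:
    """Convert US-ASCII bytes to a Unicode string.

    Hypothetical helper: a proper conversion has to reject the byte
    values 0x80..0xFF, which are not defined in 7-bit US-ASCII.
    """
    for offset, b in enumerate(data):
        if b >= 0x80:
            raise ValueError(f"byte 0x{b:02x} at offset {offset} is not US-ASCII")
    return "".join(chr(b) for b in data)


if __name__ == "__main__":
    sample = b"caf\xe9"  # 0xE9 is LATIN SMALL LETTER E WITH ACUTE in ISO-8859-1

    # The 1:1 mapping agrees with Python's built-in ISO-8859-1 codec.
    assert latin1_to_unicode(sample) == sample.decode("iso-8859-1")

    # The same bytes are rejected by the strict US-ASCII conversion.
    try:
        ascii_to_unicode(sample)
    except ValueError as err:
        print("rejected:", err)
```

Under the old default, data such as the sample above would fall into the rejected case; under the new default it converts without any extra check.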

This change should introduce no backward-compatibility problems, because US-ASCII is a subset of ISO-8859-1: every byte value in the range 0x00..0x7f represents the same character in both charsets.
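
The subset relationship can be checked directly. The following is a small Python sketch using only the standard "ascii" and "iso-8859-1" codecs; the function name ascii_is_subset_of_latin1 is an assumption made for this example and is not part of c-treeACE.

```python
def ascii_is_subset_of_latin1() -> bool:
    """Return True if every 7-bit byte decodes identically in both charsets."""
    for value in range(0x80):  # the 128 US-ASCII byte values 0x00..0x7F
        single = bytes([value])
        if single.decode("ascii") != single.decode("iso-8859-1"):
            return False
    return True


if __name__ == "__main__":
    # Existing 7-bit data is read back unchanged under the new default.
    print(ascii_is_subset_of_latin1())
```

Data that was valid under the previous US-ASCII default therefore decodes to the same characters under ISO-8859-1; only the bytes 0x80..0xff, which were previously not allowed, gain a meaning.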
