Product Documentation

c-treeACE V10.0 Update Guide

Data Compression

Data volumes are exploding, and database file sizes are growing with them. Large files are a significant factor in sizing disk arrays, which can be quite expensive for high-availability systems, and simply moving large data sets for archiving and backup becomes a challenging task. Compressing data is a valuable technique for reducing this storage burden: by directly reducing the size of each data record, file sizes can be substantially reduced.

To enable support for additional file modes, c-treeACE has implemented augmented variable-length records. The first supported augmented feature is data record compression.

c-treeACE Data Compression

c-treeACE now supports data compression. This core database capability was driven by the challenges of ever-larger HUGE files, downsizing needs, migration compatibility with other systems that support data compression, and specific customer requirements.


Data Compression

c-treeACE now intervenes at the point where low-level data records pass to and from the operating system's file system: each data record is compressed just before it is written to disk and uncompressed just after it is read.

The default compression algorithm comes from the standard zlib library, written by Jean-loup Gailly and Mark Adler, which is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program.
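As an illustration of the per-record compress and uncompress steps, the following minimal sketch uses Python's standard zlib module. It is not c-treeACE code: the function names and the sample record layout are hypothetical, and the sketch only shows a DEFLATE round trip on a single record.

```python
import zlib


def compress_record(record: bytes) -> bytes:
    """Compress one data record, as would happen just before a write."""
    return zlib.compress(record)


def uncompress_record(stored: bytes) -> bytes:
    """Restore the original record, as would happen just after a read."""
    return zlib.decompress(stored)


# Hypothetical record: two fixed-width, space-padded fields (128 bytes total).
record = b"ACME Corp".ljust(64) + b"Springfield".ljust(64)

stored = compress_record(record)
restored = uncompress_record(stored)

print("original bytes:", len(record))
print("stored bytes:  ", len(stored))
print("round trip ok: ", restored == record)
```

Because the stored length depends on the record's content, a compressed record no longer has a fixed size, which is consistent with compression being built on variable-length records.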

c-treeACE also supports a proprietary run-length encoding (RLE) option. Perhaps even more important is the ability to implement a callback in which you define your own compression technique.
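To make the idea of run-length encoding and of a user-supplied compression callback concrete, here is a minimal sketch. The RLE format below (count byte followed by value byte) is a generic textbook scheme, not the proprietary c-treeACE RLE, and the `Codec` type and `store_and_load` helper are hypothetical names used only for illustration.

```python
from typing import Callable, Tuple


def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, byte value) pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (run length, byte value) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)


# A "codec" is a pair of callbacks: (compress, uncompress).
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]


def store_and_load(record: bytes, codec: Codec) -> bytes:
    """Pass a record through the supplied callbacks, as a storage layer might."""
    compress, uncompress = codec
    stored = compress(record)
    return uncompress(stored)


record = b"A" * 40 + b"B" * 10 + b"C"
print(len(record), len(rle_encode(record)))
print(store_and_load(record, (rle_encode, rle_decode)) == record)
```

The point of the callback pair is that the storage layer does not need to know which technique is in use; it only requires that the uncompress callback exactly reverses the compress callback.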

Data Record Compression

As different files and applications may require various types of compression, a c-treeACE resource (CMPRECRES) has been defined and is embedded into files supporting data record compression. This resource contains the compression type, version, custom parameters, and the DLL name (if built-in compression is not used). The calls to the compression routines are handled by function pointers that are automatically initialized for the built-in routines and DLLs.

The compression resource structure supports optional parameters to fine-tune the compression operations. For example, zlib has five parameters that control internal processing. The compression structure uses a pVOID pointer and a length parameter to permit different compression DLLs to support different parameter sets. ctSETCOMPRESS() allows an application to set these parameters.
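The following sketch illustrates how a parameter set can travel as an opaque block of bytes plus a length. It assumes that the five zlib parameters correspond to the five tuning arguments of zlib's deflate initialization (level, method, window bits, memory level, strategy); the text above does not confirm this. The `ZlibParams` class and its packed layout are hypothetical and are not the layout used by the c-treeACE compression resource or by ctSETCOMPRESS().

```python
import struct
import zlib
from dataclasses import dataclass


@dataclass
class ZlibParams:
    """Five zlib tuning values (assumed to be the ones the text refers to)."""
    level: int = zlib.Z_DEFAULT_COMPRESSION
    method: int = zlib.DEFLATED
    wbits: int = zlib.MAX_WBITS
    mem_level: int = zlib.DEF_MEM_LEVEL
    strategy: int = zlib.Z_DEFAULT_STRATEGY

    def pack(self) -> bytes:
        # Hypothetical layout: five little-endian 32-bit signed integers.
        return struct.pack("<5i", self.level, self.method, self.wbits,
                           self.mem_level, self.strategy)

    @classmethod
    def unpack(cls, blob: bytes) -> "ZlibParams":
        return cls(*struct.unpack("<5i", blob))


def compress_with(blob: bytes, record: bytes) -> bytes:
    """Compress a record using parameters carried in an opaque byte block."""
    p = ZlibParams.unpack(blob)
    c = zlib.compressobj(p.level, p.method, p.wbits, p.mem_level, p.strategy)
    return c.compress(record) + c.flush()


# The caller hands over only an opaque block and its length.
blob = ZlibParams(level=zlib.Z_BEST_COMPRESSION).pack()
record = b"abc" * 100
stored = compress_with(blob, record)
print(len(blob), len(record), len(stored))
print(zlib.decompress(stored) == record)
```

Carrying the parameters as a block plus a length means the caller never has to interpret them; only the compression routine that owns the format unpacks the block, which is what lets different routines define different parameter sets.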

Server Configuration

Default c-treeACE compression can be specified with the following configuration keywords:

CMPREC_TYPE < "ZLIB" | "USER" >

CMPREC_VERSION <a number >= 1>

CMPREC_DLL <name of DLL>

These keywords should be entered in the configuration file in the order shown. A DLL name is required when CMPREC_TYPE is USER.

A new COMPRESS_FILE keyword can be used to enable compression in files whose names match specified file names (including wildcards). See COMPRESS_FILE.

Compressed Files in c-treeDB

A new c-treeDB file create mode, CTCREATE_COMPRESS, has been introduced to enable compressed record support in c-treeDB. When this mode is used, c-treeDB automatically creates the file as variable length.

Notes

  • Some files will compress better than others (text data vs. binary data for instance). Initial testing revealed up to an 80:1 compression ratio in some cases.
  • At this time, only zlib file compression is available; however, support has been included for multiple compression routines in the future.
  • This feature is available only for ISAM files fully maintained by ISAM updates. Low-level c-treeACE functions will return an error when used on files with the compression attribute enabled.
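The first note, that some data compresses better than other data, can be observed with a small experiment. This sketch uses Python's standard zlib module on made-up sample data; the `compression_ratio` helper is a hypothetical name, and the ratios it prints say nothing about the ratios c-treeACE achieves on real files.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Highly repetitive text versus random bytes, which stand in for
# binary data that has little redundancy to exploit.
text_like = b"the quick brown fox jumps over the lazy dog. " * 250
random_like = os.urandom(len(text_like))

print("text-like ratio:   %.1f:1" % compression_ratio(text_like))
print("random-like ratio: %.2f:1" % compression_ratio(random_like))
```

Data with no exploitable redundancy can even come out slightly larger after compression, so it is worth measuring representative records before enabling compression on a file.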
