FairCom ISAM for C

Cached Data - Writing to Disk

In general, the purpose of the cache is to avoid reading data from and writing data to the disk. Even so, data managed in the cache is written to disk for a number of reasons:

File Extension Size: If any index or data file has been defined with a “file extension size” (as defined in the IFIL or IIDX structures), that file requires disk writes. Specifying an extension size tells FairCom DB to extend the file by the given number of bytes each time it runs out of space. To secure this extension, c-tree forces the disk write, thus ensuring the operating system's file allocation table (FAT) registers the new extended size. The best way to ensure maximum performance is to define a ZERO (0) FILE EXTENSION SIZE for all files. A value of zero suppresses this operating system overhead.
File Header Updates: There are a number of situations where the server updates the control header portion of a c-tree file. Updating the number of active records in the file is a good example. Typically, a file header update does not necessitate a disk write, but some situations do require one. Perhaps the best example is adding a new root node to an index file, which requires a file header write. These circumstances are very infrequent and should add no noticeable overhead to the overall performance of the system.
Not Enough Cache Space: Of course, if the demand on the cache exceeds its defined limit, data not currently being used will be written to disk. Cached index pages are analyzed using a “least-recently-used” (LRU) strategy to determine which pages remain in cache.
Starting with V11, a new caching strategy is used for the data cache (the index cache is still “least-recently-used”). The data cache gets the next page from one of several lists of pages set up at startup time. All of the cache pages are included in the lists; each page is on only one list. The number of such parallel page lists is specified by the DATA_LRU_LISTS configuration entry.
Middle Portion of Variable-length Records: c-tree does not store the middle portion of variable-length records in cache, only the first and last pages. This prevents large blocks of data from consuming the cache and also avoids the overhead of managing a large number of cache pages for any one record. The cache pages for consecutive segments of a record (where a segment fills a cache page) are completely independent of each other; they are not stored in consecutive memory. I/O is performed separately for each cache page, so caching a large number of pages for one record is more likely to slow access down than to speed it up.
File Open and Close Operations: Any action to open or close a file causes disk I/O. Besides the obvious reasons, a number of internal operations take place during these critical operations.
Transaction Control: To manage transactions, there are a number of transaction processing operations that require disk flush activity. Securing transaction logs, issuing checkpoints, and handling transaction cache ageing are a few examples of disk activity that support automatic recovery to ensure data integrity.

Write-Through Cache

Our goal is to offer the most efficient caching scheme while assuring complete data integrity. FairCom does not categorize its cache as what some have called a “write-through” cache. The file extension size issue discussed above is one reason. Given a file extension size of zero, all “adds” to the database should go directly to cache, with no disk activity. The ctWRITETHRU file mode can be used to turn on write-through caching behavior for a file.

Does c-tree cache everything?

Not everything is cached, for example:

As mentioned, the middle portions of variable-length records are not cached.
As defined below, a file may be specified “not to be cached”.
Data “not recently used” when the cache is full will generate disk activity.

For example, when writing/reading a 100K record, do you try to cache it, thus wiping out the utility of the cache? No! Because we only retain the first and last page of a variable-length record in the cache, this problem does not exist.

Size Limits on What to Cache

The approach used for variable-length records alleviates any concern for size limits on what you try to cache.

Caching Variable-Length Records When the Entire Record Does Not Fit in Cache (NO_CACHE)

For a file in which most records are longer than two times the PAGE_SIZE, NO_CACHE might be beneficial because the middle portion of each record is not cached anyway. With a page size of 32KB, this means records up to 64KB in size will be cached in full (since the first and last pages of a variable-length record are cached). Variable-length journal files that are written but seldom read are an example of where NO_CACHE could be a good choice.