Product Documentation

Database Administrator's Guide

Windows Resource Error (1450) Configurable Retry Logic

When the Windows kernel has allocated all of its paged-pool memory, it cannot perform many tasks and instead returns a STATUS_INSUFFICIENT_RESOURCES (0xC000009A) status code. This is a limitation of 32-bit addressing (only 2 GB is addressable within the kernel) and applies regardless of the amount of memory available in the system.

When the IO_ERROR_BLOCK_SIZE option is specified in the FairCom Server configuration file, a read or write operation that fails with Windows system error 1450 (ERROR_NO_SYSTEM_RESOURCES) is retried in blocks of the specified size. If any one of those block-sized read or write operations fails, the FairCom Server fails the entire read or write operation.
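The splitting step can be pictured with a short sketch. This is only an illustration, not FairCom Server code: the function name split_into_blocks is hypothetical, and the sketch assumes that a final block shorter than the block size is issued when the request length is not an exact multiple of it.

```python
def split_into_blocks(offset, length, block_size):
    """Return (offset, size) pairs that together cover one I/O request.

    Hypothetical helper: mirrors the idea of retrying a failed request
    in IO_ERROR_BLOCK_SIZE-sized pieces. The last piece may be shorter
    (an assumption, not stated in the text).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    end = offset + length
    while offset < end:
        size = min(block_size, end - offset)
        blocks.append((offset, size))
        offset += size
    return blocks


if __name__ == "__main__":
    # A 327680-byte request with a 65536-byte block size yields 5 blocks.
    for block in split_into_blocks(0, 327680, 65536):
        print(block)
```

Under this reading, smaller block sizes mean more individual operations, each of which asks the kernel for fewer resources at a time.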

The FairCom Server supports two further configuration options that permit additional disk read/write retries and set a sleep interval between retries.

IO_ERROR_BLOCK_RETRY <retries> specifies the maximum number of failed IO_ERROR_BLOCK_SIZE-sized I/O operations that must occur before the I/O operation is considered to have failed. If the IO_ERROR_BLOCK_SIZE-sized I/O operations that are being attempted for a particular I/O operation fail more than <retries> times, the FairCom Server writes a READ_ERR (36) or WRITE_ERR (37) message to CTSTATUS.FCS and considers the I/O operation to have failed.

A value of -1 signifies infinite retries. The default is 0, which means that the I/O operation is tried only once in IO_ERROR_BLOCK_SIZE-sized blocks, and if any of these I/O operations fails, the entire I/O operation is considered to have failed. As another example, suppose IO_ERROR_BLOCK_RETRY is set to 20 and IO_ERROR_BLOCK_SIZE is set to 65536. A 327680-byte write is then retried as five 65536-byte write operations, and the I/O operation fails if there are 20 failures to perform those five write operations.

IO_ERROR_BLOCK_SLEEP <time> specifies a time in milliseconds between retry attempts. The default is zero, which means that retries are attempted immediately.
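How the retry count and sleep interval interact can be sketched as a simple loop. This is a minimal model under stated assumptions, not the server's implementation: the names write_in_blocks, ResourceError, and BlockIOFailed are hypothetical, the failure budget is assumed to be shared by all blocks of one I/O operation, and the operation is assumed to fail once the number of failures exceeds the retry value.

```python
import time

ERROR_NO_SYSTEM_RESOURCES = 1450  # Windows system error named in the text


class ResourceError(OSError):
    """Hypothetical stand-in for a block I/O call failing with error 1450."""


class BlockIOFailed(Exception):
    """Hypothetical stand-in for the READ_ERR (36) / WRITE_ERR (37) outcome."""


def write_in_blocks(do_write, blocks, retries=0, sleep_ms=0, sleep=time.sleep):
    """Issue each block, retrying failures against one shared budget.

    retries  models IO_ERROR_BLOCK_RETRY (-1 means retry forever).
    sleep_ms models IO_ERROR_BLOCK_SLEEP (0 means retry immediately).
    Returns the number of failures that were retried successfully.
    """
    failures = 0
    for offset, size in blocks:
        while True:
            try:
                do_write(offset, size)
                break  # this block succeeded, move to the next one
            except ResourceError:
                failures += 1
                # Assumption: fail once failures exceed the retry value.
                if retries != -1 and failures > retries:
                    raise BlockIOFailed("retry budget exhausted")
                if sleep_ms > 0:
                    sleep(sleep_ms / 1000.0)
    return failures
```

With retries=0 the first failed block ends the whole operation, which matches the default behavior described above. A nonzero sleep interval gives the kernel time to release resources before the next attempt.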

SNAPSHOT Monitoring of Failed Retries

To permit monitoring the number of I/O error 1450 retries that have occurred, a counter has been added to the system snapshot structure. The sctioblkretry field of the ctGSMS structure is defined as an unsigned long integer that stores the total number of I/O error 1450 retries that have occurred since the FairCom Server started. The snapshot log file SNAPSHOT.FCS displays the I/O error 1450 retry counter value with a description of "I/O ERR(1450) automatic retries:". The system snapshot structure version has been changed from 9 to 10 to note the presence of this new field in the structure. The ctstat statistics monitoring utility and the ctsnpr utility have been updated to properly handle the presence of this field in the system snapshot structure and snapshot log.
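Because the counter is cumulative since server startup, the difference between two consecutive snapshots shows how many retries occurred in that interval. The following sketch pulls the counter out of snapshot log text. It assumes the value is written as a decimal integer on the same line, directly after the description; the exact layout of SNAPSHOT.FCS is not given here, so treat the pattern as a starting point. The function names are hypothetical.

```python
import re

# Description text quoted from the snapshot log.
LABEL = "I/O ERR(1450) automatic retries:"

# Assumption: the counter follows the label on the same line as digits.
_PATTERN = re.compile(re.escape(LABEL) + r"[ \t]*(\d+)")


def find_retry_counts(log_text):
    """Return every counter value found in the log text, in order."""
    return [int(m.group(1)) for m in _PATTERN.finditer(log_text)]


def retries_since_previous_snapshot(log_text):
    """Return the change between the last two snapshots, or None.

    A negative result suggests the server restarted between snapshots,
    since the counter starts again from zero at startup.
    """
    counts = find_retry_counts(log_text)
    if len(counts) < 2:
        return None
    return counts[-1] - counts[-2]
```

A caller would read SNAPSHOT.FCS into a string and pass it to retries_since_previous_snapshot; a steadily growing delta indicates that the server is hitting error 1450 regularly.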