Record lock error
Retries and diagnostics for record lock error
FairCom replication supports configuration options that specify how many times to attempt to lock a record and how long to sleep between record lock attempts. These options are used when FairCom replication attempts to update a record on the target FairCom server.
The lock_retry_count <count> option, specified in ctreplagent.cfg, indicates that a record read or update that fails with error DLOK_ERR (42, Could not obtain data record lock) is retried up to <count> times (default 2).
The lock_retry_sleep <sleep_ms> option, specified in ctreplagent.cfg, indicates that before retrying the operation that failed with error DLOK_ERR, the Replication Agent sleeps for <sleep_ms> milliseconds (default 100).
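The interaction of the two options can be illustrated with a minimal Python sketch. This is not the Replication Agent's implementation; the function name run_with_lock_retries and the operation callable are hypothetical, and the sketch assumes that <count> counts retries, so an operation is attempted at most 1 + <count> times.

```python
import time

DLOK_ERR = 42  # "Could not obtain data record lock"
NO_ERROR = 0


def run_with_lock_retries(operation, lock_retry_count=2,
                          lock_retry_sleep_ms=100, sleep=time.sleep):
    """Call operation() and retry it while it returns DLOK_ERR.

    operation is a hypothetical callable returning an error code (0 on success).
    Assumption: lock_retry_count is the number of retries, so there are at
    most 1 + lock_retry_count attempts in total.
    """
    error = operation()
    retries = 0
    while error == DLOK_ERR and retries < lock_retry_count:
        # Sleep before each retry, mirroring lock_retry_sleep.
        sleep(lock_retry_sleep_ms / 1000.0)
        retries += 1
        error = operation()
    # Either success, a different error, or DLOK_ERR after all retries.
    return error


if __name__ == "__main__":
    # Simulated operation: locked twice, then succeeds on the third attempt.
    results = iter([DLOK_ERR, DLOK_ERR, NO_ERROR])
    print(run_with_lock_retries(lambda: next(results), sleep=lambda s: None))
```

With the defaults, an operation that stays locked is attempted three times in this model, with roughly 200 milliseconds of sleep in total, before DLOK_ERR is reported.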
When an update still fails with DLOK_ERR after exhausting the retries, the Replication Agent logs the following message to ctreplagent.log, because this error is not expected given FairCom replication's use of blocking locks:
ERR: Unexpectedly failed to update record: error code=42 (diag=<diagnostic_code>)
<diagnostic_code> is one of the following:
EQLVREC() - call failed with error DLOK_ERR
RWTVREC() - call failed with error DLOK_ERR
EQLREC() - call failed with error DLOK_ERR
RWTREC() - call failed with error DLOK_ERR
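To find these failures in a large ctreplagent.log, a small script can pull out the error code and diagnostic code from each matching line. The following Python sketch assumes the message appears on a single line exactly as shown above; the function name find_update_failures is hypothetical, and the diagnostic value in the sample line is made up for illustration.

```python
import re

# Matches the message text shown above; any prefix on the log line
# (for example a timestamp) is ignored because search() is used.
UPDATE_FAILURE = re.compile(
    r"ERR: Unexpectedly failed to update record: error code=(\d+) \(diag=(.*)\)"
)


def find_update_failures(lines):
    """Return (error_code, diagnostic_code) pairs for matching log lines."""
    failures = []
    for line in lines:
        match = UPDATE_FAILURE.search(line)
        if match:
            # The diagnostic code is kept as text because its exact
            # format is not assumed here.
            failures.append((int(match.group(1)), match.group(2)))
    return failures


if __name__ == "__main__":
    sample = [
        "some unrelated log line",
        # Hypothetical diagnostic value, for illustration only.
        "ERR: Unexpectedly failed to update record: error code=42 (diag=3)",
    ]
    for code, diag in find_update_failures(sample):
        print(code, diag)
```

To scan a real log, pass an open file object, for example find_update_failures(open("ctreplagent.log")), adjusting the path to your installation.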
FairCom DB was also modified to add diagnostic log messages in the function that extends a file. That function contains logic that attempts to acquire a lock on the new space. It tries to acquire the lock up to 100 times, sleeping for 10 milliseconds between attempts. It is hypothesized that an RWTVREC() operation could fail with DLOK_ERR when this code exhausts its retry attempts. To determine whether this is the case, the following message is now logged to CTSTATUS.FCS when the lock attempt in this function fails:
extfil: Failed to lock offset 0x<offset> when extending file <filename>: 42
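Since 100 attempts with 10-millisecond sleeps amount to roughly one second of waiting, a contended file can plausibly exhaust this budget. To see which files are affected, the message can be tallied per file. The following Python sketch assumes the message occupies the end of a single line in CTSTATUS.FCS, in the format shown above; the function name and the sample file paths are hypothetical.

```python
import re
from collections import Counter

DLOK_ERR = 42

# Offset is hexadecimal; the file name may itself contain colons
# (for example a Windows drive letter), so the error code is anchored
# at the end of the line.
EXTEND_LOCK_FAILURE = re.compile(
    r"extfil: Failed to lock offset 0x([0-9A-Fa-f]+) "
    r"when extending file (.+): (\d+)\s*$"
)


def count_extend_lock_failures(lines):
    """Count DLOK_ERR file-extension lock failures per file name."""
    counts = Counter()
    for line in lines:
        match = EXTEND_LOCK_FAILURE.search(line)
        if match and int(match.group(3)) == DLOK_ERR:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "extfil: Failed to lock offset 0x1a2b00 when extending file orders.dat: 42",
        "extfil: Failed to lock offset 0x1a3c00 when extending file orders.dat: 42",
        "extfil: Failed to lock offset 0x40 when extending file C:\\data\\items.dat: 42",
    ]
    for name, count in count_extend_lock_failures(sample).most_common():
        print(name, count)
```

If these messages line up in time with the "Unexpectedly failed to update record" entries in ctreplagent.log, that would support the hypothesis described above.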
A record lock error (for example, error 42) may indicate that the record on the target is already locked by another client. Check whether a rogue replication is accidentally running in the background. You can use FairCom monitor to view current connections and their origins. The command-line ctadmn utility can also provide this information.