Automatic Recovery Fails

At startup, FairCom DB examines the transaction logs to determine whether it needs to perform automatic recovery. If so, the server initiates recovery, and when recovery completes successfully, server startup continues as usual. In some cases, however, the automatic recovery process can fail. For example, automatic recovery may fail if:

  • The server’s transaction logs are damaged, missing, or inaccessible.
  • TRNLOG FairCom DB data or index files that automatic recovery determines it must process are damaged, missing, or inaccessible.
  • The server configuration file settings are inconsistent with the settings used the last time FairCom DB was run.

In This Section

Recovering from Automatic Recovery Failure

FairCom DB File Open Errors During Recovery

Automatic Recovery Terminates Abnormally

Automatic Recovery Takes Excessive Time

Recovering from Automatic Recovery Failure

If automatic recovery fails, FairCom DB logs error messages to its status log and terminates. In the event of an automatic recovery failure, proceed as follows:

  1. Examine the server status log to determine the type of automatic recovery failure.
  2. See the specific failure cases in the following sections for details on each type of recovery failure and, if possible, correct the problem. Then restart FairCom DB and allow automatic recovery to complete successfully.
  3. If the automatic recovery failure cannot be corrected, follow these steps to recover or restore TRNLOG files and to resume FairCom DB operation:
    1. If automatic recovery terminated due to a fatal exception and the system generated a core file, save a copy of the core file for offline analysis.
    2. Save a copy of the server’s transaction logs (*.FCS files) and the server configuration file, and, if time and disk space permit, save a copy of all TRNLOG data and index files. These files can be examined offline to attempt to identify the cause of the automatic recovery failure.
    3. TRNLOG files that were in use at the time of the abnormal server termination may be in an unknown state. To determine the state of each TRNLOG file, attempt to open it using a c-tree file open function. If the file opens successfully, it is in good condition and did not need to be processed by automatic recovery. If the open fails with error 14, the file must be rebuilt, re-created, or restored from backup.

    Be aware that restoring only some TRNLOG files from backup may violate the transaction consistency that an application expects to exist among the set of TRNLOG files. If this is the case, all related TRNLOG files must be treated the same way (for example, all restored from backup).

    4. Verify that the server-maintained TRNLOG files FAIRCOM.FCS, SYSLOGDT.FCS, and SYSLOGIX.FCS can be properly opened. If these files fail to open (for example, with FairCom DB error 14 due to the failed automatic recovery), the server will fail to start up.

    FAIRCOM.FCS contains the server’s user and group definitions. If FAIRCOM.FCS fails to open and the server administrator has not changed the default user and group properties, this file can be deleted and FairCom DB will re-create it the next time the server starts up. If the server administrator has defined new users and groups or has changed default user or group properties, this file must be rebuilt or restored from a backup, or the user and group account changes must be re-entered.

    SYSLOGDT.FCS and SYSLOGIX.FCS contain FairCom DB system event log entries. If desired, save a copy of these files for offline examination. The originals can then be removed from the server directory, and the server will re-create them the next time it starts up.

    5. Move existing transaction logs from the server directory to a temporary alternate directory location (*). Transaction logs consist of the files S0000000.FCS, S0000001.FCS, and all files named L<lognum>.FCS, where <lognum> is a 7-digit number.
      (*) Transaction logs may contain important unrecovered data. Retain these existing logs in case further data recovery is required in extreme cases.
    6. Restart FairCom DB. The server creates a new set of transaction logs and is ready for operation again.
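
The file-handling parts of the procedure above (saving copies of the *.FCS files, then moving the transaction logs aside) can be scripted. The following is a minimal Python sketch, not a FairCom utility: the function names and directory arguments are hypothetical, and the file name pattern is taken only from the transaction log names listed in the steps above. Run something like this only while the server is stopped.

```python
import re
import shutil
from pathlib import Path

# Transaction log names as listed in the procedure above:
# S0000000.FCS, S0000001.FCS, and L<lognum>.FCS with a 7-digit <lognum>.
TRANSACTION_LOG_NAME = re.compile(r"(S000000[01]|L\d{7})\.FCS")


def is_transaction_log(file_name):
    """Return True if file_name matches one of the transaction log names."""
    return TRANSACTION_LOG_NAME.fullmatch(file_name) is not None


def save_fcs_copies(server_dir, backup_dir):
    """Copy every *.FCS file to backup_dir; return the sorted names copied."""
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(Path(server_dir).glob("*.FCS")):
        if entry.is_file():
            shutil.copy2(entry, target / entry.name)  # keeps timestamps
            copied.append(entry.name)
    return copied


def move_transaction_logs(server_dir, holding_dir):
    """Move only the transaction logs to holding_dir; return the names moved.

    Other *.FCS files (for example FAIRCOM.FCS) are left in place.
    """
    target = Path(holding_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(Path(server_dir).iterdir()):
        if entry.is_file() and is_transaction_log(entry.name):
            shutil.move(str(entry), str(target / entry.name))
            moved.append(entry.name)
    return moved
```

Copying before moving keeps an untouched set of the original files for offline analysis, while the moved logs stay available in case further data recovery is needed.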

The following sections discuss specific automatic recovery failure situations and the options that are available in each case.

FairCom DB File Open Errors During Recovery

Automatic recovery fails if a TRNLOG file that must be processed during recovery cannot be opened. See the section titled “Errors Occur When Opening FairCom DB Files” for possible errors that may occur when opening FairCom DB files and what can be done in each case.

A special case that can occur during automatic recovery is that a file cannot be opened because an application used a FairCom DB API function to delete or rename the file. In some cases, automatic recovery does not realize that the file should be expected to be missing for this reason, and automatic recovery attempts to open the file and fails with FairCom DB error 12 because a file by the specified name does not exist.

Note: Creating TRNLOG files as transaction-dependent files (in which file creation and deletion are guaranteed to be atomic and these events are indicated by transaction log entries) avoids most occurrences of this type of situation, but does not guarantee that this situation cannot occur.

When this situation occurs, the server logs the following messages to the server status log:

    Tue Sep 30 10:44:10 2003
     - User# 01 mark.idx: 12
    Tue Sep 30 10:44:10 2003
     - User# 01
     *** Recovery may proceed by adding 'SKIP_MISSING_FILES YES' ***
     *** to the server configuration file. ***
    Tue Sep 30 10:44:10 2003
     - User# 01 Automatic recovery terminated with error: 12

As indicated in the server status log messages, the SKIP_MISSING_FILES server configuration keyword can be added to the server configuration file in order to avoid this error when a file is missing during automatic recovery. Because error 12 can occur for other reasons (for example, a file may be inaccessible to the FairCom DB process due to file permissions or due to an unavailable volume), confirm that the specified file does not exist and that there is a reasonable explanation as to why this file does not exist before adding the SKIP_MISSING_FILES option to the server configuration file and restarting FairCom DB.
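
The check described above, confirming that a file reported with error 12 really does not exist rather than being inaccessible, can be partly automated. The sketch below is illustrative only: the function names are hypothetical, the regular expression assumes status log lines shaped like the excerpt above (and file names without spaces), and the actual status log layout may differ between versions.

```python
import os
import re

# Assumed line shape, based on the excerpt above: "- User# 01 mark.idx: 12".
# Requiring a name without spaces also skips the final
# "Automatic recovery terminated with error: 12" line.
FILE_ERROR = re.compile(r"^\s*-\s*User#\s*\d+\s+(?P<name>\S+):\s*(?P<code>\d+)\s*$")


def files_reported_missing(status_log_text):
    """Return file names logged with error 12, in order of first appearance."""
    names = []
    for line in status_log_text.splitlines():
        match = FILE_ERROR.match(line)
        if match and match.group("code") == "12" and match.group("name") not in names:
            names.append(match.group("name"))
    return names


def classify_file(path):
    """Return 'missing', 'inaccessible', or 'present' for path.

    'missing' only means the operating system found no such file. An
    unmounted volume can produce the same result, so the reason for the
    absence still has to be confirmed by the administrator.
    """
    try:
        os.stat(path)
    except FileNotFoundError:
        return "missing"
    except OSError:
        # For example, a permission problem on the file or its directory.
        return "inaccessible"
    return "present"
```

A file classified as inaccessible or present points to a permissions or volume problem, in which case adding SKIP_MISSING_FILES would hide the real cause instead of fixing it.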

Automatic Recovery Terminates Abnormally

If the FairCom DB process encounters a fatal exception during automatic recovery, causing the system to terminate the server process, the system may produce a core image of the server process at the time of the exception. The FairCom DB status log may contain error messages related to the exception.

In this situation, examine the server status log to see if there are any error messages that point to the cause of the exception. If the status log shows automatic recovery errors, consult the appropriate section above for actions based on the specific error code shown in the status log. FairCom DB may be restarted in order to retry automatic recovery, but if the recovery continues to fail in this manner and the problem cannot be corrected, follow the steps listed in the “Recovering from Automatic Recovery Failure” section.

Automatic Recovery Takes Excessive Time

Automatic recovery may take a long time to complete for the following reasons:

  • Server configuration settings, such as an increased log size and checkpoint interval, may require the server to scan a large number of log entries and to process a considerable number of transaction undo and redo operations.
  • TRNLOG indexes that do not use the LOGIDX filemode may need to have their tree structure reconstructed, which can increase recovery time.

In the event of a long automatic recovery, the server administrator has the following options:

  • Allow automatic recovery to complete (waiting as long as it takes).
  • Terminate the server process and restart recovery (if server configuration settings or other system properties can be changed in a way that may improve recovery speed).
  • Terminate the server process and abandon recovery (re-creating or restoring TRNLOG files from backup).

See the “Server Startup Hangs or Takes Excessive Time” section for details on monitoring automatic recovery progress and the “Recovering from Automatic Recovery Failure” topic above for steps to follow when choosing to abandon automatic recovery and re-create or restore TRNLOG files.
