
Plans

How to replicate

Abstract

A replication plan brings together publications and subscriptions and is the central point of reference defining a specific data replication scenario.

Figure 1. replication plan window


A replication plan can be considered a container that holds a collection of publications and subscriptions, along with directives for executing replication among them.

You can create multiple replication plans, and each one operates independently of the others. Plan information is maintained in a central database managed by the Replication Manager. Because a plan is itself data, it can also be replicated for redundancy and high availability.

Plan directives:

Each replication plan must select one option in each of the following four categories, called directives.

  • Asynchronous or synchronous

  • Serial or parallel

  • Unidirectional or bidirectional

  • Selective or non-selective
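As an illustration only, the four directives can be modeled as one required choice per category. The class and field names below are hypothetical and are not part of any FairCom API; the sketch simply shows that a plan carries exactly one value from each pair.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"


class Concurrency(Enum):
    SERIAL = "serial"
    PARALLEL = "parallel"


class Direction(Enum):
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bidirectional"


class Selection(Enum):
    SELECTIVE = "selective"
    NON_SELECTIVE = "non-selective"


@dataclass(frozen=True)
class PlanDirectives:
    """Hypothetical container: exactly one choice per directive category."""
    timing: Timing
    concurrency: Concurrency
    direction: Direction
    selection: Selection

    def describe(self) -> str:
        parts = (self.timing, self.concurrency, self.direction, self.selection)
        return ", ".join(part.value for part in parts)


# Example: a typical disaster-recovery style plan (an assumed scenario).
plan = PlanDirectives(
    Timing.ASYNCHRONOUS,
    Concurrency.PARALLEL,
    Direction.UNIDIRECTIONAL,
    Selection.NON_SELECTIVE,
)
print(plan.describe())
```

Because every field is required, a plan object cannot be built with a directive left undecided, which mirrors the rule that each category needs a choice.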

For port access in firewalls see Replication network recommendations.

Asynchronous replication

Asynchronous replication replicates data between two servers through a non-blocking process. This allows the primary server to process data as fast as possible without being slowed down by a secondary server or the network. It can simultaneously replicate data to multiple servers. It is ideal for replicating data between data centers.

Synchronous replication

Synchronous replication uses a blocking process to replicate data between servers to ensure both servers have persisted a transaction to their transaction logs before the primary server reports back to the application that the transaction is committed. Synchronous replication is required for high availability because only it can guarantee data is always the same on both servers at all points in time. Because it is blocking, it is not as fast as asynchronous replication.

Note

Synchronous replication is more demanding than asynchronous replication. If your application requires synchronous replication, be sure your hardware is as fast as possible.
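The difference between the blocking and non-blocking processes can be sketched with a toy model. This is not how FairCom implements replication; `Primary`, `Secondary`, and their methods are hypothetical names, and in-memory lists stand in for transaction logs.

```python
import queue
import threading


class Secondary:
    def __init__(self):
        self.log = []

    def persist(self, txn):
        # Stands in for writing the transaction to the secondary's log.
        self.log.append(txn)


class Primary:
    def __init__(self, secondary):
        self.log = []
        self.secondary = secondary
        self._pending = queue.Queue()
        # Background worker used only by asynchronous commits.
        threading.Thread(target=self._ship, daemon=True).start()

    def _ship(self):
        while True:
            txn = self._pending.get()
            self.secondary.persist(txn)
            self._pending.task_done()

    def commit_sync(self, txn):
        self.log.append(txn)
        # Blocks: the caller is told "committed" only after both logs hold txn.
        self.secondary.persist(txn)
        return "committed"

    def commit_async(self, txn):
        self.log.append(txn)
        # Non-blocking: the transaction is shipped later by the worker thread.
        self._pending.put(txn)
        return "committed"

    def drain(self):
        # Wait until every queued transaction has reached the secondary.
        self._pending.join()


secondary = Secondary()
primary = Primary(secondary)
primary.commit_sync("txn-1")    # secondary already holds txn-1 on return
primary.commit_async("txn-2")   # secondary may not hold txn-2 yet
primary.drain()                 # now it does
```

In the synchronous path the caller pays for the secondary's write on every commit, which is why the speed of the hardware and network matters more there.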

Recommendations

Purchase the fastest Ethernet Network Interface Cards (NICs) compatible with your server. For high availability, purchase 2 NICs per server (we recommend 100GbE NICs).

The NIC model depends on the server hardware. How the NICs are connected depends on the number of servers:
  • For failover between 2 servers, the NICs can be directly connected to each other.

  • For failover across 3 or more servers, the NICs must be connected to a compatible network switch that runs at the same speed or faster than the NICs.

For Replication Manager, add incoming network port rules for:
  • HTTP port 80 or HTTPS port 443.

  • Custom TCP port 5532

    This is the FairCom RM ISAM port. It depends on the server name and can be found in the CTSTATUS log file.

  • Custom TCP port 7000

    This is the FairCom RM SQL port. It can be found in the ctsrvr.cfg config file.

For replication clients, add incoming port rules for:
  • Custom TCP port 5597

    This is the FairCom DB ISAM port. It depends on the server name and can be found in the CTSTATUS log file.

  • Custom TCP port 6597

    This is the FairCom DB SQL port. It can be found in the ctsrvr.cfg config file.

  • HTTP port 80 or HTTPS port 443

    This is only needed if you want to access the web tools.
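After adding the firewall rules, it can help to confirm that the ports are reachable from another machine. The following is a minimal sketch using only the Python standard library. The port numbers are the defaults listed above and should be confirmed against the CTSTATUS log file and ctsrvr.cfg; the function names and dictionary labels are hypothetical.

```python
import socket
from typing import Dict


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Defaults taken from the lists above; actual values may differ per server.
REPLICATION_MANAGER_PORTS = {"RM ISAM": 5532, "RM SQL": 7000, "HTTPS": 443}
REPLICATION_CLIENT_PORTS = {"DB ISAM": 5597, "DB SQL": 6597, "HTTPS": 443}


def check_ports(host: str, ports: Dict[str, int]) -> Dict[str, bool]:
    """Map each port label to whether the port accepted a TCP connection."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

For example, `check_ports("rm.example.com", REPLICATION_MANAGER_PORTS)`, where the host name is a placeholder, returns a dictionary showing which of the Replication Manager ports accepted a connection. A successful TCP connection only shows that the port is reachable, not that the expected service is listening on it.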

Serial replication

Serial replication uses one thread to replicate data in the transaction log. Typically, a single thread cannot keep up with many database connections simultaneously writing data to a table. However, if only one thread writes to a table, serial replication is often preferred.

Parallel replication

Parallel replication uses many threads to replicate data simultaneously. It can keep up with multiple database connections writing data at the same time. In many cases, parallel replication is preferable to serial replication. Parallel replication requires more CPU to handle the additional threads it uses to process replication. Because CPU resources are limited, the number of threads is configurable.

Parallel replication also has additional overhead because it performs dependency checking to ensure data consistency. Dependency checking ensures that changes to the same record occur in the original transaction order, so dependent changes must be serialized. In the unusual case where many transactions have dependencies, parallel replication becomes slower than serial replication: it pays the overhead of dependency checking and still ends up applying the transactions serially.
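The idea behind dependency checking can be sketched as follows. This is a generic illustration, not FairCom's actual algorithm: transactions are grouped into "waves" so that transactions within a wave touch no common record and can be applied in parallel, while any two transactions that touch the same record land in different waves in their original order. The function name and the input shape are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def schedule_waves(transactions: Iterable[Tuple[str, Set[str]]]) -> List[List[str]]:
    """Group transactions (given in commit order) into parallel-safe waves.

    Each transaction is (txn_id, record_keys). Waves are applied one after
    another; transactions inside a wave share no record keys.
    """
    last_wave_for_record: Dict[str, int] = {}
    waves: List[List[str]] = []
    for txn_id, records in transactions:
        # A transaction must run after the latest wave that touched any
        # of its records; with no dependencies it can run in wave 0.
        wave = 1 + max((last_wave_for_record.get(r, -1) for r in records), default=-1)
        if wave == len(waves):
            waves.append([])
        waves[wave].append(txn_id)
        for r in records:
            last_wave_for_record[r] = wave
    return waves


mostly_independent = [
    ("t1", {"a"}),
    ("t2", {"b"}),
    ("t3", {"a"}),       # depends on t1
    ("t4", {"c"}),
    ("t5", {"a", "b"}),  # depends on t3 and t2
]
print(schedule_waves(mostly_independent))
# [['t1', 't2', 't4'], ['t3'], ['t5']]

all_dependent = [("t1", {"a"}), ("t2", {"a"}), ("t3", {"a"})]
print(schedule_waves(all_dependent))
# [['t1'], ['t2'], ['t3']]
```

The second call shows the unusual case described above: when every transaction depends on the previous one, each wave holds a single transaction, so the work is effectively serial and the dependency bookkeeping is pure overhead.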

Unidirectional replication

Unidirectional replication is one-way replication. It is the most common type of replication and replicates data from a source database server to a target database server. It is simple, predictable, and has no data conflicts. It is ideal for implementing solutions such as high availability, disaster recovery, offloaded reporting, globally distributed read scalability, microservice shared data, and so forth.

Bidirectional replication

Bidirectional replication is two-way replication where data is replicated in both directions between database servers; thus, any data change on one server is replicated to the others. It is great for large-scale global data processing because many servers can process data in parallel and users can modify data on the server closest to them. This works best when users do not simultaneously change the same data, as with a user’s personal shopping cart. When users can change the same data, as with an application’s lookup data, data conflicts arise that are hard for applications and users to reconcile.

Selective replication

Selective replication uses a filter to determine which data records in a table to replicate. It can distribute specific data to specific global data centers for regulatory compliance or to scale an application globally. It is also useful for replicating personal data to users’ mobile devices.

Selective replication is also useful for sharding data within a data center. Sharding works like table partitioning except that it spreads a table across database servers. In other words, it splits the data in a table into buckets and puts each bucket on a different database server. Selective replication can automatically replicate data to the proper shards.
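The filtering idea can be illustrated with a small sketch. It does not use FairCom's filter syntax; the record layout, target names, and helper functions are all hypothetical. Each target has a filter, and a record is replicated to every target whose filter accepts it. A shard filter is just a filter that hashes the record key into a bucket.

```python
import zlib
from typing import Callable, Dict, List

Record = Dict[str, str]
Filter = Callable[[Record], bool]


def route(records: List[Record], filters: Dict[str, Filter]) -> Dict[str, List[Record]]:
    """Return, for each target, the records its filter selects."""
    routed: Dict[str, List[Record]] = {target: [] for target in filters}
    for record in records:
        for target, keep in filters.items():
            if keep(record):
                routed[target].append(record)
    return routed


def shard_filter(key_field: str, shard: int, shard_count: int) -> Filter:
    """Build a filter that keeps records whose key hashes to the given shard."""
    def keep(record: Record) -> bool:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        return zlib.crc32(record[key_field].encode("utf-8")) % shard_count == shard
    return keep


customers = [
    {"id": "c1", "region": "EU"},
    {"id": "c2", "region": "US"},
    {"id": "c3", "region": "EU"},
]

# Regulatory-style routing: each data center receives only its region's rows.
by_region = route(customers, {
    "eu-datacenter": lambda r: r["region"] == "EU",
    "us-datacenter": lambda r: r["region"] == "US",
})

# Sharding: every record lands on exactly one of two shards.
by_shard = route(customers, {
    f"shard-{i}": shard_filter("id", i, 2) for i in range(2)
})
```

Here `by_region["eu-datacenter"]` holds the records for `c1` and `c3`, and the shard filters partition the table so that no record appears on more than one shard.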

Non-selective replication

Non-selective replication replicates all data in a table. This is the obvious choice when all data is necessary. However, when not all data is necessary, you can save time and improve performance by replicating only the data you need. In other cases, business logic or regulations may limit what data can be replicated, and there are data governance advantages to consider as well. In these cases, use selective replication to obtain only the data you want.