This section presents an overview of the system architecture, including a discussion of storage systems and the placement of FairCom DB in your system.
Applications that rely on the c-tree “standalone” architecture use file and data management operations to directly manipulate groups of files and the information in those files. The focus is on controlling files and data directly from an application. In a multi-user environment, as users and needs grow, the simple data access afforded by the multi-user standalone architecture (sometimes referred to as “FPUTFGET”) becomes cumbersome, unstable, or impossible to implement and maintain in a practical manner, because files and user applications reside on physically separate systems. The data integrity of the application depends largely on the file locking provided by the operating system, and these complexities are amplified when client machines are not on the exact same version and patch level.
The client/server architecture divides a logical application into a front-end client and a back-end database server. The client interacts with the application user to determine specific needs and only sends relevant requests to the server asking that the needed service be performed. The server accepts this request, performs the needed service, and relays the results back to the client. The client can then present the information to the user through whatever interface is appropriate. The server acts as a central traffic cop and can support many clients simultaneously. In addition, clients may even be different applications for a truly robust application suite.
The key advantage is a division of labor between the client application and the centralized database server. Clients focus solely on the tasks they are intended to carry out for the user, while the server is dedicated to handling data as efficiently as possible. This also eliminates the duplication of valuable resources, such as memory and disk space, that occurs when these data management capabilities are incorporated into each application.
The most visible advantage, however, is pure performance. The complex issues of multiple users storing and retrieving data can be isolated within a single server process. As users increase in a strictly file-based environment, contention for those file resources quickly adds up. The server can consolidate those resources and take advantage of in-memory caching of data and indexes for scalability up to hundreds and even thousands of concurrent client connections. This is simply not possible with the standalone multi-user model.
Client/server and multi-tier computing have become the models of choice in database systems for the most basic of reasons: increased speed, control, and efficiency in data management. FairCom DB brings these benefits over traditional standalone models.
Background
Database systems are by nature designed to store and maintain data in a durable manner. This requires a storage medium, typically a hard disk drive. As data volumes have grown, storage has become cheaper and vastly more scalable. Other important considerations for a storage architecture are ease of backup, partitioning, and mirroring of data. As storage systems continue to emerge and diversify, FairCom is frequently asked about support for various storage architectures. The following information addresses the most common of these inquiries.
Database Caching
The FairCom DB database engine provides in-memory caching of updated data and index information for performance. This data is periodically flushed to disk with standard filesystem I/O calls.
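The general idea can be illustrated with ordinary POSIX calls. The sketch below is not FairCom DB source code; the page size and the flush_page() helper are assumptions made purely for illustration.

```c
/* Illustrative only: flushing one cached page to disk with ordinary POSIX
 * filesystem calls. The page size and the flush_page() helper are
 * assumptions made for this example. */
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 8192

/* Write a cached page at its file offset and force it to stable storage. */
static int flush_page(int fd, const char *page, off_t offset)
{
    if (pwrite(fd, page, PAGE_SIZE, offset) != PAGE_SIZE)
        return -1;          /* the write request did not complete */
    return fsync(fd);       /* ask the OS to push cached data to the device */
}
```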
Filesystem Semantics
Depending on the nature of the data, as in the case of an index node page or a transaction log buffer, writes may be requested to go directly to disk to guarantee absolute data integrity. To provide these integrity guarantees, it is important that the data storage system provide adequate capability. The database engine must know that a write request returned successfully to ensure the data is secured to the storage medium. Not only must the write request succeed; the ordering of requests must also be respected. This may not be the case with all storage architectures.
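One common way to express these semantics is shown in the sketch below, again using POSIX calls rather than FairCom DB internals: the log file is opened with O_SYNC so each write returns only once the data is on stable storage, and the log record is made durable before the data page it describes. The file descriptors and the commit_update() helper are assumptions for the example.

```c
/* Illustrative only: a write-ahead ordering discipline expressed with POSIX
 * calls. This is not FairCom DB's internal implementation; the helpers and
 * parameters are assumptions for the example. */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

int open_log_synchronously(const char *path)
{
    /* O_SYNC: each write() returns only after the data reaches the device,
     * giving the "the write returned, therefore it is durable" guarantee. */
    return open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0600);
}

/* Respect ordering: the log record must be durable before the data page. */
int commit_update(int log_fd, int data_fd,
                  const void *log_rec, size_t log_len,
                  const void *page, size_t page_len, off_t page_off)
{
    if (write(log_fd, log_rec, log_len) != (ssize_t)log_len)
        return -1;                      /* durable on return because of O_SYNC */
    if (pwrite(data_fd, page, page_len, page_off) != (ssize_t)page_len)
        return -1;
    return fsync(data_fd);              /* then make the data page durable */
}
```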
Two major categories of shared storage are described below.
Storage Area Networks (SAN)
SAN devices are a proven and powerful data storage technology that can offer impressive performance characteristics. SAN devices share the raw physical storage device. For large organizations requiring sophisticated data storage and fast response times, a SAN architecture is often the recommended approach. In general, SAN technologies satisfy the ordering of writes and guarantee data integrity once a write request returns, even if the SAN device itself provides internal buffering of data for enhanced performance. The most secure SAN devices provide battery backup of their internal caches for the utmost protection.
FairCom DB is successfully deployed in many SAN environments. While SAN storage is relatively expensive compared to local storage, FairCom recommends and supports SAN storage architectures for large-capacity solutions in high-throughput systems.
Remote File Systems
Remote filesystems typically share a filesystem. Contrast this with SAN devices, which share the physical storage medium. Several varieties of remote filesystems are typically encountered:
NAS systems are shared filesystems attached to a client via a standard network connection, usually TCP/IP.
NFS systems are hierarchically organized filesystems most often found in Unix environments. There are many standard and proprietary implementations, usually made available to the system as a remote network mount in the filesystem table. Their most notable design concept is the stateless client.
Microsoft Windows provides a file sharing protocol (SMB, and beginning with Windows Vista, SMB2) that allows users to easily share folders and files between workstations.
While remote filesystems are much cheaper than their SAN cousins, in general they do not provide an adequate level of I/O capability for database integrity, as they do not fully support the required filesystem semantics. That is, in most cases there is no guarantee that a write request has succeeded, and the ordering of I/O operations is not respected, so data protection for a database cannot be guaranteed. In addition, these storage systems often introduce significant network I/O latency that can severely impact performance.
NFS systems also implement file locking mechanisms that are not compatible with performance-sensitive distributed lock management.
With these considerations in mind, FairCom neither recommends nor supports remote filesystems for FairCom DB database storage.
When designing a system, the system architect determines the appropriate placement of FairCom DB in the system architecture based upon system requirements. In some cases, simplicity is the primary requirement; in other cases, scalability and fault tolerance are. This section discusses two options for integrating FairCom DB into a system architecture: a single FairCom DB architectural model and a multiple FairCom DB architectural model.
Single FairCom DB Architecture
The single FairCom DB architectural model is the simpler of the two models. In this model, one FairCom DB server serves all clients. The advantage of this model is its simplicity. Because it involves only one server, there is no application routing logic and no synchronization of data among multiple servers. The application simply connects to the server and accesses the data.
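A minimal connection sketch, based on the c-treeDB C API, is shown below. The server name, user, and password are placeholders, and the exact handle types and return codes may differ between FairCom DB versions.

```c
/* A minimal single-server connection sketch using the c-treeDB C API.
 * Server name and credentials are placeholders; details vary by version. */
#include "ctdbsdk.h"   /* c-treeDB C API header */

int main(void)
{
    CTHANDLE hSession = ctdbAllocSession(CTSESSION_CTDB);
    if (hSession == NULL)
        return 1;

    /* Connect to the single FairCom DB server that serves all clients. */
    if (ctdbLogon(hSession, "FAIRCOMS", "ADMIN", "ADMIN") != CTDBRET_OK) {
        ctdbFreeSession(hSession);
        return 1;
    }

    /* ... open tables and access data here ... */

    ctdbLogout(hSession);
    ctdbFreeSession(hSession);
    return 0;
}
```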
The tradeoff for the simplicity of this approach is that the client load and scalability of the database system are limited to the capacity of the machine on which the FairCom DB server runs. This choice of system architecture limits the ability of the system to scale to meet increased capacity requirements over time. Also, the availability of the system is determined by the availability of the single server process and the machine on which it runs. If a software or hardware component failure renders the server process or its data inaccessible, the availability of the system is directly affected.
Multiple FairCom DB Architecture
Given the scalability and fault tolerance implications of the single-server model, many enterprise-level systems implement a multiple FairCom DB architectural model. In this model, multiple FairCom DB servers serve clients, and the architecture frequently includes load balancing and data synchronization components.
Load balancing components route incoming requests to the FairCom DB servers so that the load is spread evenly among them. Load balancing enables efficient use of multiple systems, enhancing the scalability of the overall system.
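As a simple illustration, the sketch below shows a round-robin routing policy. The server list and the pick_server() helper are hypothetical; real deployments typically rely on dedicated load-balancing components rather than application code.

```c
/* Illustrative only: round-robin routing of client connections across
 * multiple FairCom DB servers. Server names and helper are hypothetical. */
#include <stddef.h>

static const char *servers[] = { "FAIRCOMS@host1", "FAIRCOMS@host2",
                                 "FAIRCOMS@host3" };
static size_t next_server = 0;

/* Return the next server name in round-robin order. */
const char *pick_server(void)
{
    const char *s = servers[next_server];
    next_server = (next_server + 1) % (sizeof servers / sizeof servers[0]);
    return s;
}
```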
In a system with more than one FairCom DB server, each server manages its own data set. The system architect decides how to partition data sets among the FairCom DB servers. For example, the design could specify that each server maintains a full copy of the data set, or that each server maintains a subset of the data set. Data synchronization components apply changes made to one data set to other data sets as required by the system.
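When each server maintains only a subset of the data, the application (or a routing layer) must decide which server owns a given record. The sketch below shows one common approach, hash partitioning on a key; the hash function and the mapping of partitions to servers are assumptions made for the example, not a FairCom DB mechanism.

```c
/* Illustrative only: mapping a partitioning key to one of n servers when
 * each server holds a subset of the data. The hash choice is an assumption. */
#include <stddef.h>
#include <stdint.h>

/* Simple FNV-1a hash of the partitioning key. */
static uint32_t key_hash(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (uint8_t)*key;
        h *= 16777619u;
    }
    return h;
}

/* Map a key to one of n_servers partitions. */
size_t partition_for(const char *key, size_t n_servers)
{
    return key_hash(key) % n_servers;
}
```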
In the event of a failure in one portion of the system, client access is automatically rerouted, ensuring continuous operation.
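A client-side view of this rerouting is sketched below. The connect_to() callback stands in for whatever connection call the application uses (for example, a c-treeDB logon); it and the server list are hypothetical, and production failover is usually handled by dedicated routing components.

```c
/* Illustrative only: try servers in order until one accepts the connection.
 * connect_to() is a hypothetical helper returning 0 on success. */
int connect_with_failover(const char *servers[], int count,
                          int (*connect_to)(const char *server))
{
    for (int i = 0; i < count; i++) {
        if (connect_to(servers[i]) == 0)
            return i;           /* index of the server actually reached */
    }
    return -1;                  /* no server was reachable */
}
```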