High availability cluster technology based on Linux


Overview

Linux is now in wide use worldwide; its shadow can be seen everywhere from stand-alone systems to installations of hundreds of servers. In many places Linux already runs large enterprise applications that demand high availability, which in turn requires:

- 24x7 uninterrupted service: the system must not expose any fault to users, which requires automatic fault recovery
- The ability to withstand high loads
- Mass storage capability

Over the next two or three years, with the continued development of the Internet and the growth of e-commerce, this class of Linux application will grow rapidly.

In the past, cluster systems were based mainly on commercial operating systems, for example Compaq's TruCluster Server, built on Tru64 UNIX, and Novell's NetWare Cluster Services (NCS). Such products are very expensive, and the choices left to the user are very limited.

Since Linux itself is licensed under the GPL and fully open, and Intel-based hardware platforms offer a high price/performance ratio, a cluster built from multiple Linux servers can provide a highly available enterprise application platform at low cost. This is likely to displace much of the expensive cluster market of the past.

In 1999 Linux cluster products were rarely seen in the market, but interest in them warmed quickly. To date, AppTime Technology, Legato Systems, Lineo, Mission Critical Linux, Motorola, PolyServe, Red Hat, SGI, SteelEye Technology, and TurboLinux have all released Linux cluster products, and HP joined their ranks in 2001.

These products are used in many fields, including embedded Linux applications, load-balanced web applications, and key enterprise applications requiring high availability. Combined with NAS or SAN data storage, they form complete enterprise application solutions.

Market leaders

Currently there are three leaders in high-availability clusters:

SteelEye's LifeKeeper for Linux

Mission Critical Linux's Convolo

Legato Cluster

LifeKeeper for Linux is proprietary; Convolo is open source; Legato Cluster is also proprietary. All three companies have dedicated cluster product development teams with experience building cluster systems on Linux and other operating systems, and their cluster products are more popular in the market than those of other vendors.

LifeKeeper for Linux originated as NCR's UNIX-based LifeKeeper. In 1999 SteelEye acquired the LifeKeeper technology from NCR and retained several of its core developers. SteelEye subsequently did substantial work porting LifeKeeper from UNIX to Linux. The company focuses on cluster technology and promises to keep developing LifeKeeper for Linux to meet growing customer needs.

Convolo stands apart among these three products: it is open source and absorbs much from the open source community. With that community's support, Mission Critical Linux believes it understands the needs of cluster users, and Convolo provides several important features, including data-consistency support and NFS failover support. Mission Critical Linux employs several engineers who previously worked at DEC and have experience with VMS Cluster and TruCluster. Legato, for its part, is the market leader in enterprise storage software; it is now focusing on cluster products and hopes to fold its storage expertise into them. So far no company other than Legato has integrated storage technology into its cluster product, which makes Legato unique among cluster product vendors.

In addition to these three companies, others have successively launched Linux cluster products, including HP, PolyServe, SGI, TurboLinux, and Red Hat.

SGI's FailSafe is worth special mention.

FailSafe has been open source since August 2000. With some 230,000 lines of source code, it is larger than any other currently known cluster system. Besides Linux FailSafe, SGI retains IRIX FailSafe, which has its own independent copyright. The source trees of Linux FailSafe and IRIX FailSafe are separate, and code can flow one way: IRIX FailSafe source can be merged quickly into Linux FailSafe, so Linux FailSafe rapidly gains IRIX FailSafe's new features, but under the GPL's restrictions Linux FailSafe source cannot be merged directly into the IRIX FailSafe tree. SGI's IRIX FailSafe developers can still draw on the open source community's ideas and fold them quickly into their product. FailSafe is now showing signs of merging with linux-ha.org, and in the near future it may well become the standard for high-availability cluster technology on Linux.

Red Hat's cluster product, Red Hat HA Server, derives mainly from LVS and inherits LVS's load balancing. Red Hat currently lags most Linux cluster vendors in cluster technology, with no sign of catching up within the year. Red Hat does have an ambitious plan for cluster products, but its progress depends heavily on the development of the open source community. Beyond cluster products, Red Hat is also active in the enterprise market and the embedded Linux market.

TurboLinux, like AppTime, was among the first companies to offer cluster products on Linux, about a year earlier than the others; TurboLinux has been involved in Linux cluster technology since 1999. Its TurboLinux Cluster Server 6 is mainly a load-balancing cluster, while its HA Server builds high-availability clusters. The overall design of TurboLinux Cluster Server 6 is deeply influenced by LVS, almost to the point of being identical.

Forecast for the next 12 months

The winner of the next 12 months may be Red Hat. By mid-2002 Red Hat could become the leader in Linux cluster technology; that depends on how much effort Red Hat chooses to invest in it. Once Red Hat develops a competitive Linux cluster product, it can use its huge advantage in the Linux distribution market to capture a large share of the Linux cluster product market.

AppTime's Watchdog reached the market in mid-1999, but it has won only a small market share, mostly in Europe. The next 12 months will determine whether it can keep competing with the other Linux cluster products; Watchdog may well remain a regional product.

Introduction to clusters and related technology

A cluster integrates two or more interconnected computers (which we call nodes) so that externally they appear as a single, unified computing resource with high availability, high performance, and easy manageability.

The most common cluster types are high-performance scientific clusters and commercial clusters; commercial clusters can be divided into load-balancing clusters and high-availability clusters.

Scientific cluster

Typically, a scientific cluster is designed to solve complex scientific problems that require enormous amounts of computation. Instead of using a specialized parallel supercomputer built from tens to tens of thousands of dedicated processors, it uses commodity systems, such as single- or dual-processor PCs linked by a high-speed interconnect, communicating over a common message-passing layer to run parallel applications. Hence the recurring news of yet another cheap "Linux supercomputer": it is actually a computer cluster whose processing power can equal that of a true supercomputer, and a typical four-way cluster configuration costs somewhat over $100,000. That seems expensive to most people, but it is cheap compared with a dedicated supercomputer costing millions of dollars.

Some parallel cluster systems achieve very high bandwidth and low latency because they bypass common network protocols such as TCP/IP. Although the Internet protocols are important for wide-area networks, they carry overhead that is unnecessary in a closed cluster network whose nodes all know one another. Indeed, some of these systems use direct memory access (DMA) between nodes, much as graphics cards and other peripherals use it within a single machine. A form of distributed shared memory can thus be accessed directly across the cluster by any processor on any node. These systems can also use low-overhead messaging layers to communicate between nodes.

The Message Passing Interface (MPI) is the most common implementation of the message-passing layer in parallel cluster systems. MPI exists in several derivatives, but in every case it provides a common API with which developers write parallel applications, so they need not manage the distribution of code segments among the cluster's nodes themselves. Beowulf systems were the first to use MPI as a common programming interface.
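
To make the idea of a common message-passing API concrete, here is a minimal sketch using mpi4py, a Python binding for MPI; the strided partial-sum decomposition is purely illustrative and not tied to any product in this survey.

```python
# Minimal MPI sketch with mpi4py: each process computes a partial sum,
# and rank 0 combines the results, with MPI handling all node-to-node
# communication. Launch with e.g.: mpiexec -n 4 python sum_demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id within the job
size = comm.Get_size()   # total number of processes in the job

# Stride the range 0..999999 across all processes.
partial = sum(range(rank, 1_000_000, size))

# Combine the partial sums on rank 0; MPI moves the data between nodes.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("total =", total)
```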

The typical representatives of the scientific cluster are Beowulf and the SGI Advanced Cluster Environment (ACE).

Load balancing cluster

Load-balancing clusters provide a more practical system for business needs. As the name suggests, the system spreads the processing load as evenly as possible across the computer cluster. The load may be an application processing load or a network traffic load. Such systems are ideal for large numbers of users running the same set of applications: each node handles part of the load, and load can be reassigned dynamically between nodes to keep it balanced. The same holds for network traffic: typically, when a web server application receives more traffic than it can process, the excess is sent to the same application running on other nodes. Handling can also be optimized for each node's particular environment.

A load-balancing cluster distributes and processes the load among multiple nodes. In most cases each node in such a cluster is a separate system running separate software, but there is always some common relationship between nodes, whether they communicate directly or a central load-balancing server controls each node's load. Typically the load is distributed using a specific algorithm, as in the sketch below.
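
As a sketch of such an algorithm, the fragment below implements plain round-robin dispatch, the simplest distribution policy; the node names are illustrative assumptions.

```python
# Round-robin dispatch: each incoming job goes to the next node in a
# fixed rotation. Node names are hypothetical.
import itertools

nodes = itertools.cycle(["node1", "node2", "node3"])

def dispatch(job: str) -> str:
    target = next(nodes)                 # next node in the rotation
    print(f"dispatching {job!r} to {target}")
    return target

for job in ("req-1", "req-2", "req-3", "req-4"):
    dispatch(job)   # req-4 wraps around to node1
```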

Network traffic load balancing is the process of examining a cluster's inbound network traffic and distributing it to individual nodes for processing. It is best suited to large network applications such as web or FTP servers. Load-balancing a network application service requires the cluster software to check each node's current load and decide which nodes can accept new jobs; this also works well for serial and batch jobs such as data analysis. These systems can also be configured to prefer the hardware or operating system capabilities of particular nodes, so the nodes in a cluster need not be uniform.

High availability cluster

High availability (HA) clusters exist to keep the cluster's overall services available as much as possible, tolerating failures in both hardware and software. If the primary node in an HA cluster fails, it is replaced by a secondary node. The secondary node is usually a mirror of the primary, so when it takes over it assumes the primary's identity, and the system environment stays consistent from the user's point of view.

HA clusters strive to keep the server system running and responding as fast as possible. They often run redundant nodes and services on multiple machines that track one another; if a node fails, its replacement takes over its responsibilities within seconds or less. To the user, therefore, the cluster never appears to stop.

Some HA clusters can also maintain redundant applications across nodes, so even when a node fails the user's application keeps running: the running application migrates to another node within seconds, and users perceive only a slightly slower response. This application-level redundancy, however, requires the software to be designed with cluster awareness and to know what to do when a node fails. On Linux this is still difficult, because current Linux systems have no HA cluster standard and no public API for building cluster-aware software.

An HA cluster can also perform load balancing, but usually the primary server runs the jobs while the system keeps the secondary server idle. The secondary is usually a mirror of the primary's operating system setup, though its hardware may differ slightly. The secondary node actively monitors the primary, or listens for its heartbeat, to see whether it is still running. If the heartbeat timer receives no response from the primary server, the secondary node takes over its network and system identity (such as the IP address and hostname).
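
The sketch below illustrates the heartbeat pattern just described: a standby node polls the primary over UDP and runs a takeover action after several consecutive misses. The addresses, timeout values, and takeover script are illustrative assumptions, not taken from any product in this survey.

```python
# Heartbeat monitor on the secondary node: ping the primary over UDP and
# trigger a takeover once several consecutive heartbeats are missed.
import socket
import subprocess
import time

PRIMARY = ("primary.example.com", 9999)   # hypothetical heartbeat endpoint
INTERVAL_S = 2.0                          # seconds between heartbeats
MAX_MISSES = 3                            # misses tolerated before takeover

def primary_alive(sock: socket.socket) -> bool:
    try:
        sock.sendto(b"ping", PRIMARY)
        data, _ = sock.recvfrom(16)
        return data == b"pong"
    except OSError:                       # timeout or network error
        return False

def monitor() -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(INTERVAL_S)
    misses = 0
    while True:
        misses = 0 if primary_alive(sock) else misses + 1
        if misses >= MAX_MISSES:
            # Assume the primary is dead: take over its IP address and
            # hostname via an external script (hypothetical path).
            subprocess.run(["/usr/local/sbin/takeover-identity.sh"])
            return
        time.sleep(INTERVAL_S)

if __name__ == "__main__":
    monitor()
```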

An HA cluster usually contains 2 to 8 or more nodes, but about 80% of today's HA clusters have only 2. AppTime's Watchdog, Hewlett-Packard's MC/ServiceGuard, Legato Cluster Enterprise, Lineo's Availix Clustering, Mission Critical Linux's Convolo, Motorola's HA-Linux, SGI's FailSafe, SteelEye's LifeKeeper, and VERITAS Cluster Server are all HA clusters.

These three basic cluster types frequently mix and overlap. One can find high-availability clusters that balance user load among their nodes while still striving for high availability; likewise, one can find parallel clusters whose applications are programmed to balance load between nodes.

Automatic failover

Automatic failover means that after a node fails, its resources are transferred automatically to other nodes. In some cluster products the failed node's resources can even be spread across several normally operating nodes. Resources that typically need to be transferred include physical disks, logical volumes, databases, IP addresses, application processes, print queues, and locks.

Automatic failover usually comes in two modes: Active/Passive and Active/Active. In Active/Passive mode, one or more nodes run the application while one node stands by as a backup; when an active node fails, the backup takes over immediately. In Active/Active mode, all nodes are active; when a node fails, its work is automatically distributed among and taken over by the other nodes. Which mode to use is a price/performance trade-off. In Active/Passive mode one node is always idle, so even after a node failure the backup substitutes for it and overall system performance does not drop. In Active/Active mode there is no reserve node: when a node fails, its tasks are spread over the other nodes, increasing their load and degrading overall system performance somewhat. For example, in a four-node Active/Active cluster with each node at 60% load, losing one node pushes the remaining three to roughly 80%.

Now consider how data is handled during a takeover:

When the cluster uses a shared-data architecture, the failed node's work is transferred to other nodes according to the system administrator's pre-set policy. When the cluster does not share data, the disks belonging to the failed node must be switched over to other nodes once a failure occurs; during this process, to preserve data consistency, the target node must prevent the failed node from accessing the data. Sometimes a node is only temporarily hung rather than dead, and this case must be distinguished too, which makes the process quite painful. By contrast, shared data has clear advantages, but a shared-data design must solve the distributed locking problem, both in normal operation and during failures, to guarantee data consistency. For clusters, NAS and SAN are two good ways to provide shared data: NAS runs over TCP/IP, while a SAN runs over the FC (Fibre Channel) protocol. NAS costs less and suits small and medium enterprises; a SAN requires a dedicated Fibre Channel network with specialized switches, costs more, and suits large applications.
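
The fragment below shows the locking principle on the smallest possible scale: a POSIX advisory lock serializing writers of one file on shared storage. A real cluster uses a distributed lock manager across nodes; this single-file sketch, with a hypothetical path, only illustrates why exclusive access is needed.

```python
# Serialize writers of a shared state file with an advisory lock
# (fcntl.flock). Only one process at a time may hold LOCK_EX.
import fcntl

def append_shared_state(path: str, line: str) -> None:
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)    # block until exclusive lock held
        try:
            f.write(line + "\n")         # safe: no concurrent writer here
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

append_shared_state("/shared/cluster.state", "node2 took over service web1")
```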

Whichever mode is used adds complexity to the management of the cluster; each method has its advantages, and none is universal. SSI-based management tools can simplify cluster administration and reduce errors during failover.

Single System Image (SSI)

SSI focuses on reducing the complexity of managing a cluster system and increasing its ease of use: it lets system administrators manage the entire cluster as if it were one machine. As cluster technology has developed, unified management tools have become a necessity, because cluster systems are so complex that very few users can master them completely.

The main advantages of SSI are:

- The operating system is installed only once for the entire cluster.
- Applications are likewise installed only once for the entire cluster.
- A single security domain.
- End users need not know which machine in the cluster their application runs on, which saves a great deal of trouble.
- Resources in the cluster can be used transparently, without knowing their physical location.
- System administration is simple, reducing that part of the cost.

Open source and Linux cluster technology

How does the open source community affect the development of Linux cluster technology? Open source does not constrain cluster development on Linux. Unlike proprietary UNIX cluster vendors, many Linux cluster vendors try to limit their changes to the Linux kernel, because they want their code to run on as many Linux distributions as possible and to reduce incompatibility with other products. The result is that roughly 50% of Linux cluster vendors ship kernel patches with their products, and those patches are very simple.

Some Linux cluster vendors are also Linux distributors. To make their cluster products run better, they modify the Linux kernel in their own distributions; because of the GPL, they must publish the source of those kernel modifications and place it online for free download with the distribution. TurboLinux is one example: it modified parts of the kernel, such as threading, and published the source of the modified portions along with its distribution for free download. Other Linux cluster vendors submit their kernel patches to the Linux kernel development team, hoping to have them merged into the standard kernel. That, of course, has great impact, especially on their competitors; the bottom line is that such a vendor must be far-sighted, and for compatibility's sake its patches must be acceptable to other vendors and other software developers.

Opinions differ here, of course. Some Linux cluster vendors believe that modifying the Linux kernel is an inevitable trend. This view resembles that of some UNIX cluster vendors such as Compaq, which fully integrated Tru64 UNIX with its hardware; Compaq believes this gives TruCluster superior performance under heavy load.

In addition, the many cluster projects now active in open source circles deeply influence Linux cluster vendors. Some vendors cut development costs by adopting open source cluster technology and integrating it into their own products; TurboLinux Cluster Server and Red Hat HA Server are both deeply influenced by LVS. Others choose to work closely with the open source community, as with SGI's FailSafe and Red Hat HA Server.

Introduction to the main Linux cluster products

Here we introduce ten Linux cluster products on the market, to give a general picture of how Linux cluster products are developing.

AppTime Watchdog 3.0

Wizard was founded in 1992 and renamed itself AppTime in 2000. Watchdog 3.0 is the company's high-availability and load-balancing cluster product; Watchdog was ported to the Linux platform in 1999. Watchdog Light is a simplified version of Watchdog that supports only two nodes and one application service.

Watchdog 3.0 does not allow shared data access: each application service that needs data can only access its own data separately, and the data is locked while being accessed, which avoids conflicting concurrent access.

When a node providing a service needs to be upgraded, the system administrator can manually fail that node over. The Linux version of Watchdog 3.0 needs no kernel changes, and the product runs on Red Hat, SUSE, Debian, FreeBSD, Windows NT/2000, Solaris, HP-UX, IRIX, AIX, and Tru64 UNIX. Watchdog 3.0 starts at $1,000 per node and Watchdog Light at $250 per node; target customers include ISPs, ASPs, telecom operators, and e-commerce sites.

The most important features of Watchdog 3.0 are listed below:

- Support for up to 32,000 nodes
- An SNMP gateway
- Support for many application services, including MySQL, Oracle, Sybase, Informix, SQL Server, Samba, and Sendmail
- Simple management tools
- Support for a variety of real servers, including Windows NT/2000 and Solaris

AppTime's cluster management software includes distributed cluster management based on X.500 or LDAP, and remote monitoring and management based on HTTP/XML.

Hewlett-Packard MC/ServiceGuard

Hewlett-Packard's offering is MC/ServiceGuard, originally a high-availability cluster product for HP-UX that retains its strong functionality after being ported to Linux. MC/ServiceGuard's high availability is well proven: more than 45,000 licenses have been sold in UNIX environments. MC/ServiceGuard also forms the basis of HP's other high-availability solutions, including several disaster-tolerant schemes (Campus Cluster, MetroCluster, and ContinentalClusters).

MC/ServiceGuard works closely with the Linux kernel; HP's forthcoming watchdog driver integrates tightly with it. The initial version of MC/ServiceGuard supports up to 4 nodes, expected to grow to 16 in later versions. In its first phase MC/ServiceGuard is installed on HP's Intel-based network servers; later it will also run on RISC-based servers. Every node in the cluster is a peer, and each provides one or more applications. When a node fails, the applications on that node transfer automatically to other nodes in the cluster, and the IP address associated with each application moves with it, so clients can connect using the same machine name or IP address.

HP's target is enterprise applications that need their critical tasks protected at the database and application layers. HP is also working with SAP and other companies to provide cluster solutions in the ERP field. Pricing for MC/ServiceGuard on Linux has not been finalized; under HP-UX there are three pricing schemes: per node, per feature, and per processor.

The important features of MC/ServiceGuard are:

- Rapid monitoring of faults and fast recovery.
- Applications remain available during system software and hardware maintenance.
- Online reconfiguration.
- A scalable load-balancing mechanism.
- Applications need no modification to gain high availability.
- Data protection.

MC/ServiceGuard manages and monitors software, hardware, and network failures; fault detection and recovery are fully automated and require no administrator intervention. Application packages can be moved from one node to another with a few simple commands. A node can undergo regular maintenance while the other nodes keep working, and rejoin the cluster when its maintenance is complete; even upgrading a node's operating system is permitted.

MC/ServiceGuard's new online reconfiguration changes the cluster's configuration while the system is running. This lets the user add or remove application packages, modify an application's properties, or modify whole packages while the cluster and its other applications keep running. Nodes can also be added or removed online.

After a node fails, the different applications on that node can transfer to different nodes, so the workload is spread across the rest of the cluster. Distributing the work this way minimizes the failure's impact on the cluster.

Legato Systems Legato Cluster

Legato is the leader in enterprise storage management software, and the company uses cluster technology to improve data and application availability. Legato Cluster is the name of a family of products. Its main component is Legato Cluster Enterprise, which contains the basic cluster technology; solution packages add modules on top of Legato Cluster Enterprise. Legato eCluster is one such solution package, containing additional modules for Apache and Netscape and for HTTP performance monitoring; Legato eCluster requires a web server. Legato's cluster products are not open source, but they provide a complete Perl development environment to support open development. Legato's cluster products were ported to Linux in April 2000.

Legato Cluster Enterprise needs no changes to the Linux kernel and places no limit on the number of servers, which means users can deploy as many servers as their application environment requires.

Legato Cluster targets customers with transaction environments that span different platforms, such as e-commerce, ASPs, and ISPs. Legato Cluster runs on SUSE, Caldera, Red Hat, and other Linux distributions, starting at $2,000 per node.

The main features of the Legato cluster products (Legato eCluster) are:

- A single configuration interface makes managing and reconfiguring large clusters easier.
- No limit on the number of servers, which matters greatly for business environments with hundreds of web servers.
- Support for multiple platforms, including Linux, UNIX, and Windows NT/2000.
- A Perl development environment that allows open source cluster solutions, and custom solutions for specific applications, to be built on the Legato Cluster architecture.
- A support module for Apache.
- TCP/IP support allows connection over a LAN or WAN.

Legato eCluster targets IP-layer load balancing. IP-layer load balancing distributes access requests for a given IP address, through a director, to all the web servers running the Legato eCluster software. If a service fails, Legato eCluster restarts it, or transfers the machine identity, IP address, and web service to an alternate machine and reconnects the service to its content. The content can be stored on a SAN, NAS, NFS, or local hard disks; if one copy of the content becomes unavailable, the service is connected to a second copy over the network.

Legato Cluster protects data integrity through server isolation detection, providing a mechanism by which a Legato Cluster node can isolate itself from the other nodes in the cluster. Its main purpose is to prevent two nodes from each believing the other has failed and both writing to the same disk space.

Legato Cluster prevents concurrent access to data: only one server can write to a given disk area at any time. Legato Cluster does not itself provide data replication; that function is available from third parties.

Legato Cluster provides a single configuration interface, so the entire cluster can be configured without configuring each node one by one.

Mission Critical Linux Convolo

Mission Critical Linux provides professional Linux application and consulting services, including pre-configured cluster solutions; Security Service Technology (SST), which provides remote management, monitoring, and security updates; and a Crash Analysis suite that provides system core-dump capabilities. Convolo is Mission Critical Linux's two-node Linux cluster product. It is based on Kimberlite, the company's open source technology, and follows the GPL. Convolo serves business users who need high availability and data integrity. It went on sale in July 2000, priced at $995 per node. Convolo was developed by a team with years of Linux development experience, whereas most other high-availability products were ported from UNIX platforms. Convolo works with all major Linux distributions and with the IA-32 and IA-64 architectures. Its targets are Internet service providers that need highly available databases as the foundation of dynamic content; enterprise users who need highly available file, mail, or database servers; and customers of its custom services.

The main features of Convolo are:

- Open source.
- A shared storage architecture supporting SCSI and Fibre Channel, in which any node can directly access all shared disks.
- Independence from the Linux version and the hardware platform.
- Support for many applications, including Oracle, Sendmail, MySQL, and NFS.

Convolo requires some changes to the kernel; all of the changes are public code.

Convolo is designed to preserve data integrity across a wide range of system failures. Its design includes support for disk sharing over SCSI or Fibre Channel interconnects. Multiple nodes can simultaneously access the cluster configuration and status information stored on the shared disk partition, while a lock mechanism ensures that only one node at a time can modify it. Convolo does not use a DLM to control access to data, so only one node can run a given service and access that service's data; if that node fails, the service and access to its data move to another node.

To guarantee data integrity, Convolo 1.2 provides a complete NFS failover capability, including the NFS lock protocol and a series of authentication mechanisms.

Convolo includes a web-based GUI administration tool and a command-line management tool, a simple installation/configuration script, and complete documentation.

PolyServe Understudy and LocalCluster

PolyServe provides two series of Linux cluster products. The first series, already released, offers failover and application recovery for small and medium Internet applications such as web servers, firewalls, VPNs, proxy servers, and FTP servers. The second series, still in development, will provide a complete range of tools to support large Internet applications.

The first series comprises Understudy and LocalCluster.

Understudy, a high-availability and load-balancing cluster product sold since November 1999, is a low-cost pure-software solution. Understudy requires each node's data to be independent and the nodes to be deployed in the same subnet. It does not provide data replication, so node data must be updated manually.

LocalCluster, a multi-node product launched in September 2000, is a truly distributed solution in which all nodes are peers. LocalCluster can replicate web data between nodes; because the data is copied to every node in the cluster, it is updated automatically once a failed node recovers. LocalCluster does not yet replicate databases or dynamic web content. All PolyServe products use group communication technology, which provides a replicated configuration database for the whole cluster and lets administrators manage the entire cluster rather than modifying each node individually.

Understudy and the other PolyServe cluster products run on Linux, FreeBSD/BSD, Windows NT/2000, and Solaris. PolyServe keeps its products as free of Linux kernel changes as possible and supports many distributions, including Red Hat, SUSE, Debian, and Slackware. A two-node Understudy costs $999 in the Linux and BSD versions; the Windows and Solaris versions cost more. LocalCluster is $1,999 per node, but a ten-node LocalCluster is $6,999, or about $699 per node.

The main features of the PolyServe cluster products are:

- Scalability: a cluster can extend to 128 nodes, and clusters accommodating thousands of nodes can be constructed.
- A cluster file system: a concurrently accessible file system built on physically shared storage.
- Configuration tools that cover the entire cluster.
- Node priorities can be specified for failover, and different services on one node can be transferred to different backup nodes.
- Multi-system management: users, printers, security, applications, and so on can be managed for a single system, a subset, or the whole cluster from one console.

PolyServe's future cluster product will be a shared-storage SSI cluster aimed at Internet data centers, ASPs, ISPs, e-commerce sites, and content providers. The product's name and details will be announced in 2001.

Red Hat High Availability Server

Red Hat is the leading Linux distributor worldwide, with more than 50% of the market. Red Hat High Availability Server is an open source cluster technology that provides dynamic load balancing and failover for TCP/IP and UDP services. High Availability Server went on sale in June 2000; a two-node load-balancing cluster is priced at $1,995, which includes a year of technical support for installation and configuration by telephone or web. Nodes can then be added at $995 each.

High Availability Server is based on Red Hat Linux 6.2 and can join other Linux versions, Solaris, and Windows NT/2000 in a heterogeneous network environment.

High Availability Server targets IP-based applications that need load balancing and high availability: web servers, FTP servers, mail gateways, and firewalls.

The main characteristics of High Availability Server are:

- Open source distribution
- Simple installation
- High performance and high scalability
- High adaptability
- Enhanced security
- Easy management

High Availability Server requires some modifications to the Linux kernel, all of which are open source.

A dedicated installer sets up the packages the cluster requires, and the number of nodes depends only on the user's hardware and network environment. High Availability Server provides enhanced security configurations for web servers working in open network environments. It can be configured in two working modes, FOS and LVS. In FOS mode the system is a two-node hot-backup cluster providing redundancy for applications. In LVS mode the system is a multi-node cluster containing two load-balancing servers that direct user requests to one or more IP-based services. The load-balancing algorithms are round robin, weighted round robin, least-connection scheduling, and weighted least-connection scheduling; the load-balancing techniques are network address translation, IP tunneling, and direct routing.
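
As a sketch of one of these policies, the fragment below implements weighted round robin: real servers with larger weights receive proportionally more requests. The server names and weights are illustrative assumptions; a production LVS director implements this inside the kernel, not in user code.

```python
# Weighted round robin: expand each real server by its weight and cycle
# through the expanded list, so "web1" (weight 3) gets 3 of every 4 requests.
import itertools

WEIGHTS = {"web1": 3, "web2": 1}   # real server -> weight (hypothetical)

def weighted_round_robin(weights):
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)

scheduler = weighted_round_robin(WEIGHTS)
for i in range(8):
    print(f"request {i} -> {next(scheduler)}")
```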

SGI Linux FailSafe

SGI has two cluster products, ACE and FailSafe. ACE is a research-stage technology, while FailSafe has already been brought to market; this article covers only FailSafe. FailSafe was released in 1995 and runs on SGI's IRIX UNIX operating system. SGI and SUSE ported it to Linux, and SGI opened FailSafe's source code in August 2000 (http://oss.sgi.com/projects/failsafe/).

Linux FailSafe is a high-availability cluster product supporting up to 8 nodes per cluster; every node can access shared RAID or mirrored disks.

Linux FailSafe provides users with a set of system recovery tools and lets them write their own recovery tools for specific applications. It offers high-availability support for several important applications, including NFS, Samba, Apache, and Oracle, and those applications need no modification.

Although Linux FailSafe has all the characteristics of a commercial cluster product, it mainly faces creative and technical professional users. Some applications (such as modeling) operate on large amounts of data in runs lasting hours or even days, and therefore need high-availability support. Linux FailSafe is suitable for commercial applications as well, but SGI's main targets are fields such as CAD/CAM, chemistry and biology, data management, engineering analysis, and scientific computing.

The main features of Linux FailSafe cluster technology are listed below:

- Highly available services
- Dynamic cluster configuration: a node can join or leave the cluster dynamically without interrupting applications.
- Java-based cluster management
- Uninterrupted node maintenance: the user can remove a node from the cluster, upgrade it, and re-add it to the cluster without affecting the cluster's services.
- Protection of data integrity

Linux FailSafe needs no modifications to the Linux kernel. It protects data integrity by ensuring that a failed node cannot write to a file system or database: when a node is found to have failed, the other nodes in the cluster remove it from the cluster.

Linux FailSafe has a Java-based GUI, which includes a FailSafe cluster viewer used to configure the cluster and display its status in dynamic graphics.

SteelEye Technology LifeKeeper for Linux

SteelEye Technology acquired the LifeKeeper technology, along with a series of cluster products built on it, from NCR in December 1999. Before then, NCR had deployed LifeKeeper on thousands of servers running in Solaris, Windows NT, and MP-RAS environments. SteelEye has made some modifications, such as the Java GUI, but in essence it is still the well-regarded NCR cluster technology. LifeKeeper went on sale in June 2000, starting at $1,500 per node; SteelEye offers node configuration and round-the-clock support services for an extra charge. LifeKeeper 3.01 supports Red Hat Linux and Caldera OpenLinux, with SUSE and other Linux distributions to follow. LifeKeeper also supports Windows NT, Solaris, and MP-RAS; Windows 2000 support is due within 2001.

LifeKeeper targets ISPs, ASPs, and customers deploying e-commerce applications and databases on the Intel platform.

The most important features of LifeKeeper are listed below:

- Maximum uptime through strong protection of critical resources.
- Scalable configuration options.
- A Java-based GUI provides good usability and easy configuration.
- Proven on thousands of servers.
- Support for multiple platforms, including Linux, UNIX, and Windows NT/2000.

The Linux version of LifeKeeper uses Linux's APIs and needs no kernel changes; applications need no changes either. The application recovery tools SteelEye provides switch applications over automatically when they fail. Currently supported applications are Oracle, Apache, Sendmail, Samba, Lotus Domino, and Informix; for an additional fee, customers can have recovery tools customized to their needs. To keep customers' access to applications and data continuous, LifeKeeper automatically transfers an application to another node when its node fails.

LifeKeeper's architecture places no upper limit on the number of nodes, but because of the limitations of Linux-based storage solutions, only two servers can be connected to each SCSI bus. This restricts shared-SCSI-disk solutions to two nodes per shared SCSI disk.

LifeKeeper provides a static data replication feature for Apache servers; a generic data replication feature will be available this year.

To achieve data integrity, LifeKeeper uses SCSI reservations to protect data on the shared SCSI bus from access by other nodes. When LifeKeeper is installed, a SCSI reservation patch is installed with it; this patch is now included in Red Hat Linux 6.2, and other Linux distributors such as Caldera have included it in their own releases.

LifeKeeper can be managed from anywhere on the network through a Java-based centralized GUI, which lets administrators manage all Linux-based applications, databases, and servers. SteelEye is now working with Caldera, Compaq, Dell, IBM, Intel, Red Hat, and other cluster technology providers, hoping to offer an easy-to-configure environment that integrates network and system configuration, similar to Hewlett-Packard's OpenView, IBM's NetView and Tivoli, and Computer Associates' Unicenter TNG.

TurboLinux Cluster Server 6

Cluster Server 6 is TurboLinux's third-generation cluster technology, descended from TurboCluster 4.0. Cluster Server 6 is mainly a load-balancing and scalability solution, though TurboLinux also bills it as a high-availability cluster scheme: the same application is available on all nodes, and once the service on one node fails it can restart on another node without interrupting service. TurboLinux itself follows the GPL, but Cluster Server 6, which runs on top of it, does not. The nodes in the cluster can run the TurboLinux, Red Hat Linux, Solaris, or Windows NT/2000 operating systems. TurboLinux claims Cluster Server 6 can bring a network application's fault-free uptime to 99.995%. Cluster Server 6 requires no specific hardware, nor any modification of the application.

A Cluster Server 6 cluster with two control nodes and two server nodes is priced at $995; one with two control nodes and ten server nodes is priced at $1,995.

Cluster Server 6 targets business applications and e-commerce sites that need an inexpensive software load-balancing solution for medium-scale web and intranet services.

The main features of Cluster Server 6 are:

- A load-balancing solution
- Almost no limit on the number of nodes
- A usable cluster management interface
- Secure connections between nodes
- Nodes can run the TurboLinux, Red Hat Linux, Solaris, or Windows NT/2000 operating systems

Cluster Server 6 modifies the Linux kernel. All kernel changes are open source and can be downloaded, but Cluster Server 6 does not ship the kernel patches separately, because the modifications are already included in the TurboLinux distribution.

Cluster Server 6's accelerated connection feature improves the performance of packet routing and transmission management, and adds three load-balancing techniques: network address translation, IP tunneling, and direct routing. A new Cluster Management Console (CMC) lets administrators observe the cluster's status, load, and performance in real time and manage cluster setup and maintenance. With a few mouse clicks an administrator can take a node out of the cluster for maintenance without affecting the cluster's performance.

Veritas Cluster Server for Linux

VERITAS Software's main products include VERITAS Cluster Server and VERITAS Global Cluster Manager, which run chiefly on Windows NT/2000, HP-UX, and Solaris. Cluster Server for Linux is based on Linux kernel 2.4 and is closed source. The technology is still in testing; it will first be released for Red Hat Linux, is expected not to modify the Linux kernel, and its price is undecided.

Cluster Server for Linux requires no specific hardware, and the applications it supports need no modification.

Cluster Server for Linux focuses on managing services rather than managing nodes. An application consists of multiple resources, some hardware-based and some software-based. For example, a database service includes one or more logical network identities, such as IP addresses; a relational database management system (RDBMS); a file system; a logical disk manager; and multiple physical disks. When a node fails, all these resources must be transferred to another node to rebuild the service.

At the lowest level, Cluster Server for Linux monitors the resources a service uses. When a failure is detected, the service is restarted automatically, either locally or by moving it to another node and restarting it there, depending on the type of failure. If a single resource fails, the entire service need not restart; only the failed part does.
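
A minimal sketch of this service-as-resource-group idea follows; the class and resource names are illustrative assumptions, not VERITAS Cluster Server's actual interfaces.

```python
# A service modeled as a group of resources; only a failed resource is
# restarted, while the rest of the service keeps running.
from dataclasses import dataclass, field

@dataclass
class Resource:
    name: str
    healthy: bool = True

    def restart(self) -> None:
        print(f"restarting resource {self.name}")
        self.healthy = True

@dataclass
class ServiceGroup:
    name: str
    resources: list = field(default_factory=list)

    def check(self) -> None:
        for r in self.resources:
            if not r.healthy:
                r.restart()   # restart only the failed part

db = ServiceGroup("db-service", [
    Resource("ip-address"),
    Resource("file-system"),
    Resource("disk-manager"),
    Resource("rdbms"),
])
db.resources[3].healthy = False   # simulate an RDBMS failure
db.check()                        # -> restarting resource rdbms
```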

Cluster Server for Linux targets Internet, telecommunications, and financial applications.

The main features of Cluster Server for Linux are:

- Scalability, from 2 to 32 nodes
- Also runs on HP-UX, Solaris, and Windows NT/2000
- Consolidated system management

Cluster Server for Linux is a cluster technology that is independent of the underlying architecture.

Conclusion

Today, deploying a cluster system is necessary to enhance system availability, improve application performance, provide load balancing, mitigate software and hardware failures, and protect critical applications, thereby keeping end users satisfied. On the UNIX platform, however, clusters suffer many shortcomings: they are too complicated, too costly to deploy, insufficiently flexible, limited in choice, expensive in hardware, and the technology is still maturing.

A newer and more attractive solution is the Linux cluster. Most existing Linux cluster products are not complicated, do not demand costly deployment, and place no special requirements on hardware. Although they are not yet as mature as UNIX cluster technology, Linux is moving into these highly available, critical applications with unstoppable momentum, and Linux cluster products will keep improving. The challenge for Linux cluster vendors is to maintain their lower complexity, higher flexibility, and lower cost relative to UNIX cluster technology.

The clusters required by Internet data centers typically include hundreds of servers, so they need scalable resource configuration rather than the high-availability model that UNIX clusters provide. Large cluster systems will therefore increasingly be deployed on the Linux platform instead of the traditional UNIX platform, and cluster management will receive more attention.

In the coming period, the main challenge facing Linux cluster vendors is to combine existing technologies into Linux-based cluster technology that surpasses today's UNIX cluster technology and meets future needs.

