High-Performance Cluster System Technology on the IA Architecture

xiaoxiao2021-03-06  15

1. System Overview

The data center has long been the domain of high-end RISC servers. For many years, users could choose only minicomputers such as the AS400, E10000, and HP9000, which are expensive to buy and costly to maintain. IA architecture servers are inexpensive and easy to use and maintain, and cluster technology can combine them into a supercomputer-class system. Its processing capability can replace expensive mid-range and large machines and opens a new direction for the industry's high-end applications.

For large, growing users, the volume of data held in the data center or data warehouse is enormous, and this data is very important to them. Data accumulated over several years of operation is a valuable asset. By analyzing this vast body of data, operators can obtain intuitive business charts and curves that provide strong decision support for the user's future development. However, because the data keeps growing over time, it puts great pressure on the user's IT system management.

What kind of server does the user need to meet the needs of current and future development?

First, the server must have very high computing power, sustain long-running workloads, and handle access by a large number of users.

Second, high availability of the server system is extremely important. If the system fails and service is interrupted or important information is lost, the user suffers losses that are hard to recover. Users must therefore consider a highly available design when selecting a server system.

Third, as data accumulates, queries and statistics over that data make the system slower and slower, so upgrading hardware is unavoidable for a growing large user.

The high-performance server cluster system [1] uses the latest 4-way and 8-way IA server architecture. Its leading VI (Virtual Interface) technology effectively eliminates the communication bottleneck between the nodes in the system. At the same time, the system's load balancing technology lets the user's equipment be fully utilized, and the system reaches "four nines" (99.99%) reliability at a very good price/performance ratio. Since its launch in 1999, it has provided domestic users with a powerful database server platform.

2. System Principles

The high-performance server cluster system is a cluster of 2 or 4 nodes with up to 32 CPUs and up to 32 GB of memory. Four nodes constitute a work unit, and up to 16 work units are supported.

Each node is an IA server that supports 4 or 8 Pentium III Xeon CPUs working in parallel. Each server is fitted with a Gigabit NIC or a VI-architecture high-speed switching card, which connects to a high-speed switch (a Gigabit Ethernet switch or a special-purpose high-speed switch, such as a VI-architecture switch). This switch carries the data exchange between servers and is called the SAN (Server Area Network) switch.

Each server also has a 100 Mb/s or Gigabit Ethernet card, which connects to a LAN switch or hub to provide connectivity for client access.

Four servers share a Fiber Channel disk array cabinet. Each server has two Fiber Channel cards, which connect to two Fiber Channel hubs; the hubs in turn connect to the two controllers of the Fiber Channel disk array cabinet. As long as one controller works normally, the entire array cabinet keeps working, so this configuration is redundant and prevents a single point of failure. For the most important data stored in the disk array cabinet, the cluster system can also be used to help ensure data security, and Fiber Channel allows the cluster system to be located up to 10 km from the disk array.

Each server has a local hard disk that holds the machine's boot system and the management portion of the database system. User data is stored in the shared disk array cabinet.

One client on the LAN serves as the management console. Management software for the parallel database is installed on it; this software can monitor, start, and stop the database instances on the four nodes and monitor their running performance, among other functions. In addition, a network management system, the disk cabinet management console, the UPS management console, and similar tools together provide unified management of the cluster system. Some management functions require only the TCP/IP protocol to be installed, while others also need the SNMP protocol to work properly.

Beyond good performance figures, a good cluster system needs support from operating systems and databases. Our current cluster system supports the Windows NT 4.0 and Windows 2000 operating systems [2] and, for databases, Oracle and DB2. The system is not meant to run on a single machine: its performance shows only when multiple nodes work together, which is when it can truly balance load.

2.1 Two-Node Cluster System

In terms of configuration, users can choose different options as needed. Two high-end servers can be used to implement one virtual host. In this case, VI-architecture high-speed switching equipment has a clear advantage: no VI switch is needed, because the high-speed switching cards in the two servers can be connected directly for high-speed data exchange. If Gigabit Ethernet cards are used instead, a relatively costly Gigabit switch is also required.

2.2 Four-Node Cluster System

The cluster system is designed to support 4-node clusters: four nodes, together with the corresponding storage and switching equipment, form a work unit. Each work unit consists of four independent 4-way or 8-way servers working as one virtual fault-tolerant host, and the four servers share a Fiber Channel disk array cabinet. Each server has two Fiber Channel cards, which connect to two Fiber Channel hubs; the hubs in turn connect to the two controllers of the Fiber Channel disk array cabinet. As long as one controller works normally, the entire array cabinet keeps working, so this configuration is redundant and prevents a single point of failure.

3. System Features

The high-performance server cluster system solution implements 4-node clusters, exceeding the two-node limit of traditional clusters. If each node is an 8-way server, a 4-node cluster supports 32 processors and can compete with traditional RISC minicomputers and mid-range machines.

3.1 Load Balancing

Load balancing concept: multiple servers are arranged symmetrically. Each server has equal status and can provide service on its own, without help from the other servers. A load-sharing mechanism then distributes incoming requests evenly across the servers in the symmetric structure, and the server that receives a request responds to the client independently.

1. Spread the concurrent access of a large number of users across multiple node machines. This shortens the time users wait for a response, raises the processing capability of the system, and lets more concurrent users be served. When a client requests a database connection, the database automatically decides which node the client is connected to. Load balancing of this kind does not require modifying existing applications.

2. Split a single user's heavy load across multiple node machines for parallel processing. Each node machine has multiple CPUs and also processes in parallel; when processing finishes, the results are returned to the user. A large query from one user is distributed over multiple nodes, the partial results are merged, and the merged result is given to the user, which greatly improves the system's processing capacity. Load balancing of this kind requires modifying existing applications, but only the SQL query statements.
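The two load balancing methods above can be illustrated with a minimal Python sketch. It is not the cluster's or the database's actual mechanism: the least-connections policy, the node names, and the functions `assign_connection`, `split_range`, and `parallel_sum` are all assumptions made for illustration.

```python
def assign_connection(active: dict) -> str:
    """Method 1 (assumed policy): connect the new client to the node
    with the fewest active connections; ties go to the first node."""
    node = min(active, key=active.get)
    active[node] += 1
    return node


def split_range(lo: int, hi: int, parts: int) -> list:
    """Split the half-open range [lo, hi) into `parts` contiguous chunks
    whose sizes differ by at most one."""
    step, extra = divmod(hi - lo, parts)
    chunks, start = [], lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


def parallel_sum(rows: list, parts: int) -> int:
    """Method 2: each 'node' aggregates its own chunk, then the partial
    results are merged. The nodes are simulated sequentially here."""
    partials = [sum(rows[a:b]) for a, b in split_range(0, len(rows), parts)]
    return sum(partials)


if __name__ == "__main__":
    nodes = {"node1": 0, "node2": 0, "node3": 0, "node4": 0}
    for _ in range(10):
        assign_connection(nodes)
    print(nodes)
    print(parallel_sum(list(range(100)), 4))
```

In the first method the client code is unchanged, because the placement decision is made on the server side. In the second, the query itself has to be expressed so that it can be split and merged, which matches the statement that only the SQL needs to change.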

3.2 High Availability

High availability means maximizing server uptime, that is, minimizing server downtime, so that critical data is protected and productivity improves. An availability of 99% means roughly 5,000 minutes of downtime per year, while 99.99% means roughly 50 minutes. The high-performance server cluster system solution is a true cluster system: when one node machine fails, the other node machines keep working normally and client access is not interrupted.

The high-performance server cluster system contains a great deal of redundant equipment: multiple servers, multiple UPS units, multiple switches, disk redundancy (RAID), and, in the storage device, two independent CPU/memory subsystems and even two sets of power systems for the disk cabinet buffer. This effectively shields the system from single points of failure, so that the reliability of the entire system reaches 99.99%.

The advantage of this scheme is that in normal operation the four servers process tasks in parallel with no idle resources, which greatly improves overall processing power. When one or several servers fail, the remaining servers take over the work of the failed nodes in addition to their own tasks. Their load increases, but the cluster system as a whole keeps working and client access is not interrupted. The system is designed specifically for users who cannot tolerate downtime and fundamentally eliminates the causes of outages. A failed server is simply removed from the array; once repaired, it can rejoin the server array without manual intervention. This is what we mean by parallel operation that never stops.
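The downtime figures follow directly from the number of minutes in a year. The short Python sketch below reproduces the arithmetic; the function name `downtime_minutes` is only illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        print(pct, round(downtime_minutes(pct), 2))
```

At 99% the result is about 5,256 minutes and at 99.99% about 52.6 minutes, which is where the rounded figures of 5,000 and 50 minutes come from.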
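The takeover behavior can be sketched as follows. This is a simplified model only: the source does not describe how the failed node's work is divided, so the even split, the node names, and the function `fail_node` are assumptions.

```python
def fail_node(load: dict, failed: str) -> dict:
    """Return a new load map with `failed` removed and its work spread
    as evenly as possible over the surviving nodes (assumed policy)."""
    survivors = {n: w for n, w in load.items() if n != failed}
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the work")
    share, extra = divmod(load[failed], len(survivors))
    for i, node in enumerate(survivors):
        survivors[node] += share + (1 if i < extra else 0)
    return survivors


if __name__ == "__main__":
    load = {"node1": 25, "node2": 25, "node3": 25, "node4": 25}
    print(fail_node(load, "node2"))
```

The total amount of work is unchanged after the failure; only the load per surviving node rises, which is the trade-off described above.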

3.3 Centralized Management

One client on the LAN serves as the management console. Management software for the parallel database is installed on it; this software can monitor, start, and stop the database instances on the four nodes and monitor their running performance, among other functions. In addition, a network management system, the disk cabinet management console, the UPS management console, and similar tools together provide unified management of the cluster system. Some management functions require only the TCP/IP protocol to be installed, while others also need the SNMP protocol to work properly.

3.4 Scalability

The high-performance server cluster system is a cluster of 2 or 4 nodes with up to 16 or 32 CPUs respectively, and up to 32 GB of memory. Two or four nodes constitute a work unit.

Start with one work unit for data processing. As the business grows and processing capacity proves insufficient, a second work unit can be added to raise processing performance. This shows that the cluster system scales well.

Can the parallel processing power of a four-node cluster reach four times that of a single server?

No. The parallel processing capability of a two-node cluster can essentially reach twice that of a single server, but a four-node cluster reaches only a little more than three times. As nodes are added, more resources are consumed by cluster management, so total processing power grows less than linearly.
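The scaling figures can be expressed as a parallel efficiency, that is, speedup divided by node count. The Python sketch below uses the approximate speedups quoted above (2x for two nodes, and 3x as a conservative stand-in for "a little more than three times"); the function name is illustrative.

```python
def scaling_efficiency(nodes: int, speedup: float) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / nodes


if __name__ == "__main__":
    for nodes, speedup in ((2, 2.0), (4, 3.0)):
        print(nodes, speedup, scaling_efficiency(nodes, speedup))
```

With these inputs, two nodes run at an efficiency of 1.0 and four nodes at about 0.75, meaning roughly one server's worth of capacity goes to cluster management and inter-node communication.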

3.5 VI High-Speed Transmission

VI is the abbreviation of Virtual Interface, a communication technology between servers. Strictly speaking, VI is an industry standard supported by at least 100 companies and institutions, including internationally renowned manufacturers such as Intel, Compaq, IBM, HP, and Microsoft. It does not refer to a particular piece of hardware or software but to a special network communication protocol; what we call a VI device is a device that conforms to this protocol. Such a device is sometimes also called a VIA (Virtual Interface Architecture) device.

VI devices are characterized by a high transmission rate, greater than 1.25 Gb/s, which is higher even than current Gigabit switching equipment (1 Gb/s). Because VI uses a special communication protocol rather than a traditional layered network protocol stack, its data transfer speed is much higher than that of ordinary Gigabit switching equipment. The high-performance server cluster system uses this latest communication technology for data exchange between servers; that is, the VI architecture implements high-speed data exchange between servers.

[Figure: traditional data transmission (UDP/IP) compared with data transmission using VI technology]

Of course, Gigabit NICs and a Gigabit switch can also be used for data exchange between servers, but the key difference is that the exchange is much slower than with the VI architecture. With a Gigabit NIC and Gigabit switch, data must pass through the layers of the TCP/IP protocol stack. A VI communication card bypasses the TCP/IP stack and lets the application access the card directly, that is, exchange data directly with the hardware, at a very high rate of 1.25 Gb/s and with greatly reduced CPU usage. In a cluster system, the speed of data exchange between servers directly affects overall system performance. The more nodes there are, the more significant this effect becomes, and it can even become the bottleneck of system performance, because communication between nodes consumes most of the system resources.

Applications that currently support VI communication cards and exchange data directly at the application layer include Oracle and DB2.

For a two-node configuration, VI communication cards have an even greater advantage: the two cards can be connected directly with no switch, giving good performance at low cost. With Gigabit Ethernet, a Gigabit switch is required.

In the Internet era, domestic users have a growing demand for high-end server applications. The continuous availability, online expansion, remote management, and other excellent features of the IA server architecture make it possible to meet the needs of many of these users.
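To put the two link rates side by side, the Python sketch below computes the ideal transfer time for a payload at 1 Gb/s and at 1.25 Gb/s. It looks at raw line rate only and ignores protocol and CPU overhead, which the text identifies as the larger difference; the 1 GB payload and the function name are illustrative assumptions.

```python
def transfer_seconds(payload_bytes: int, rate_bits_per_s: float) -> float:
    """Ideal time to move a payload at a given line rate, with no overhead."""
    return payload_bytes * 8 / rate_bits_per_s


if __name__ == "__main__":
    payload = 10**9  # 1 GB, chosen only for illustration
    print("Gigabit Ethernet:", transfer_seconds(payload, 1.0e9))
    print("VI link:", transfer_seconds(payload, 1.25e9))
```

At line rate alone the VI link is 20% faster for the same payload. Any further gain would come from bypassing the protocol stack, which this sketch does not model.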

When reposting, please credit the original address: https://www.9cbs.com/read-46155.html
