Chen Xun Software Studio R & D Manager
This is the third article in the series "The scalability of the cluster and its distributed architecture". It introduces the layered model of a cluster's hardware and software structure, the main classification methods, and the four major elements that determine cluster design: HA, SSI, job management, and communication. The author aims to build an abstract model of the cluster from several different entry points, so that readers can analyze and design real-world clusters.
Layered model
First, let's take a look at the main components of the cluster computer system:
Multiple high-performance computers (PCs, workstations, or SMP servers)
A capable operating system (layered or microkernel-based)
High-performance network switches (Gigabit Ethernet or a proprietary switched network such as Myrinet)
Network interface cards (NICs)
Fast communication protocols and services (such as Active Messages and Fast Messages)
Cluster middleware (single system image and availability support):
    Hardware, such as DEC's Memory Channel, hardware DSM, and SMP technology
    Operating system kernel or glue layer, such as Solaris MC and GLUnix
    Applications and subsystems:
        Applications, system administration tools, and spreadsheets
        Runtime systems, software DSM, and parallel file systems
        Resource management and scheduling software, such as LSF (load balancing) and CODINE (distributed network environment computing)
Parallel programming environments and tools: compilers, PVM, MPI
Applications: serial and parallel
Not every type of cluster system needs all of these components; most systems implement only several of them, depending on specific needs. Viewed against a reference model, however, a layered description clarifies the position and role of each component and contributes to an overall understanding of the cluster.
Here we find that a cluster's composition spans almost every aspect from hardware to software. If we view the cluster as a layered structure like the OSI interconnection reference model, then from bottom to top it covers hardware construction, network design, operating systems, middleware, and the application environment. So which technology is chosen at each layer matters. Using existing, mature products greatly reduces the technical and financial risk of building a cluster, while choosing the right layer and technical point as a breakthrough is often the key to meeting performance, security, or other special requirements.
The design goals and benefits of clusters were described in earlier articles and will not be repeated here. Simply put, in addition to relatively low cost, a cluster should have the following characteristics:
High performance
Scalability
High throughput
Ease of use
High availability
Cluster classification methods
In practice, real products usually combine several of these characteristics. Based on these characteristics and on different reference factors, clusters can be classified in the following ways:
Classification of clusters
Application purpose
When classifying clusters, we usually divide them into three kinds according to their purpose:
High-performance computing (HP) clusters, which emphasize computing power. The famous Beowulf cluster is an excellent example.
High-availability (HA) commercial clusters, which emphasize availability. The MON project in the open source community and Red Hat's Piranha are inexpensive HA cluster solutions. Of course, don't forget SP2, TruCluster, and Solaris MC.
Integrated clusters, which offer HA capability, can also serve as HP clusters, and emphasize high throughput, such as MOSIX and LVS.
Applications keep changing. The earliest clusters were built only to solve computational problems. As demand grew, HA commercial clusters appeared, followed by high-throughput systems of a more integrated nature, and newer types of cluster will appear in the future.
Node ownership
From the point of view of node ownership, clusters can be divided into dedicated clusters and "part-time" clusters (also called the dedicated type and the enterprise type).
Dedicated clusters are often used for supercomputing tasks, or combine low-cost PCs into one large workstation. They have the following features:
Most are installed in racks in a machine room.
They are generally built from nodes of the same type.
They are generally accessed through a front end.
This kind of cluster mainly replaces traditional mainframes or supercomputers. A dedicated cluster can be installed, used, and managed like a single computer. It allows multiple users to log in and run interactive or batch jobs, which greatly improves throughput and shortens response time.
A "part-time" cluster is built mainly to save costs and make full use of idle resources. Its characteristics are as follows:
Each node must be a complete SMP, workstation, or PC with all of its peripherals attached.
Nodes are geographically distributed and need not be in the same room.
Nodes have multiple "homes" (that is, owners). The cluster administrator has only limited management rights over the nodes, and the owner's local tasks often take priority over the cluster's tasks.
The cluster is heterogeneous, and its interconnect is based on a standard communication network.
From this comparison it is easy to see that different ways of using node resources have produced two kinds of cluster with different ownership. In a dedicated cluster, no individual owns a particular workstation or node; resources are shared across the whole cluster, and parallel computing can run on the entire cluster. In a "part-time" cluster the workstations belong to individuals, and cluster applications run by "stealing" CPU time. This works mainly because most workstations' CPU time sits idle: even at peak periods, utilization rarely exceeds 50%. Parallel programs that run on this kind of non-dedicated, dynamically changing cluster are treated as a category of their own, and the high-throughput clusters mentioned in some articles also belong here.
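The idea of "stealing" idle CPU time while the owner's local tasks keep priority can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Workstation type, the owner_load field, the 10% reserve, and the place_job helper are all hypothetical names introduced here, not part of any cluster product mentioned in this article.

```python
from dataclasses import dataclass


@dataclass
class Workstation:
    # Hypothetical record for one "part-time" node.
    name: str
    owner_load: float  # fraction of CPU used by the owner's local tasks, 0.0 to 1.0


def idle_share(node, reserve=0.1):
    """CPU fraction the cluster may borrow.

    The owner's load and a small reserve (an assumed 10%) are subtracted
    first, so local tasks always come before cluster tasks.
    """
    return max(0.0, 1.0 - node.owner_load - reserve)


def place_job(nodes, demand):
    """Return the node with the most idle CPU that can cover demand, or None."""
    candidates = [n for n in nodes if idle_share(n) >= demand]
    if not candidates:
        return None
    return max(candidates, key=idle_share)


if __name__ == "__main__":
    nodes = [
        Workstation("ws-a", 0.5),
        Workstation("ws-b", 0.2),
        Workstation("ws-c", 0.95),
    ]
    chosen = place_job(nodes, demand=0.3)
    print(chosen.name if chosen else "no idle node")
```

A real scheduler would sample load continuously and withdraw the job when the owner returns; the sketch only shows the priority rule.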
Assembly method
The assembly method mainly depends on the interconnection technology and on how the computers are physically arranged. In a loosely coupled cluster, nodes are generally fairly independent PCs or workstations with complete peripherals: keyboard, mouse, display. They are connected to each other through a LAN and can sit in one machine room, span several buildings, or even extend across a campus (for example, over a campus network). With the development of bandwidth technology, loosely coupled clusters can now be integrated beyond the local area. For example, in network load balancing environments, many solutions cluster machines across several cities: NetEase's site servers and 263's mail servers balance load between different cities. Even so, such a cluster can still present strict "consistency" and form a unified resource.
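As a rough illustration of load balancing between sites in different cities, the following Python sketch sends each request to the site with the lowest load relative to its capacity. The site names, the (active, capacity) tuples, and the pick_site function are hypothetical; this is not how the NetEase or 263 systems are known to work.

```python
def pick_site(sites):
    """Return the name of the site with the lowest active/capacity ratio.

    sites maps a site name to a tuple (active_connections, capacity).
    Both the data layout and the ratio rule are assumptions for this sketch.
    """
    if not sites:
        raise ValueError("no sites available")
    return min(sites, key=lambda name: sites[name][0] / sites[name][1])


if __name__ == "__main__":
    sites = {
        "city-a": (80, 100),  # 80% loaded
        "city-b": (30, 100),  # 30% loaded
        "city-c": (45, 50),   # 90% loaded
    }
    print(pick_site(sites))
```

From the client's point of view the three sites still appear as one service, which is the "unified resource" the paragraph describes.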
Tightly coupled clusters tend to approach the interconnect from the perspective of space utilization and effective bandwidth. As is well known, network distance and bandwidth are, to some extent, inversely related: the shorter the distance, the higher the bandwidth a technology can achieve. Dedicated clusters therefore tend to use high-bandwidth, low-latency communication networks, strip unnecessary peripherals from the nodes, keep only the essential host components (CPU, memory, hard drive), and place the nodes in one rack or in racks close to each other. This not only makes full use of the effective bandwidth of short-range communication, such as Gigabit Ethernet or even 10 Gigabit Ethernet, but also saves the space the nodes occupy and simplifies centralized management. Tight coupling has even produced a product in which the whole cluster fits in a single chassis: the blade cluster server, also known as the blade server.
A blade server is a low-cost HAHD (High Availability High Density) server platform designed for specialized application industries and high-density computing environments. Each "blade" is actually an enhanced system motherboard. A blade can boot its own operating system, such as Windows NT/2000, Linux, or Solaris, from a local hard drive and behave like a separate server node. In this mode each motherboard runs its own system and serves a designated group of users, with no association between blades. However, SSI software can combine these motherboards into a single cluster image. In cluster mode all motherboards are connected through a high-speed backplane and share resources to serve the same user group. Inserting a new "blade" into the cluster improves overall performance. Because every "blade" is hot-swappable, replacing one is as convenient as swapping a graphics card, so the system is easy to service and maintenance time is kept to a minimum. Besides being easy to manage, this kind of cluster saves valuable rack space in the machine room and makes full use of short-distance, high-performance communication technology, so it delivers several benefits at once.
These are two typical cluster assembly methods. The loosely coupled cluster on the left is installed within a local area network, usually connected by gigabit-class Ethernet; the cluster on the right is tightly coupled, mounted in a rack, and can use higher-bandwidth communication technology with a high degree of integration. A blade server is more compact still. We can also see that both of these clusters are dedicated clusters.
Control method
Control is either centralized or decentralized. Most centrally controlled clusters are tightly coupled. For reasons of space and management convenience, the administrator controls all nodes from one console, which can be a character terminal or a graphical interface. A Beowulf cluster used for parallel computing is centrally controlled: the administrator operates the master server through shell tools or an X interface, and the individual compute nodes are not directly accessible.
Loosely coupled clusters use a mix of decentralized and centralized control. Because of their loose structure, fully centralized control is difficult to achieve. Mature management protocols such as SNMP can be used for resource allocation and scheduling in such an environment. In addition, in a loosely coupled non-dedicated cluster (a "part-time" cluster), day-to-day control remains with each node's "home", and only idle computing time is handed over to the controller.
Homogeneity
Homogeneity is relative; complete homogeneity is only the ideal of a theoretical model. In the previous article we mentioned that, among the various distributed systems, SMP has the best homogeneity, which is directly reflected in the important indicator of single system image. In most cases, cluster nodes use the same operating system or compatible hardware platforms to keep binary code as portable as possible. For example, in a Beowulf cluster the compute nodes and the server all run Linux-kernel operating systems with standard PVM and MPI2 interfaces, so computing tasks can migrate smoothly across the address space, code, and data representation of each node.
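Data representation is one place where the difference between identical and merely compatible nodes shows up. When nodes cannot be assumed to share byte order or word size, one common workaround is to exchange data in a fixed, architecture-neutral layout. The Python sketch below uses the standard struct module with network byte order; the record layout (a task id plus one value) and the encode/decode names are assumptions made for illustration, and this is not a description of how PVM or MPI encode messages.

```python
import struct

# "!" selects network (big-endian) byte order with standard sizes and no padding:
# "i" is a 4-byte signed integer, "d" is an 8-byte IEEE 754 double.
RECORD = struct.Struct("!id")


def encode(task_id, value):
    """Pack a (task_id, value) pair into 12 bytes that any node can read."""
    return RECORD.pack(task_id, value)


def decode(payload):
    """Unpack bytes produced by encode() back into (task_id, value)."""
    return RECORD.unpack(payload)


if __name__ == "__main__":
    wire = encode(7, 2.5)
    print(len(wire), decode(wire))
```

Because the layout is fixed by the format string rather than by the local CPU, a big-endian node and a little-endian node decode the same bytes to the same values.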
Heterogeneity is becoming increasingly important in the development of clusters. Through enhanced OS extension APIs or middleware-layer software, tasks can move freely between heterogeneous nodes and achieve a certain level of SSI. Some degree of SSI is necessary for load balancing and availability support. However, because binary code and data structures are incompatible between such nodes, techniques such as intermediate code, interpreters, or extended dynamic link libraries are needed to achieve this "sameness"; the Java language and the PVM parallel library are good applications in this field. With the development of Web services and XML technology, it is becoming possible to achieve satisfactory SSI capabilities across different programming languages and operating environments.
Security
Security depends on how the cluster nodes are connected to the outside world. If the nodes' physical connections or IP addresses are exposed and communication inside the cluster is unprotected, we consider the cluster to have no security. Damage by hackers or malicious users can make the cluster unavailable. However, because such a cluster needs little security planning, it is easier to implement.
If the cluster nodes are hidden behind protection such as a firewall so that nodes inside the cluster cannot be illegally accessed from outside, or if the security of the node operating systems themselves is strengthened, the cluster has a certain degree of security. Because many factors in the security environment must be considered, building such a system is inevitably harder. Currently, most commercial cluster products either use a proprietary internal communication protocol to achieve both efficiency and security, or integrate with existing security products and extend the security features of the system or the network protocol.
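A first, very rough check of whether a cluster's nodes are exposed is to see which node addresses fall outside the private address ranges. The Python sketch below uses the standard ipaddress module; the exposed_nodes function and the sample addresses are hypothetical, and a private address is only a proxy for "hidden", since real protection also depends on firewall rules and on how intra-cluster traffic is secured.

```python
import ipaddress


def exposed_nodes(node_addresses):
    """Return the addresses that are not in private (internal) ranges.

    Treating "not private" as "exposed" is a simplifying assumption.
    """
    return [
        addr
        for addr in node_addresses
        if not ipaddress.ip_address(addr).is_private
    ]


if __name__ == "__main__":
    nodes = ["10.0.0.5", "192.168.1.10", "8.8.8.8"]
    print(exposed_nodes(nodes))
```

An empty result suggests that only a front end or firewall faces the outside world, which corresponds to the second kind of cluster described above.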
About the Author:
Lin Fan is currently engaged in Linux-related research at Xiamen University. He is very interested in cluster technology and would like to exchange ideas with like-minded friends. You can contact him by email at iamafan@21cn.com.