High-end Server Technology
Source: not detailed; September 21, 2004

Server performance is measured by system response time and job throughput. Response time is the interval between the moment the user submits a request and the moment the server returns the result; job throughput is the amount of work the server completes per unit of time. Assuming that users submit requests without pause and that system resources are ample, a single user's throughput is inversely proportional to response time: the shorter the response time, the greater the throughput. To shorten the response time of a particular user or service, more resources can be allocated to it. Performance tuning means adjusting the resources allocated to each user and service program according to the application's requirements and the server's operating environment and state, so that the system's capabilities are fully exploited: users' requirements are met with as few resources as possible, and more users can be served.

Technical goals
The high scalability, high availability, manageability, and high reliability demanded of servers are not only the technical goals of manufacturers but also the needs of users.

Scalability shows itself in two ways: spare space in the chassis, and ample I/O bandwidth. As processor speeds and the number of parallel processors increase, the performance bottleneck of a server tends to shift to PCI and the devices attached to it. High scalability means that users can add components whenever needed to meet operational requirements, which protects their investment.

Availability is measured as the proportion of time the device is in service. For example, 99.9% availability means the device may be out of service for roughly 8.8 hours per year, while 99.999% availability allows only about 5 minutes of downtime per year.
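The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic (plain Python, no dependencies; the function name is ours, not from any product):

```python
# Minutes of downtime per year implied by an availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability_pct):
    """Annual downtime budget, in minutes, at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% availability -> {downtime_minutes_per_year(pct):.1f} min/year")
```

Each extra "nine" cuts the permitted downtime by a factor of ten, which is why the jump from 99.9% to 99.999% is so expensive in redundant hardware.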
Component redundancy is the basic method of improving availability. Redundant units are usually added for the components whose failure would endanger the system (such as power supplies, hard drives, fans, and PCI cards), together with structures that make replacement convenient (such as hot plugging), so that the system keeps running normally even when one of these devices fails.

Manageability means using specific technologies and products to raise system reliability and to lower the cost of purchasing, using, deploying, and supporting the system. Its most visible effect is reducing the workload of maintenance staff and avoiding the losses caused by system downtime. A server's manageability directly affects its ease of use. Of the various costs that make up TCO, management accounts for the largest share: studies show that the cost of deploying and supporting a system far exceeds the initial purchase cost, with the salaries of management and support personnel the largest item. In addition, the losses caused by reduced work efficiency, missed business opportunities, and declining revenue cannot be ignored. The manageability of a system is therefore both an urgent requirement of the IT department and a critical factor in business efficiency. Manageable products and tools simplify system administration by exposing information from inside the system. With remote management over the network, technical support staff can solve problems from their own desks without travelling to the fault site. System components can monitor their own working state, raise a warning as soon as a fault is detected, and remind maintenance personnel to take immediate steps to protect enterprise data assets; replacing the faulty component is likewise simple and convenient.

Reliability, put simply, means that the server must run stably, with as little downtime as possible.
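As a rough illustration of the self-monitoring and alerting just described, the sketch below turns component health readings into warnings for maintenance staff. The probe() function and component names are hypothetical stand-ins; a real management agent would query IPMI or SNMP sensors instead:

```python
def probe():
    # Simulated sensor readout (True = healthy); a real agent would query
    # IPMI or SNMP sensors instead. Component names are made up.
    return {"power-supply-1": True, "power-supply-2": False,
            "fan-1": True, "disk-0": True}

def alerts(status):
    """Turn component health readings into warnings for maintenance staff."""
    return [f"WARNING: {name} failed, hot-swap replacement needed"
            for name, ok in status.items() if not ok]

for line in alerts(probe()):
    print(line)  # one warning per failed component
```

Because the failed parts are redundant and hot-pluggable, such a warning triggers a routine swap rather than an outage.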
The key lies in the cooperation between the operating system and the hardware. If contention for resources is arbitrated by the CPU and the operating system, an error in a single task cannot render the whole system unable to run, and server downtime is greatly reduced; this is precisely one of the advantages of UNIX/Linux systems. Planned interruptions for routine maintenance include host upgrades, hardware maintenance or installation, operating system upgrades, application or file upgrades and maintenance, file reorganization, and full system backups. Unplanned outages include hard disk damage, system failures, software failures, user errors, power failures, human damage, and natural disasters.

SMP
SMP (Symmetric Multi-Processor) denotes the symmetric multiprocessor architecture.
In a symmetric structure, every processor in the machine has equal status. The processors are connected together and share one memory, in which a single operating system resides; any processor can run the operating system and respond to requests from external devices. In other words, each processor's access to every memory location is equal and symmetric. Models of this kind on the domestic market generally carry 4 or 8 processors, with a small number reaching 16. In general, however, the scalability of the SMP structure is poor: it is difficult to go much beyond 100 processors, and 8 to 16 is typical, though this is already enough for most users. The advantage of such machines is that they are used in much the same way as a microcomputer or workstation, so relatively few programming changes are needed; a program written for a microcomputer or workstation can be ported to an SMP machine fairly easily. The availability of the SMP structure is comparatively poor: because the 4 or 8 processors share one operating system and one memory, a problem in the operating system paralyzes the entire machine. And because scalability is limited, users' investment is not well protected. Nevertheless, the technology is mature and plenty of software is available for it, so most parallel machines currently sold on the domestic market are of this kind.

Cluster technology
A cluster connects at least two systems so that the servers can work as, or appear to the outside to be, a single machine. Cluster systems are usually adopted to improve system stability and the data-processing and service capacity of a network center. Cluster technologies of various forms have emerged since the 1980s; because clusters provide high availability and scalability, they have rapidly become a pillar of enterprise and ISP computing.

Common cluster technologies

1. Server mirroring
Server mirroring mirrors the hard drives of two servers onto each other over the same network, using software or special network hardware (such as mirror cards). One server is designated the primary and the other the standby. Clients can read and write only the mirrored volume on the primary server; that is, only the primary provides services to users over the network, while the mirrored volume on the standby is locked to prevent access to the data. The primary and standby servers monitor each other's operating state over a heartbeat link; when the primary goes down, the standby takes over its role within a short time. Server mirroring is low in cost and improves system availability, ensuring that the system remains available when one server goes down, but the technique is limited to clusters of two servers and offers no scalability.

2. Application failover clusters
Failover cluster technology connects two or more servers on the same network through cluster software. Each server in the cluster runs its own applications and has its own network address, providing services to front-end users, while at the same time monitoring the operating state of the other servers and serving as a hot backup for a designated server. When a node goes down, the server designated in the cluster system takes over the failed machine's data and applications within a short time and continues serving the front-end users. Failover clusters typically require an external storage device, a disk array cabinet: the servers are connected to the disk array via SCSI cables or fibre, and the data is stored on the array.
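The heartbeat monitoring that both server mirroring and failover clusters rely on reduces to a timeout check. This is a toy in-process model under our own names and parameters; real implementations exchange heartbeats over a dedicated serial or network link:

```python
import time

class MirrorPair:
    """Toy in-process model of the primary/standby heartbeat pattern."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout              # seconds of silence before takeover
        self.active = "primary"             # which node currently serves clients
        self.last_beat = time.monotonic()   # last heartbeat heard from the primary

    def heartbeat(self):
        """Called each time the standby hears from the primary."""
        self.last_beat = time.monotonic()

    def poll(self):
        """Standby-side check: take over if the primary has gone silent."""
        if self.active == "primary" and time.monotonic() - self.last_beat > self.timeout:
            self.active = "standby"         # unlock the mirrored volume and serve clients
        return self.active

pair = MirrorPair(timeout=0.1)
pair.heartbeat()
print(pair.poll())   # primary still alive
time.sleep(0.2)      # heartbeat link falls silent
print(pair.poll())   # standby has taken over
```

The timeout is the essential tuning knob: too short and a busy primary is declared dead prematurely; too long and users wait needlessly during a real failure.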
In such a cluster system the nodes typically back each other up in pairs, rather than several servers simultaneously backing up one machine, and the nodes monitor each other's heartbeats via a serial port, a shared disk partition, or an internal network. Failover clusters are often used for database servers, mail servers, and the like. Because of the shared storage device, this cluster technology adds peripheral cost, but it can form clusters of up to 32 machines and greatly improves the availability and scalability of the system.

3. Fault-tolerant clusters
A typical application of fault-tolerant cluster technology is the fault-tolerant machine, in which every component has a redundant design.
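The node-level takeover in a failover cluster, where a designated backup adopts a failed node's applications, might look like this in miniature (node and application names are hypothetical, and real cluster software would also remount the shared disk array and move the network address):

```python
# Each node runs its own applications and names a designated backup.
cluster = {
    "node-a": {"apps": ["database"], "backup": "node-b", "up": True},
    "node-b": {"apps": ["mail"],     "backup": "node-a", "up": True},
}

def fail_over(cluster, failed):
    """Move a failed node's applications to its designated backup node."""
    cluster[failed]["up"] = False
    backup = cluster[failed]["backup"]
    cluster[backup]["apps"] += cluster[failed]["apps"]
    cluster[failed]["apps"] = []
    return backup

took_over = fail_over(cluster, "node-a")
print(took_over, cluster[took_over]["apps"])  # node-b now runs mail and database
```

Because the data lives on the shared disk array rather than on either node, the backup only needs the application state, not a copy of the data, to resume service.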