Selection and implementation of a load balancing scheme for web cluster services
A web application server cluster is a group of servers that run the same web application at the same time and that appear to the outside world as a single server. To balance load across the cluster and thereby optimize system performance, the cluster distributes the large volume of incoming requests across the different nodes in the system. This yields higher reliability and stability, which are characteristics that Web-based enterprise applications require.
High reliability can be seen as redundancy built into the system. If one server cannot process a given request, can another server process it effectively? In a highly reliable system, if a web server fails, another server can immediately take its place and process the request, and this happens as transparently as possible: the user is not aware of the failure.
Stability determines whether the application can support a growing volume of user requests; it is a property of the application itself. Stability is an effective measure of many factors that affect system performance, including the maximum number of users the system can support and the time required to process a request.
Two methods are currently in widespread use for balancing server load:
DNS round robin, RR-DNS (Round-Robin Domain Name System)
Load balancer
We discuss these two methods below.
DNS round robin, RR-DNS (Round-Robin Domain Name System)
The data files on a domain name server map host names to IP addresses. When you type a URL into your browser (for example, www.loadbalancedsite.com), the browser sends a request to the DNS asking for the IP address of the corresponding site; this is called a DNS query. Once the browser has the site's IP address, it connects to the site at that address to retrieve the requested page.
A domain name server (DNS) typically holds a single IP address for each site name. In our example, the site www.loadbalancedsite.com maps to 203.34.23.3.
To balance server load using DNS, the DNS server holds several different IP addresses for the same site at the same time. These IP addresses represent different machines in the cluster and are all logically mapped to the same site name. An example makes this clearer: www.loadbalancedsite.com is published by three machines in a cluster through the following three IP addresses:
203.34.23.3
203.34.23.4
203.34.23.5
In this example, the DNS server contains the following mapping table:
www.loadbalancedsite.com 203.34.23.3
www.loadbalancedsite.com 203.34.23.4
www.loadbalancedsite.com 203.34.23.5
When the first request arrives at the DNS server, it returns the IP address of the first machine, 203.34.23.3; when the second request arrives, it returns the IP address of the second machine, 203.34.23.4; and so on. When the fourth request arrives, the IP address of the first machine is returned again, and the cycle repeats.
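The rotation described above can be sketched in a few lines of Python. This is only a toy model of the behavior, not a real name server: the class name RoundRobinDNS and its resolve method are assumptions made for illustration, and the addresses are the ones from the example.

```python
from itertools import cycle

# Address records for www.loadbalancedsite.com, taken from the example above.
A_RECORDS = ["203.34.23.3", "203.34.23.4", "203.34.23.5"]


class RoundRobinDNS:
    """Toy name server (hypothetical) that rotates through one site's addresses."""

    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def resolve(self):
        # Each query returns the next address in the rotation.
        return next(self._rotation)


dns = RoundRobinDNS(A_RECORDS)
answers = [dns.resolve() for _ in range(4)]
print(answers)
```

The fourth answer is the first machine's address again, which is the cyclic behavior the text describes.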
With the DNS round robin technique described above, all requests for a given site are distributed evenly across the machines in the cluster. Consequently, with this technique all nodes in the cluster are visible to the network.
Advantages of DNS round robin
The biggest advantages of DNS round robin are that it is easy to implement and inexpensive:
Low cost and easy to set up. To support round robin, the system administrator only needs to make a few changes on the DNS server, and many newer DNS servers already include this feature. The web application needs no code changes; in fact, the web application itself is unaware of the load balancing configuration in front of it.
Simple. No network expert is needed to set it up, or to maintain it when a problem occurs.
Disadvantages of DNS round robin
This software-based load balancing method has two main shortcomings: it does not support server consistency (also called server affinity), and it does not support high reliability.
• No support for server consistency. Server consistency is a capability of a load balancing system by which the system directs a user's request to the appropriate server, depending on whether session information is kept on the server side or at the underlying database level. DNS round robin has no such intelligence. It relies instead on one of three methods to identify the user: cookies, hidden fields, or URL rewriting. Once the user has established a connection with a server through one of these text-based identifiers, all subsequent accesses go to the same server. The problem is that the server's IP address is only cached temporarily by the browser. Once the cached record expires and the name has to be resolved again, requests from the same user are likely to be processed by a different server, and all previous session information is lost.
• No support for high reliability. Imagine a cluster with n nodes. If one of the nodes goes down, every request directed to that node gets no response, which is something nobody wants to see. More advanced routers can check the nodes and remove a failed node from the list, which solves this problem. However, because ISPs across the Internet cache DNS entries, DNS updates propagate very slowly, so some users may still be directed to sites that no longer exist, while some new sites remain unreachable. Therefore, although DNS round robin solves the load balancing problem to some extent, it is not a particularly satisfactory or effective solution.
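The caching problem behind both shortcomings can be illustrated with a small simulation. This is a sketch under stated assumptions: CachingResolver, authoritative_lookup, and the 300-second lifetime are all hypothetical names and values chosen for the example, and time is passed in explicitly rather than read from a clock.

```python
from itertools import count


class CachingResolver:
    """Toy client-side cache (hypothetical) that keeps an answer for `ttl` seconds."""

    def __init__(self, lookup, ttl):
        self._lookup = lookup      # function returning the current DNS answer
        self._ttl = ttl
        self._cached = None
        self._expires_at = 0.0

    def resolve(self, now):
        # Ask the name server again only once the cached entry has expired.
        if self._cached is None or now >= self._expires_at:
            self._cached = self._lookup()
            self._expires_at = now + self._ttl
        return self._cached


live = ["203.34.23.3", "203.34.23.4", "203.34.23.5"]
queries = count()


def authoritative_lookup():
    # Round robin over whatever nodes are currently listed.
    return live[next(queries) % len(live)]


resolver = CachingResolver(authoritative_lookup, ttl=300)

first = resolver.resolve(now=0)      # the answer is cached
live.remove(first)                   # that node fails and is delisted
stale = resolver.resolve(now=100)    # cache still valid: the dead address again
fresh = resolver.resolve(now=400)    # cache expired: a different, live address
print(first, stale, fresh)
```

While the cache is valid the client keeps receiving the address of the failed node; after expiry it lands on a different server, which is where the session information held by the first server would be lost.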
In addition to the round robin method described above, there are three other scheduling methods for distributing load; all four are listed below:
• Round Robin (RRS): Distributes work evenly among the servers. (Used when the real servers have equal performance.)
• Least-Connections (LCS): Allocates more work to servers with fewer connections; the IPVS table stores all active connections. (Used when the real servers have equal performance.)
• Weighted Round Robin (WRRS): Allocates more work to servers with greater capacity. The weights can be adjusted up or down according to load information. (Used when the real servers have unequal performance.)
• Weighted Least-Connections (WLC): Allocates more work to servers with fewer connections, taking their capacity into account. Capacity is expressed by a user-specified weight, which can be adjusted up or down according to load information. (Used when the real servers have unequal performance.)
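The four scheduling methods listed above can be compared in a minimal Python sketch. The Server class and the function names are assumptions made for this example and do not come from any particular product. Weights are assumed to be positive integers, and the weighted round robin here simply repeats each server according to its weight, which is a simplification of how production schedulers interleave their choices.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Server:
    name: str
    weight: int = 1        # user-specified capacity (assumed positive)
    connections: int = 0   # currently active connections


def round_robin(servers):
    """Return an endless rotation over the servers, ignoring weight and load."""
    return cycle(servers)


def weighted_round_robin(servers):
    """Return a rotation in which each server appears `weight` times."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return cycle(expanded)


def least_connections(servers):
    """Pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.connections)


def weighted_least_connections(servers):
    """Pick the server with the lowest connections-to-weight ratio."""
    return min(servers, key=lambda s: s.connections / s.weight)


pool = [Server("a", weight=3, connections=6), Server("b", weight=1, connections=3)]

rr = round_robin(pool)
rr_order = [next(rr).name for _ in range(4)]
wrr = weighted_round_robin(pool)
wrr_order = [next(wrr).name for _ in range(4)]
lc_choice = least_connections(pool).name
wlc_choice = weighted_least_connections(pool).name
print(rr_order, wrr_order, lc_choice, wlc_choice)
```

With these sample numbers, plain least-connections picks server b because it has fewer connections, while the weighted variant picks server a because its load relative to capacity (6/3) is lower than that of b (3/1).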
Load balancer
The load balancer solves many of the problems of DNS round robin by using a virtual IP address. A cluster behind a load balancer appears to the outside as a single server with a single IP address. This IP address is virtual; it maps to the addresses of the individual machines in the cluster. To some extent, therefore, the load balancer exposes one IP address for the entire cluster to the external network. When a request reaches the load balancer, the load balancer rewrites the request's header and directs it to a machine in the cluster. If a machine is removed from the cluster, requests are not sent to a server that no longer exists, because all the machines share the same outward-facing IP address, and that address does not change even when a node is removed from the cluster. DNS entries cached on the Internet are therefore no longer a problem. When a response is returned, the client sees only the result coming back from the load balancer. That is, the client deals only with the load balancer, and whatever happens behind it is completely transparent to the client.
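The virtual IP idea can be modeled roughly as follows. This is a sketch, not how any real load balancer is implemented: the LoadBalancer class, the dictionary used as a request header, and the address 203.34.23.10 are all hypothetical, and the node is chosen by simple rotation.

```python
from itertools import count

VIRTUAL_IP = "203.34.23.10"  # hypothetical public address of the whole cluster


class LoadBalancer:
    """Toy balancer: one virtual IP in front of a pool of real machines."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._counter = count()

    def remove(self, backend):
        # Taking a node out changes only the internal pool, never the virtual IP.
        self.backends.remove(backend)

    def forward(self, request):
        # Rewrite the destination in a copy of the request header.
        backend = self.backends[next(self._counter) % len(self.backends)]
        return dict(request, dest=backend)


lb = LoadBalancer(["203.34.23.3", "203.34.23.4", "203.34.23.5"])
request = {"dest": VIRTUAL_IP, "path": "/index.html"}

before = [lb.forward(request)["dest"] for _ in range(3)]
lb.remove("203.34.23.4")
after = [lb.forward(request)["dest"] for _ in range(4)]
print(before, after)
```

The client always addresses the same virtual IP, so removing a node takes effect on the very next request, with no DNS change and no stale cache entry involved.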
Advantages of load balancer
• Server consistency. The load balancer reads the cookie or URL in each request made by the client. Based on this information, it can rewrite the header and send the request to the appropriate node in the cluster, the one that holds the session information for that client. The load balancer can provide server consistency for HTTP traffic, but not over a secure channel such as HTTPS. When a message is encrypted (SSL), the load balancer cannot read the session information hidden in it.
• High reliability through failover. Failover occurs when a node in the cluster cannot handle a request and the request has to be redirected to another node. There are two main kinds of failover:
• Request-level failover. When a node in the cluster cannot handle a request (usually because the machine is down), the request is sent to another node. However, the session information stored on the original node is lost when the request is redirected.
• Transparent session failover. When a request fails, the load balancer sends it to another node in the cluster to complete the operation, and this is transparent to the user. Because transparent session failover requires the new node to have the relevant session information, all nodes in the cluster must share a common storage area or database in which session data is stored, so that each node can obtain the information it needs when recovering a session.
• Statistics. Because all web application requests must pass through the load balancing system, the system can determine the number of active sessions, the number of active sessions at any instant, the number of responses, peak load figures, the number of sessions during peak and off-peak periods, and more. All of these statistics can be used to tune the performance of the whole system.
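Server consistency and the two kinds of failover can be contrasted in a short simulation. Everything here is an assumption made for illustration: Node, StickyBalancer, the session identifier "cookie-42", and the use of a plain dictionary as the shared session store. Each node simply counts the requests it has seen for a session, so a lost session shows up as the count starting over.

```python
class Node:
    def __init__(self, name, shared_store=None):
        self.name = name
        self.up = True
        # Sessions live in node-local memory unless a shared store is supplied.
        self.sessions = shared_store if shared_store is not None else {}

    def handle(self, session_id):
        # Count requests per session; an unknown session starts again at 1.
        self.sessions[session_id] = self.sessions.get(session_id, 0) + 1
        return self.name, self.sessions[session_id]


class StickyBalancer:
    def __init__(self, nodes):
        self.nodes = nodes
        self.affinity = {}   # session identifier (e.g. from a cookie) -> node

    def route(self, session_id):
        node = self.affinity.get(session_id)
        if node is None or not node.up:
            # First request or failover: pick the first node that is up.
            # (Raises StopIteration if every node is down.)
            node = next(n for n in self.nodes if n.up)
            self.affinity[session_id] = node
        return node.handle(session_id)


def run(shared):
    store = {} if shared else None
    nodes = [Node("n1", store), Node("n2", store)]
    lb = StickyBalancer(nodes)
    lb.route("cookie-42")        # served by n1
    lb.route("cookie-42")        # same session, same node
    nodes[0].up = False          # n1 goes down
    return lb.route("cookie-42")


request_level = run(shared=False)   # n2 takes over, session count restarts
transparent = run(shared=True)      # n2 takes over, session count continues
print(request_level, transparent)
```

Without the shared store the request is still answered after the failure, but the session is gone; with the shared store the new node picks the session up where the failed one left it.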
Disadvantages of load balancer
The disadvantages of hardware routing are cost, complexity, and a single point of failure. Because all requests pass through a single hardware load balancer, any fault in the load balancer brings down the entire site.
HTTPS request load balancing
As mentioned above, load balancing and session maintenance are difficult for HTTPS requests, because the information in these requests is encrypted and the load balancer cannot read it. However, there are two ways to solve this problem:
Proxy web server
Hardware SSL decoder
The proxy server sits in front of the server cluster. It accepts all requests and decrypts them, then redistributes them to the appropriate nodes according to their header information. This method requires no hardware support, but it places an additional burden on the proxy server.