Increase Socket performance on Linux



Four ways to speed up your network applications

M. Tim Jones, Senior Software Engineer, Emulex

M. Tim Jones is an embedded software engineer and the author of GNU/Linux Application Programming, AI Application Programming, and BSD Sockets Programming from a Multilanguage Perspective. His engineering background ranges from kernel development for geosynchronous spacecraft to embedded systems architecture and networking protocol development. Tim is a senior software engineer at Emulex Corp.

February 13, 2006

Using the Sockets API, you can develop client and server applications that communicate across a local network or around the world over the Internet. Like any API, the Sockets API can be used in ways that promote high performance and in ways that inhibit it. This article explores four ways to use the Sockets API to get the most performance out of your application, and to tune the GNU/Linux® environment for the best results.

When developing a sockets application, the first priority is usually reliability and meeting specific functional requirements. With the four tips in this article, you can design and develop your sockets program for best performance from the start. The article covers use of the Sockets API, two socket options that improve performance, and GNU/Linux tuning.

To develop applications with good performance, follow these tips:

Minimize the latency of message transmission.
Minimize the overhead of system calls.
Adjust the TCP window for the Bandwidth Delay Product.
Dynamically tune the GNU/Linux TCP/IP stack.

Tip 1. Minimize the latency of message transmission

When you communicate through a TCP socket, the data is split into blocks so that it fits within the TCP payload for the given connection. The size of the TCP payload depends on several factors (such as the maximum packet size along the path), but these factors are known at connection time. To achieve the best performance, the goal is to fill each packet with as much of the available data as possible. When there is not enough data to fill the payload (also known as the maximum segment size, or MSS), TCP uses the Nagle algorithm to automatically coalesce small buffers into a single segment. This improves the efficiency of the application by minimizing the number of packets sent, and it reduces overall network congestion.

Although John Nagle's algorithm minimizes the number of packets sent by coalescing data into larger packets, sometimes you want to be able to send small packets. A simple example is the telnet program, which lets a user interact with a remote system, typically through a shell. If the user had to fill a segment with typed characters before any packet was sent, the experience would be far from satisfactory.

Another example is the HTTP protocol. Typically, a client browser makes a small request (an HTTP request message), and the web server returns a much larger response (the web page).

Solution

The first thing to consider is that the Nagle algorithm fulfills a need. Because the algorithm coalesces data to try to form a full TCP segment, it introduces some latency. But it also minimizes the number of packets sent on the wire, which reduces network congestion. In cases where you need to minimize transmission latency, the Sockets API provides a solution: to disable the Nagle algorithm, set the TCP_NODELAY socket option, as shown in Listing 1.

Listing 1. Disabling the Nagle algorithm for a TCP socket

/* Needs <stdio.h>, <stdlib.h>, <sys/socket.h>, <netinet/in.h>, <netinet/tcp.h> */
int sock, flag, ret;

/* Create new stream socket */
sock = socket(AF_INET, SOCK_STREAM, 0);

/* Disable the Nagle (TCP No Delay) algorithm */
flag = 1;
ret = setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (char *)&flag, sizeof(flag));

if (ret == -1) {
  printf("Couldn't setsockopt(TCP_NODELAY)\n");
  exit(-1);
}

Tip: Experiments with Samba show that disabling the Nagle algorithm can almost double read performance when reading from a Samba drive on a Microsoft® Windows® server.
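Because the right setting depends on the traffic pattern, it can help to toggle the option at run time and read it back. The following is a minimal sketch, not part of the original listings; the helper names set_nodelay and get_nodelay are hypothetical, and only the standard setsockopt and getsockopt calls are assumed.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: set TCP_NODELAY (on != 0 disables the Nagle
 * algorithm, on == 0 re-enables it). Returns 0 on success, -1 on error. */
int set_nodelay(int sock, int on)
{
    int flag = on ? 1 : 0;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if clear, -1 on error. */
int get_nodelay(int sock)
{
    int flag = 0;
    socklen_t len = sizeof(flag);

    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, &len) == -1)
        return -1;
    return flag != 0;
}
```

A caller might invoke set_nodelay(sock, 1) for an interactive session and leave the default in place for bulk transfers.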

Tip 2. Minimize the overhead of system calls

Any time you read or write data on a socket, you are using a system call. This call (such as read or write) crosses the boundary between the user-space application and the kernel. Before it gets to the kernel, your call passes through the C library to a common function in the kernel (system_call()). From system_call(), the call enters the file system layer, where the kernel determines what type of device it is dealing with. Finally, the call reaches the sockets layer, where data is read or queued for transmission on the socket (which involves a data copy).

This walk-through shows that a system call does not simply cross from the application into the kernel; it passes through many layers on both sides. The process is expensive, so the more calls you make, the more time you spend working through this call chain, and the lower your application's performance.

Because you can't avoid these system calls, the only option is to minimize how many of them you make. Fortunately, you have control over this.

Solution

When writing data to a socket, write all of the data you have available in one call rather than performing multiple writes. For reads, pass in the largest buffer you can support, because the kernel will try to fill the entire buffer if enough data is available (which also keeps TCP's advertised window open). This minimizes the number of calls and gives better overall performance.
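One way to apply this advice, sketched here under assumptions rather than taken from the article, is to hand the kernel several buffers in a single writev call instead of issuing one write per buffer, and to read with one large buffer. The helper names send_message and read_chunk are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send a header and a body with one writev() call
 * instead of two write() calls. Returns bytes written, or -1 on error. */
ssize_t send_message(int fd, const void *hdr, size_t hdr_len,
                     const void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = body_len;
    return writev(fd, iov, 2);
}

/* Hypothetical helper: fetch whatever is available with a single read()
 * into a large caller-supplied buffer. Retries only when interrupted. */
ssize_t read_chunk(int fd, void *buf, size_t buf_len)
{
    ssize_t n;

    do {
        n = read(fd, buf, buf_len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

On a socket, writev can return fewer bytes than requested, so a real caller would check the return value and send the remainder.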

Tip 3. Adjust the TCP window for the Bandwidth Delay Product

The performance of TCP depends on several factors. The two most important are the link bandwidth (the rate at which packets can be transmitted on the network) and the round-trip time, or RTT (the delay between sending a packet and receiving the response from the other end). These two values determine what is called the Bandwidth Delay Product (BDP).

Given the link bandwidth and RTT, you can calculate the BDP, but what does it mean? The BDP gives a simple way to calculate the theoretical optimal TCP socket buffer sizes (which hold both the data queued for transmission and the data awaiting receipt by the application). If the buffer is too small, the TCP window cannot fully open, which limits performance. If it is too large, precious memory resources are wasted. If you set the buffer just right, you can fully use the available bandwidth. Let's look at an example:

BDP = link_bandwidth * RTT

If your application communicates over a 100Mbps local area network with an RTT of 50 ms, the BDP is:

100Mbps * 0.050 sec / 8 = 0.625MB = 625KB

Note: The division by 8 converts bits to the bytes used for the buffer size.

So the TCP window should be set to the BDP, or 625KB. However, the default TCP window size on Linux 2.6 is 110KB, which limits the throughput of the connection to 2.2MB per second (about 17.6Mbps), calculated as follows:

throughput = window_size / RTT

110KB / 0.050 sec = 2.2MB/sec

If you instead use the window size calculated above, the throughput is 12.5MB per second, which is the full 100Mbps of the link:

625KB / 0.050 sec = 12.5MB/sec

The difference is significant and gives the socket much greater throughput. So now you know how to calculate the optimal buffer size for your socket. But how do you make the change?
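The arithmetic above can be sketched in a few lines of C. This is an illustration only: the function names are hypothetical, the RTT is assumed to be a non-zero number of milliseconds, and decimal units are used (1KB = 1,000 bytes) to match the figures in the text.

```c
#include <stdint.h>

/* BDP in bytes for a link of bandwidth_bps bits per second and an RTT in
 * milliseconds: the bits in flight during one round trip, divided by 8. */
uint64_t bdp_bytes(uint64_t bandwidth_bps, uint64_t rtt_ms)
{
    return bandwidth_bps * rtt_ms / 1000 / 8;
}

/* Upper bound on throughput in bytes per second when at most window_bytes
 * can be in flight per round trip. Assumes rtt_ms is not zero. */
uint64_t max_throughput_bytes_per_sec(uint64_t window_bytes, uint64_t rtt_ms)
{
    return window_bytes * 1000 / rtt_ms;
}
```

For the example link, bdp_bytes(100000000, 50) gives 625,000 bytes, and max_throughput_bytes_per_sec(110000, 50) gives 2,200,000 bytes per second for the 110KB default window.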

Solution

The Sockets API provides several socket options, two of which change the sizes of a socket's send and receive buffers. Listing 2 shows how to adjust them with the SO_SNDBUF and SO_RCVBUF options.

Note: Although the socket buffer size determines the size of the advertised TCP window, TCP also maintains a congestion window within the advertised window. Because of this congestion window, a given socket may never use the full advertised window.

Listing 2. Manually setting the send and receive socket buffer sizes

int ret, sock, sock_buf_size;

sock = socket(AF_INET, SOCK_STREAM, 0);

/* BDP is the bandwidth delay product in bytes, calculated as shown earlier */
sock_buf_size = BDP;

ret = setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                 (char *)&sock_buf_size, sizeof(sock_buf_size));

ret = setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                 (char *)&sock_buf_size, sizeof(sock_buf_size));

In the Linux 2.6 kernel, the size of the send buffer is taken as given by the caller, but the receive buffer is doubled automatically. You can make a getsockopt call to verify the size of each buffer.

Jumbo frames

Also consider increasing the packet size from 1,500 to 9,000 bytes (known as a jumbo frame). On a local network, you can do this by setting the Maximum Transmission Unit (MTU), and it can greatly improve performance.
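The getsockopt check mentioned above could look like the following sketch. The helper name set_and_get_bufsize is hypothetical; the value it returns is whatever the kernel reports, which may be rounded, doubled, or capped by the system-wide maximums rather than equal to the requested size.

```c
#include <sys/socket.h>

/* Hypothetical helper: request a buffer size (opt is SO_SNDBUF or
 * SO_RCVBUF) and return the size the kernel reports afterwards,
 * or -1 on error. */
int set_and_get_bufsize(int sock, int opt, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(sock, SOL_SOCKET, opt, &requested, sizeof(requested)) == -1)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, opt, &actual, &len) == -1)
        return -1;
    return actual;
}
```

Comparing the returned value with the requested one shows quickly whether a system-wide limit is holding the buffer below the BDP.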

As for window scaling, TCP originally supported windows of up to 64KB (a 16-bit value defines the window size). With the window scaling extension (RFC 1323), a 32-bit value can represent the window size. The TCP/IP stack in GNU/Linux supports this option (and many others).

Tip: The Linux kernel can also auto-tune these socket buffers (see tcp_rmem and tcp_wmem in Table 1 below), but those options affect the entire stack. If you only need to adjust the window for one connection or one type of connection, this mechanism may not do what you need.
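To make the scaling idea concrete: RFC 1323 keeps the 16-bit window field and adds a shift count of at most 14 that is applied to it. The sketch below is illustrative arithmetic only, not how the kernel chooses its scale factor; the function name window_scale_shift is hypothetical.

```c
#include <stdint.h>

/* Smallest RFC 1323 shift count (0 to 14) for which a 16-bit window field,
 * shifted left by that count, can cover window_bytes. Returns -1 if even
 * a shift of 14 is too small. */
int window_scale_shift(uint64_t window_bytes)
{
    int shift;

    for (shift = 0; shift <= 14; shift++) {
        if (((uint64_t)65535 << shift) >= window_bytes)
            return shift;
    }
    return -1;
}
```

For the 625KB window from the BDP example, window_scale_shift(625000) returns 4, because 65,535 shifted left by 3 is only 524,280 bytes.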

Tip 4. Dynamically tune the GNU/Linux TCP/IP stack

A standard GNU/Linux distribution tries to optimize for a wide range of deployments. This means that a standard distribution may not be optimal for your environment.

Solution

GNU/Linux provides a wide range of tunable kernel parameters that you can use to dynamically configure the operating system for your own use. Let's look at some of the more important options that affect socket performance.

The tunable kernel parameters live in the /proc virtual file system. Each file in this file system represents one or more parameters, which can be read with the cat utility or modified with the echo command. Listing 3 shows how to query and enable a tunable parameter (in this case, enabling IP forwarding within the TCP/IP stack).

Listing 3. Tuning: Enabling IP forwarding within the TCP/IP stack

[root@camus]# cat /proc/sys/net/ipv4/ip_forward
0
[root@camus]# echo "1" > /proc/sys/net/ipv4/ip_forward
[root@camus]# cat /proc/sys/net/ipv4/ip_forward
1
[root@camus]#
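The same query and update can be made from a program, since the tunables are ordinary files. The following is a minimal sketch under that assumption; the helper names read_tunable and write_tunable are hypothetical, and writing under /proc/sys requires root privileges.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a tunable such as
 * /proc/sys/net/ipv4/ip_forward into buf, without the trailing newline.
 * Returns 0 on success, -1 on error. */
int read_tunable(const char *path, char *buf, size_t buf_len)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)buf_len, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Hypothetical helper: write a new value to a tunable.
 * Returns 0 on success, -1 on error. */
int write_tunable(const char *path, const char *value)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    if (fputs(value, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Reading a value before and after a change, as the shell session does, is a simple way to confirm that the kernel accepted it.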

Table 1 lists several tunable parameters that can help you improve the performance of the Linux TCP/IP stack.

Table 1. Tunable kernel parameters for TCP/IP stack performance

Tunable parameter | Default value | Description
/proc/sys/net/core/rmem_default | "110592" | Defines the default receive window size; for a larger BDP, this size should be larger.
/proc/sys/net/core/rmem_max | "110592" | Defines the maximum receive window size; for a larger BDP, this size should be larger.
/proc/sys/net/core/wmem_default | "110592" | Defines the default send window size; for a larger BDP, this size should be larger.
/proc/sys/net/core/wmem_max | "110592" | Defines the maximum send window size; for a larger BDP, this size should be larger.
/proc/sys/net/ipv4/tcp_window_scaling | "1" | Enables window scaling as defined by RFC 1323; must be enabled to support windows larger than 64KB.
/proc/sys/net/ipv4/tcp_sack | "1" | Enables selective acknowledgment, which improves performance by selectively acknowledging packets received out of order (so that the sender retransmits only the missing segments); should be enabled (for wide area network communication), but it can increase CPU utilization.
/proc/sys/net/ipv4/tcp_fack | "1" | Enables forward acknowledgment, which works with selective acknowledgment (SACK) to reduce congestion; should also be enabled.
/proc/sys/net/ipv4/tcp_timestamps | "1" | Enables calculation of the RTT in a more accurate way (see RFC 1323) than the retransmission timeout; should be enabled for better performance.
/proc/sys/net/ipv4/tcp_mem | "24576 32768 49152" | Determines how the TCP stack behaves with respect to memory usage; each value is a count of memory pages (usually 4KB). The first value is the low threshold for memory usage. The second value is the threshold at which memory pressure mode begins to apply pressure to buffer usage. The third value is the maximum threshold; at this level, packets can be dropped to reduce memory usage. Increase these values for a larger BDP (but remember that the unit is memory pages, not bytes).
/proc/sys/net/ipv4/tcp_wmem | "4096 16384 131072" | Defines the per-socket memory usage for auto-tuning. The first value is the minimum number of bytes allocated to a socket's send buffer. The second value is the default (overridden by wmem_default), to which the buffer can grow when the system is not heavily loaded. The third value is the maximum number of bytes of send buffer space (overridden by wmem_max).
/proc/sys/net/ipv4/tcp_rmem | "4096 87380 174760" | Same as tcp_wmem, except that it refers to the receive buffers used for auto-tuning.
/proc/sys/net/ipv4/tcp_low_latency | "0" | Allows the TCP/IP stack to favor low latency over higher throughput; should be disabled.
/proc/sys/net/ipv4/tcp_westwood | "0" | Enables a sender-side congestion control algorithm that maintains estimates of throughput and tries to optimize the overall use of bandwidth; should be enabled for WAN communication.
/proc/sys/net/ipv4/tcp_bic | "1" | Enables Binary Increase Congestion control for fast long-distance networks; makes better use of links operating at gigabit speeds; should be enabled for WAN communication.

As with any tuning effort, the best approach is to experiment. The behavior of your application, the speed of the processor, and the amount of available memory all affect how these parameters influence performance. In some cases, what you expect to be beneficial turns out to be detrimental (and vice versa). So test each option individually and then check the result. In other words, trust your experience, but verify every change.

Tip: A note on persistent configuration. If you reboot a GNU/Linux system, any tunable kernel parameters that you changed revert to their defaults. To make your values the defaults, set them in /etc/sysctl.conf so that they are applied at boot time.
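As a rough illustration of the persistent configuration described in this tip, an /etc/sysctl.conf fragment might look like the following. Each key is the path under /proc/sys with the slashes replaced by dots. The values shown are illustrative assumptions (1MB maximum buffers), not recommendations from this article.

```
# /etc/sysctl.conf (illustrative values only)
net.core.rmem_max = 1048576
net.core.wmem_max = 1048576
net.ipv4.tcp_window_scaling = 1
```

Running sysctl -p as root loads the file without waiting for a reboot.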

GNU/Linux tools

GNU/Linux appeals to me because of the wealth of tools available. Most of them are command-line tools, but they are very useful and intuitive. GNU/Linux provides several tools, some native and some available as open source software, for debugging network applications, measuring bandwidth and throughput, and checking link usage.

Table 2 lists the most useful GNU/Linux tools along with their intended use. Table 3 lists several useful tools that are not typically provided in a GNU/Linux distribution. For more information on the tools in Table 3, see the references.

Table 2. Tools that can be found in any GNU/Linux distribution

GNU/Linux tool | Use
ping | The most common tool for checking the availability of a host, but it can also be used to identify the RTT for the bandwidth delay product calculation.
traceroute | Prints the path (route) of a connection to a network host, identifying the latency between each hop.
netstat | Reports various statistics about the networking subsystem, protocols, and connections.
tcpdump | Shows protocol-level packet trace information for one or more connections; it includes timing information, which you can use to study the packet timing of the various protocol services.

Table 3. Useful performance tools not typically provided in a GNU/Linux distribution

GNU/Linux tool | Use
netlog | Provides instrumentation of applications for network performance.
nettimer | Generates a metric for the bottleneck link bandwidth; can be used for protocol auto-tuning.
Ethereal | Provides the features of tcpdump (packet trace) in an easy-to-use graphical interface.
iperf | Measures network performance for both TCP and UDP; measures maximum bandwidth and also reports delay jitter and datagram loss.

Conclusion

Experiment with the tips and techniques in this article to improve the performance of your sockets applications: disable the Nagle algorithm to reduce transmission latency, size the socket buffers to make better use of the available bandwidth, minimize the number of system calls to reduce system call overhead, and use the tunable kernel parameters to tune the Linux TCP/IP stack.

Always take the nature of your application into account when tuning. For example, is your application LAN-based, or does it communicate over the Internet? If it operates only within a LAN, increasing the socket buffer sizes may not bring much improvement, but enabling jumbo frames certainly will! Finally, always check the results of your tuning with tcpdump or Ethereal. The changes you see at the packet level help demonstrate the effect of these techniques.
