Organization: China Interactive Publishing Network (http://www.china-pub.com/)
RFC Document Chinese Translation Program (http://www.china-pub.com/compters/emook/aboutemook.htm)
E-mail: Ouyang@china-pub.com
Translator: ()
Translation time: 2001-12-28
Copyright: The copyright of this Chinese translation belongs to China Interactive Publishing Network. It may be freely reprinted for non-commercial use, provided that the translation and this copyright notice are retained.
Network Working Group                                             John Nagle
Request for Comments: 896                                      6 January 1984
                              Ford Aerospace and Communications Corporation

               Congestion Control in IP/TCP Internetworks (RFC 896)
This memo discusses some aspects of congestion control in IP/TCP internetworks. It is intended to stimulate thought and further discussion of this topic. Although some specific suggestions are made for improved congestion control, this memo does not specify any standards.
Introduction
Congestion control is a recognized problem in complex networks. We have found that the Department of Defense's Internet Protocol (IP), a pure datagram protocol, and the Transmission Control Protocol (TCP), a transport-layer protocol, are, when used together, subject to unusual congestion problems caused by interactions between the transport and datagram layers. In particular, IP gateways are vulnerable to a phenomenon we call "congestion collapse", especially when such gateways connect networks of widely different bandwidth. We have developed solutions that prevent congestion collapse.
These problems are not widely understood because these protocols are used most often on networks built on ARPANET IMP technology. Networks based on ARPANET IMPs typically have uniform bandwidth, identical switching nodes, and substantial excess capacity. For most TCP/IP hosts and networks, this excess capacity, together with the IMP system's ability to throttle host transmissions, has been adequate to handle congestion. However, with the recent split of the ARPANET into two interconnected networks, and the growth of other networks with differing properties connected to the ARPANET, the graceful behavior of the IMP system is no longer sufficient to guarantee prompt and reliable communication. For the network to operate successfully, congestion control must be improved.
Ford Aerospace and Communications Corporation, together with its parent company, Ford Motor Company, operates what is in practice the only private TCP/IP long-haul network in existence. This network connects four facilities (one in Michigan, two in California, and one in England), several of which have large local networks attached. The network is cross-tied to the ARPANET but uses its own long-haul circuits; traffic between Ford facilities travels over private leased lines, including a leased transatlantic satellite link. All switching nodes are pure IP datagram switches with no node-to-node flow control, and the software on every host was either written by Ford or its subsidiaries or has been heavily modified by them. Link bandwidths on this network vary widely, from 1,200 to 10,000,000 bits per second. In general, we cannot afford the luxury of the excess long-haul bandwidth that the ARPANET enjoys, and our long-haul links are heavily loaded during peak periods. Transit times of several seconds are therefore common on our network.
Because of our pure-datagram orientation, heavy loading, and wide variation in bandwidth, we have had to face problems that the ARPANET/MILNET community is only beginning to recognize. Our network is sensitive to suboptimal behavior by host TCP implementations, both on and off our own net. We have devoted considerable effort to examining TCP behavior under various conditions and have solved some problems that are widespread in TCP implementations. We present here two problems and their solutions. Many TCP implementations have these problems; if, for a given TCP implementation, throughput through an ARPANET/MILNET gateway is worse than throughput across a single net, it is very likely that the implementation has one or both of these problems.
Congestion collapse
Before we proceed to discuss the two specific problems and their solutions, some description of what happens when these problems are not addressed is in order. In a heavily loaded pure datagram network with end-to-end retransmission, as switching nodes become congested, the round-trip time through the network increases, and the number of datagrams in transit within the network also increases. This is normal behavior under load. As long as there is only one copy of each datagram in transit, congestion remains under control. Once datagrams that have not yet been delivered begin to be retransmitted, there is potential for serious trouble.
Host TCP implementations are expected to retransmit packets at increasing time intervals until some upper limit on the retransmission interval is reached. Normally this mechanism is sufficient to prevent serious congestion. Even with the better adaptive host retransmission algorithms, however, a sudden load on the network can make the round-trip time grow faster than the senders' estimates of the round-trip time can be updated. Such a load arises when a new bulk transfer, such as a file transfer, begins and starts to fill a large window. If the round-trip time exceeds any host's maximum retransmission interval, that host will begin to inject more and more copies of the same datagrams into the network. The network is now in serious trouble. Eventually all available buffers in the switching nodes fill up and packets must be dropped. The round-trip time for the packets that are delivered is now at its maximum. Hosts are sending each packet several times, and eventually some copy of each packet reaches its destination. This is congestion collapse.
This condition is stable. Once the saturation point has been reached, if the algorithm for selecting packets to be dropped is fair, the network will continue to operate in a degraded condition. In this condition every packet is transmitted several times and throughput drops to a small fraction of normal. We have pushed our network into this condition experimentally and observed its stability. It is possible for the round-trip time to become so large that connections are broken because the hosts involved time out.
Congestion collapse and pathological congestion are not normally seen in the ARPANET/MILNET system because those networks have substantial excess capacity. Where connections do not pass through IP gateways, the IMP-to-host flow-control mechanisms usually prevent congestion collapse, especially since TCP implementations tend to be well tuned to the time constants of the pure ARPANET case. However, when TCP is run over the ARPANET/MILNET and packets are being dropped at gateways, nothing but the ICMP Source Quench message fundamentally prevents congestion collapse. A few badly behaved hosts can by themselves congest the gateways and prevent other hosts from passing traffic. We have repeatedly observed this problem with certain hosts on the ARPANET (with whose administrators we have communicated privately).
Adding more memory to the gateways will not solve the problem. The more memory is added, the longer the round-trip time must become before packets are dropped. The onset of congestion collapse is thereby delayed, but when collapse does occur an even larger fraction of the packets in the network will be duplicates and throughput will be even worse.
Two problems
Two key problems with the engineering of TCP implementations have been observed; we call them the small-packet problem and the source-quench problem. The second is being addressed by several implementors; the first is generally believed (incorrectly) to be solved. We have found that once the small-packet problem is solved, the source-quench problem becomes much easier to handle. We therefore present the small-packet problem and its solution first.
The small-packet problem
There is a particular problem associated with small packets. When TCP is used to transmit single-character messages originating at a keyboard, the typical result is that a 41-byte packet (one byte of data, 40 bytes of header) is transmitted for each byte of useful data. This 4000% overhead is annoying but tolerable on a lightly loaded network. On a heavily loaded network, however, the congestion resulting from this overhead can lead to lost datagrams and retransmissions, as well as excessive transit times caused by congestion in switching nodes and gateways. In practice, throughput on TCP connections may drop severely.
This classic problem is well understood; it was first addressed in the Tymnet network in the late 1960s. The solution used there was to impose a limit on the number of datagrams generated per unit time. This limit was enforced by delaying the transmission of small packets for a short time (200-500 ms), in the hope that another character or two would arrive before the timer expired and could be carried in the same datagram. To improve user acceptability, an additional feature inhibits the delay when a control character (for example, a carriage return) arrives.
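For illustration only, the following Python sketch shows how such a timer-based batching scheme might behave. It is not drawn from any actual implementation; the 200 ms hold time is an assumed value within the range cited above, and the function and constant names are invented for the example.

```python
# Illustrative sketch of the classic timer-based scheme: each arriving
# character is buffered until the hold timer expires, except that a
# control character (e.g. carriage return) flushes the buffer at once.

HOLD_MS = 200            # assumed hold time within the 200-500 ms range above
CONTROL_CHARS = {0x0d}   # carriage return inhibits the delay

def batch_characters(arrivals):
    """arrivals: list of (time_ms, byte). Returns list of (send_time_ms, bytes)."""
    packets, buf, deadline = [], bytearray(), None
    for t, ch in arrivals:
        if deadline is not None and t >= deadline:   # timer expired: flush
            packets.append((deadline, bytes(buf)))
            buf, deadline = bytearray(), None
        buf.append(ch)
        if deadline is None:
            deadline = t + HOLD_MS                   # start the hold timer
        if ch in CONTROL_CHARS:                      # control char: send now
            packets.append((t, bytes(buf)))
            buf, deadline = bytearray(), None
    if buf:
        packets.append((deadline, bytes(buf)))
    return packets

# Example: a user types "ls" followed by a carriage return.
print(batch_characters([(0, ord("l")), (100, ord("s")), (150, ord("\r"))]))
# -> [(150, b'ls\r')]: all three characters leave in one packet
```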
This technique has been used in NCP Telnet, X.25 PADs, and TCP Telnet. Its advantage is that it is well understood and not too difficult to implement. Its flaw is that it is hard to choose a time limit that satisfies everyone. A time limit short enough to provide highly responsive service over a 10 Mbps Ethernet is too short to prevent congestion collapse over a heavily loaded network with a 5-second round-trip time; conversely, a time limit long enough to handle the heavily loaded network will frustrate users on the Ethernet.
The solution to the small-packet problem
Clearly an adaptive approach would be desirable; one could imagine a proposal for an adaptive inter-packet time limit based on the round-trip delay observed by TCP. While such a mechanism could certainly be implemented, it is unnecessary. We have found a simple and elegant solution.
The solution is to inhibit the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged. This inhibition is unconditional: no timers, no tests on the amount of data accumulated, and no other conditions are required. Implementation typically requires only one or two lines inside a TCP program.
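As an illustration only, the Python sketch below shows the send-side decision this rule implies. The class and field names (Connection, unacked_bytes, send_buffer) are invented for the example and are not drawn from any particular TCP implementation.

```python
# Minimal sketch of the rule described above: new user data is sent at once
# only if nothing previously sent on the connection is still unacknowledged;
# otherwise it is queued and flushed when the next ACK/window update arrives.

class Connection:
    def __init__(self):
        self.unacked_bytes = 0   # data sent but not yet acknowledged
        self.send_buffer = b""   # user data waiting to be packetized

    def transmit(self, data):
        """Stand-in for handing a segment to IP."""
        print(f"send segment: {len(data)} byte(s)")
        self.unacked_bytes += len(data)

    def user_write(self, data):
        self.send_buffer += data
        if self.unacked_bytes == 0:          # connection idle: send at once
            self.transmit(self.send_buffer)
            self.send_buffer = b""
        # else: defer; the data rides in the packet triggered by the next ACK

    def ack_received(self, acked):
        self.unacked_bytes -= acked
        if self.unacked_bytes == 0 and self.send_buffer:
            self.transmit(self.send_buffer)  # flush everything queued so far
            self.send_buffer = b""

# Example: 25 keystrokes arriving while the first byte's ACK is outstanding.
conn = Connection()
conn.user_write(b"a")             # idle connection: one 1-byte segment goes out
for ch in b"bcdefghijklmnopqrstuvwxy":
    conn.user_write(bytes([ch]))  # queued behind the unacknowledged byte
conn.ack_received(1)              # ACK arrives: the 24 queued bytes go in one segment
```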
At first glance this solution seems to imply drastic changes in the behavior of TCP. It does not; everything works out right in the end. Let us see why this is so.
When a user process writes data to a TCP connection, TCP receives some data. It may hold that data for later transmission or it may send a packet immediately. If it refrains from sending now, it will typically send the data later, when an incoming packet arrives and changes the state of the system. The state can change in one of two ways: the incoming packet acknowledges data that the remote host has received, or it announces the availability of buffer space at the remote host for new data (the latter is called "updating the window"). Each time data arrives on a connection, TCP must re-examine its current state and perhaps send some packets. Thus, when we refrain from sending data as it arrives from the user, we are simply deferring its transmission until the next message arrives from the remote host. A message must always arrive soon unless the connection was previously idle or communication with the other end has been lost. In the first case, the idle connection, our scheme causes a packet to be sent whenever the user writes to the TCP connection; thus we do not deadlock in the idle state. In the second case, where the remote host has failed, sending more data is futile anyway. Note that we have done nothing to inhibit normal TCP retransmission logic, so lost messages are not a problem.
Examination of the behavior of this scheme under various conditions shows that it works in all cases. The first case to examine is the one we set out to solve: the character-oriented Telnet connection. Suppose the user is sending TCP a new character every 200 ms, and that the connection is over an Ethernet with a round-trip time, including software processing, of 50 ms. Without any mechanism to prevent small-packet congestion, one packet is sent for each character and response is optimal. Overhead is 4000%, but this is acceptable on an Ethernet. The classic timer scheme, with a limit of two packets per second, would cause two or three characters to be carried in each packet. Response would thus be degraded even though, on a high-bandwidth Ethernet, this is unnecessary. Overhead drops to 1500%, but on an Ethernet this is a poor trade-off. With our scheme, every character the user types finds the TCP connection idle and is transmitted at once, just as in the no-control case. The user sees no delay. Thus our scheme performs as well as the no-control scheme and gives better responsiveness than the timer scheme.
The second case to examine is the same Telnet test, but over a long-haul link with a 5-second round-trip time. Without any mechanism to prevent small-packet congestion, 25 new packets would be sent in 5 seconds; overhead here is 4000%. With the classic timer scheme and the same limit of two packets per second, ten packets would still be outstanding and contributing to congestion. The round-trip time is not improved by sending many packets, of course; in general it becomes worse as the packets contend for line time. Overhead now drops to about 1500%. With our scheme, however, the first character from the user finds an idle TCP connection and is sent immediately. The next 24 characters, arriving from the user at 200 ms intervals, are held pending a message from the remote host. When the ACK for the first packet arrives at the end of the 5 seconds, the 24 queued characters are sent in a single packet. Our scheme thus reduces the overhead to 320% with no penalty in response time. Response time with our scheme is usually improved, because the packet overhead is reduced; congestion is reduced accordingly and round-trip delay drops sharply. In this case our scheme is dramatically better than the other approaches.
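The overhead figures quoted for this case can be reproduced with a short calculation. The sketch below assumes 40 header bytes per packet and the scenario described above: 25 single-character writes arriving during one 5-second round trip.

```python
# Reproducing the overhead figures for the 5-second round-trip Telnet case:
# 40 bytes of TCP/IP header per packet, 25 one-byte keystrokes per round trip.
HEADER_BYTES = 40
DATA_BYTES = 25          # one byte of useful data per keystroke

def overhead_pct(packets):
    return 100.0 * packets * HEADER_BYTES / DATA_BYTES

print(overhead_pct(25))  # no control: one packet per character -> 4000.0 %
print(overhead_pct(2))   # our scheme: 1 char now + 24 chars in one later packet -> 320.0 %
```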
We use our scheme for all TCP connections, not just Telnet connections. Let us see what happens when our technique is applied to a file-transfer data connection. The same two extreme cases will be examined.
As before, we first consider the Ethernet case. The user is now writing data to TCP in 512-byte blocks as fast as TCP will accept them. The user's first write starts things going; our first datagram will be 512 + 40 bytes, or 552 bytes, long. The user's second write does not cause a transmission but causes the block to be buffered. Assume the user fills TCP's outgoing buffer area before the first acknowledgment comes back. Then, when the ACK arrives, all the queued data up to the window size is sent; from then on the window is kept full, since each ACK initiates a sending cycle and queued data is sent out. Thus, after an initial period of one round-trip time during which only one block is sent, our scheme settles into a maximum-throughput condition. Since the startup delay is only 50 ms on the Ethernet, the startup transient is insignificant. All three schemes provide equivalent performance in this case.
Finally, let us look at a file transfer over the connection with a 5-second round-trip time. Again, only one packet is sent until the first acknowledgment comes back; the window is then filled and kept full. Since the round-trip time is 5 seconds, only 512 bytes of data are transmitted in the first 5 seconds. Assuming a 2K window, once the first ACK arrives, 2K of data is sent and a steady rate of 2K per round trip is maintained thereafter. Only in this case is our scheme inferior to the timer scheme, and the difference is only in the startup transient; steady-state throughput is identical. Under these conditions the naive scheme and the timer scheme would take 250 seconds to transmit a 100K-byte file, while our scheme would take 254 seconds, a difference of 1.6%.
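The 250-second and 254-second figures follow from simple arithmetic; the sketch below reproduces them under the stated assumptions (100K-byte file, 2K window, 512-byte blocks, 5-second round-trip time, times rounded to whole seconds).

```python
# Reproducing the startup-transient comparison above.
KB = 1024
FILE, WINDOW, BLOCK, RTT = 100 * KB, 2 * KB, 512, 5.0

# Timer / naive schemes: the window is kept full from the start.
timer_time = (FILE / WINDOW) * RTT                            # 250.0 seconds

# Our scheme: only one 512-byte block goes out in the first round trip,
# then the window is kept full for the rest of the file.
ours_time = round(RTT + ((FILE - BLOCK) / WINDOW) * RTT)      # 254 seconds

print(timer_time, ours_time)
print(f"difference: {100 * (ours_time - timer_time) / timer_time:.1f}%")  # 1.6%
```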
Congestion control with ICMP
Having solved the small-packet problem, and with it the extreme cases of congestion that small packets caused on our own network, we turned our attention to general congestion control. Since our network is a pure datagram network with no node-to-node flow control, the only mechanism available to us under the IP standard is the ICMP Source Quench message. With careful handling, we have found it adequate to prevent serious congestion problems. We did find it necessary to be careful about how our hosts and switching nodes behave with respect to Source Quench messages.
When to send an ICMP Source Quench
The current ICMP standard specifies that an ICMP Source Quench message should be sent whenever a packet is dropped, and in addition that one may be sent when a gateway finds itself running short of resources. There is some ambiguity here, but it is clearly a violation of the standard to drop a packet without sending an ICMP message.
Our basic assumption is that packets ought not to be dropped during normal network operation. We therefore want to throttle senders back before switching nodes and gateways become overloaded. All of our switching nodes send ICMP Source Quench messages well before buffer space is exhausted; they do not wait until a packet has to be dropped before sending one. As our analysis of the small-packet problem demonstrated, merely providing large amounts of buffering is no solution. In general, our experience is that a Source Quench should be sent when about half of the buffer space is in use; this is not based on extensive experimentation, but it appears to be a reasonable engineering decision. One could argue for an adaptive scheme that adjusts the quench-generation threshold based on recent experience; we have not yet found this necessary.
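As an illustration of this policy (not a description of our actual switching-node code), a gateway's output queue might apply the half-full threshold roughly as follows; the buffer size and function names are assumptions made for the example.

```python
# Sketch: generate an ICMP Source Quench once roughly half of the output
# buffering is in use, well before any packet has to be dropped.

BUFFER_CAPACITY = 64                      # assumed packet buffers per output link
QUENCH_THRESHOLD = BUFFER_CAPACITY // 2

queue = []                                # packets waiting on this output link

def send_source_quench(packet):
    print(f"ICMP Source Quench -> {packet['src']}")

def enqueue(packet):
    if len(queue) >= BUFFER_CAPACITY:
        # Buffers exhausted: the packet must be dropped, and per the standard
        # a Source Quench is still sent to the originator.
        send_source_quench(packet)
        return False
    if len(queue) >= QUENCH_THRESHOLD:
        # Early warning: throttle the sender before loss becomes necessary.
        send_source_quench(packet)
    queue.append(packet)
    return True

# Example: once 32 packets are queued, each new arrival triggers a quench.
for i in range(40):
    enqueue({"src": f"host{i % 3}", "seq": i})
```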
There are other gateway implementations that generate a Source Quench only after more than one packet has been discarded. We consider this approach undesirable, because any congestion-control system based on discarding packets wastes bandwidth and may be susceptible to congestion collapse under heavy load. Our understanding is that this reluctance to generate Source Quenches stems from a fear that acknowledgment traffic will be quenched and that this will cause connections to fail. As shown below, proper handling of Source Quench in host implementations eliminates this possibility.
What to do when an ICMP Source Quench is received
When ICMP receives a Source Quench message, we inform TCP, or whatever other protocol is at that layer. The basic action of our TCP implementation is to reduce the amount of outstanding data on connections to the host named in the Source Quench. This control is applied by making the sending TCP behave as if the remote host's window size had been reduced. Our first implementation was simplistic but effective: once a Source Quench is received, our TCP behaves as if the window size were zero whenever the window is not empty. This behavior continues until some number of ACKs (currently 10) have been received, at which point TCP returns to normal operation. David Mills of Linkabit has implemented a similar but more elaborate throttle on the number of outstanding packets in his DCN systems. The additional sophistication appears to produce a modest gain in throughput, but we have not made formal tests. Both implementations effectively prevent congestion collapse in switching nodes.
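A minimal sketch of the simple host-side behavior described above, with invented names, might look like the following; the recovery count of 10 ACKs is the value mentioned in the text.

```python
# Sketch: after a Source Quench, the sender behaves as if the offered window
# were zero until a fixed number of ACKs have arrived, then resumes normally.

ACKS_TO_RECOVER = 10

class QuenchState:
    def __init__(self):
        self.acks_until_normal = 0          # 0 means normal operation

    def source_quench_received(self):
        self.acks_until_normal = ACKS_TO_RECOVER

    def ack_received(self):
        if self.acks_until_normal > 0:
            self.acks_until_normal -= 1

    def effective_window(self, advertised_window):
        # While quenched, pretend the peer advertised a zero window, so no
        # new data is sent; ACKs and retransmissions are not inhibited.
        return 0 if self.acks_until_normal > 0 else advertised_window

state = QuenchState()
state.source_quench_received()
print(state.effective_window(4096))   # 0: transmission of new data stops
for _ in range(10):
    state.ack_received()
print(state.effective_window(4096))   # 4096: back to normal operation
```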
A Source Quench thus has the effect of limiting the connection to a small number (perhaps one) of outstanding messages. Communication can therefore continue, but at a reduced rate; this is exactly the desired effect.
This scheme has the important property that Source Quench does not inhibit the sending of acknowledgments or retransmissions. Implementations of Source Quench entirely within the IP layer are usually unsuccessful, because IP lacks enough information to throttle a connection properly. Holding back acknowledgments tends to produce retransmissions and thus unnecessary traffic; holding back retransmissions may cause a connection to be lost through a retransmission timeout. Our scheme keeps connections alive under severe overload, but at reduced bandwidth per connection.
Other protocols at the same layer as TCP should also respond to Source Quench. In each case we suggest that new traffic should be throttled but that acknowledgments should be treated normally. The only serious problem comes from the User Datagram Protocol, which is not normally a major traffic generator. We have not implemented any throttling in these protocols as yet.
Gateway self-defense
As we have shown, gateways are vulnerable to host mismanagement of congestion. Misbehavior by a host that generates excessive traffic can not only prevent that host's own traffic from getting through, but can also interfere with other, unrelated traffic. The problem can be dealt with at the host level, but since a single malfunctioning host can affect others, future gateways should be able to defend themselves against such behavior from obnoxious or malicious hosts. We offer some basic self-defense techniques.
In one out-of-control situation in late 1983, a TCP bug in an ARPANET host caused the host to frantically generate retransmissions of the same data as fast as the ARPANET would accept them. The gateway connecting our network to the ARPANET became saturated, and since that gateway had more bandwidth to the ARPANET than to our network, little useful traffic could get through. The gateway busily sent Source Quench messages, but the malfunctioning host ignored them. This continued for several hours, until the malfunctioning host crashed. During that period our network was effectively disconnected from the ARPANET.
When a gateway is forced to discard a packet, the choice of which packet to discard is at the gateway's discretion. The classic techniques for making this decision are to discard the most recently received packet or the packet at the end of the longest output queue. We suggest that a worthwhile practical measure is to discard the most recent packet from the host that has the most packets currently queued within the gateway. This strategy tends to balance throughput among the hosts using the gateway. We have not yet tried it, but it seems a reasonable starting point for gateway self-protection.
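A sketch of this untried drop rule, with illustrative data structures, might look as follows.

```python
# Sketch of the suggested drop rule: when a packet must be discarded, drop the
# most recently queued packet from whichever source host has the most packets
# currently in the queue.

from collections import Counter

def choose_packet_to_drop(queue):
    """queue: list of packets (dicts with a 'src' key), oldest first.
    Returns the index of the packet to discard."""
    counts = Counter(p["src"] for p in queue)
    heaviest_src = counts.most_common(1)[0][0]
    # The latest packet from that host is the one nearest the tail.
    for i in range(len(queue) - 1, -1, -1):
        if queue[i]["src"] == heaviest_src:
            return i

queue = [{"src": "A"}, {"src": "B"}, {"src": "A"}, {"src": "A"}, {"src": "C"}]
print(choose_packet_to_drop(queue))   # 3: the newest packet from host A
```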
Another strategy is to discard a newly arrived packet if a duplicate of it is already in the queue. If hashing techniques are used, the computational load of this check is not a problem. The check will not protect against malicious hosts, but it provides some protection against TCP implementations with poor retransmission control. Gateways between fast local networks and slower long-haul networks may find this check valuable if the local hosts are tuned to work well with the local network.
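A sketch of this duplicate check, using a hash set over assumed identifying fields, might look as follows; the field names are illustrative only.

```python
# Sketch: keep a hash set of identifying fields for the packets currently
# queued, and refuse a newly arrived packet that matches one already waiting.

queued_keys = set()
queue = []

def packet_key(p):
    # Fields assumed to identify a retransmitted copy of the same datagram.
    return (p["src"], p["dst"], p["proto"], p["id"])

def enqueue(p):
    key = packet_key(p)
    if key in queued_keys:
        return False              # duplicate of a packet already queued: drop it
    queued_keys.add(key)
    queue.append(p)
    return True

def dequeue():
    p = queue.pop(0)
    queued_keys.discard(packet_key(p))
    return p

pkt = {"src": "A", "dst": "B", "proto": 6, "id": 101}
print(enqueue(pkt), enqueue(dict(pkt)))   # True False: the retransmission is dropped
```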
Ideally, a gateway should detect malfunctioning hosts and squelch them; such detection is difficult in a pure datagram system. Failure to respond to ICMP Source Quench messages, however, ought to be regarded as grounds for a gateway to disconnect a host. Detecting such failures is non-trivial, but it is a promising area for further research.
Conclusion
The congestion-control problems associated with pure datagram networks are difficult, but effective solutions exist. If TCP/IP is to be operated under heavy load, TCP implementations must address these key issues in ways at least as effective as those described here.
Footnotes

The problem does not occur in the pure ARPANET case, because the IMPs will block a host when the number of outstanding packets becomes excessive; but where a pure datagram local network (such as an Ethernet) or a pure datagram gateway (such as an ARPANET / MILNET gateway) is involved, large numbers of small datagrams can be outstanding.

ARPANET RFC 792 is the current standard. We are advised by the Defense Communications Agency that the description of ICMP in MIL-STD-1777 is incomplete and will be deleted from a future revision of that standard.

This follows the control-engineering dictum "Never bother with proportional control unless bang-bang doesn't work."