11 UDP: User Datagram Protocol
11.1 Introduction
UDP is a simple, datagram-oriented transport layer protocol: each output operation by a process produces exactly one UDP datagram, which causes one IP datagram to be sent. This differs from a stream-oriented protocol such as TCP, where the amount of data written by the application may bear little relationship to what is actually sent in a single IP datagram. A UDP datagram is encapsulated in an IP datagram as shown in Figure 11.1.
Figure 11.1 UDP encapsulation
RFC 768 [Postel 1980] is the official specification of UDP. UDP provides no reliability: it passes the datagrams that the application writes to the IP layer, but there is no guarantee that they ever reach their destination. Given this lack of reliability, we might be tempted to avoid UDP and always use a reliable protocol such as TCP. After discussing TCP in Chapter 17 we will return to this topic and see what kinds of applications can use UDP.
The application must care about the size of the resulting IP datagram. If it exceeds the network's MTU (Section 2.8), the IP datagram is fragmented. This applies, if necessary, to every network the datagram traverses from source to destination, not just the first network attached to the sending host. (We defined the path MTU in Section 2.9.) We examine IP fragmentation in Section 11.5.
11.2 UDP Header
Figure 11.2 shows the fields in the UDP header.
Figure 11.2 UDP header
The port numbers identify the sending process and the receiving process. In Figure 1.8 we showed TCP and UDP using the destination port number to demultiplex incoming data from the IP layer. Since IP has already demultiplexed the incoming IP datagram to either TCP or UDP (based on the protocol value in the IP header), the TCP port numbers are looked at by TCP and the UDP port numbers are looked at by UDP. TCP port numbers are independent of UDP port numbers.
Despite this independence, if a well-known service is provided by both TCP and UDP, the same port number is normally chosen for both protocols. This is purely a convenience and is not required by the protocols themselves.
The UDP length field is the length in bytes of the UDP header plus the UDP data. The minimum value of this field is 8 bytes. (Sending a UDP datagram with 0 bytes of data is OK.) This UDP length is redundant. The IP datagram carries its own total length in bytes (Figure 3.1), so the length of the UDP datagram is this total length minus the length of the IP header (which is given by the header length field in Figure 3.1).
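To make the header layout concrete, here is a minimal Python sketch that packs and unpacks the four 16-bit fields and also derives the UDP length from the two IP header fields mentioned above. The function names are hypothetical and exist only for this illustration.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, UDP length, UDP checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, data, checksum=0):
    """Prepend an 8-byte UDP header to data (checksum 0 means 'not computed')."""
    length = UDP_HEADER.size + len(data)  # header plus data, never less than 8
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + data


def parse_udp_header(datagram):
    """Return (src_port, dst_port, length, checksum) from the first 8 bytes."""
    return UDP_HEADER.unpack_from(datagram)


def udp_length_from_ip(total_length, header_length_words):
    """The redundant route: IP total length minus the IP header length.

    The IP header length field counts 32-bit words, hence the factor of 4.
    """
    return total_length - 4 * header_length_words


empty = build_udp_datagram(1108, 9, b"")  # a datagram with 0 bytes of data
print(parse_udp_header(empty))
print(udp_length_from_ip(total_length=28, header_length_words=5))
```

Both routes report the same length for the empty datagram, which is what makes the UDP length field redundant.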
11.3 UDP Checksum
The UDP checksum covers the UDP header and the UDP data. Recall that the checksum in the IP header covers only the IP header; it does not cover any data in the IP datagram. Both UDP and TCP have checksums in their headers that cover their header and their data. The UDP checksum is optional, while the TCP checksum is mandatory.
Although the basic calculation of the UDP checksum is similar to what we described in Section 3.2 for the IP header checksum (the ones' complement of the ones'-complement sum of 16-bit words), there are differences. First, the length of the UDP datagram can be an odd number of bytes, while the checksum algorithm adds 16-bit words. The solution is to append a pad byte of 0 to the end, if necessary, just for the checksum computation. (That is, the pad byte is not transmitted.)
Second, both UDP datagrams and TCP segments include a 12-byte pseudo-header that exists only for the checksum computation. The pseudo-header contains certain fields from the IP header. Its purpose is to let UDP double-check that the data has arrived at the correct destination (that is, that IP has not accepted a datagram that is not addressed to this host, and that IP has not passed UDP a datagram intended for another upper layer). Figure 11.3 shows the pseudo-header along with a UDP datagram.
Figure 11.3 Fields used for computation of the UDP checksum
In this figure we deliberately show a datagram with an odd length, so a pad byte is needed for the checksum computation. Notice that the length of the UDP datagram appears twice in the checksum computation.
If the calculated checksum is 0, it is stored as all one bits (65535), which is equivalent in ones'-complement arithmetic. If the transmitted checksum is 0, it means that the sender did not compute the checksum.
If the sender did compute a checksum and the receiver detects a checksum error, the UDP datagram is silently discarded. No error message is generated. (This is also what happens when the IP layer detects an IP header checksum error.)
The UDP checksum is an end-to-end checksum. It is calculated by the sender and then verified by the receiver. It is designed to catch any modification of the UDP header or data anywhere between sender and receiver.
Although the UDP checksum is optional, it should always be enabled. During the 1980s some computer vendors turned off UDP checksums by default to speed up NFS (Network File System), which uses UDP. This may be acceptable on a single local area network, where the cyclic redundancy check on the link-layer frame (such as an Ethernet or token ring frame) detects most corruption, but not once datagrams pass through routers. Believe it or not, there have been routers with software and hardware bugs that modified the data in the datagrams they forwarded. If the end-to-end UDP checksum is disabled, these errors cannot be detected in a UDP datagram. In addition, some data-link protocols (such as SLIP) have no form of data-link checksum at all.
The Host Requirements RFC requires that UDP checksums be enabled by default.
It also states that if the sender calculated a checksum, the receiver must verify it (that is, whenever the received checksum is nonzero). Many implementations violate this, however, and verify received checksums only when outgoing checksums are enabled.
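The rules described in this section (pad byte, 12-byte pseudo-header, a computed 0 stored as 65535, and a transmitted 0 meaning "not computed") can be put together in a short Python sketch. It is an illustration under stated assumptions rather than any system's actual implementation: the function names are hypothetical, and the pseudo-header layout assumed here is source address, destination address, a zero byte, the protocol value 17, and the UDP length.

```python
import socket
import struct

UDP_PROTOCOL = 17  # protocol value for UDP in the IP header


def ones_complement_sum(data):
    """Ones'-complement sum of 16-bit words, padding odd lengths with a 0 byte."""
    if len(data) % 2:
        data += b"\x00"  # pad byte: used for the computation, never transmitted
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def pseudo_header(src_ip, dst_ip, udp_length):
    """12-byte pseudo-header: both addresses, zero byte, protocol, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, UDP_PROTOCOL, udp_length))


def build_udp(src_ip, dst_ip, src_port, dst_port, data):
    """Return a UDP datagram (header plus data) with its checksum filled in."""
    length = 8 + len(data)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + data)
    checksum = ~total & 0xFFFF
    if checksum == 0:
        checksum = 0xFFFF  # 0 is reserved for "checksum not computed"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + data


def checksum_ok(src_ip, dst_ip, datagram):
    """Verify a received datagram; a transmitted checksum of 0 is not checked."""
    if datagram[6:8] == b"\x00\x00":
        return True
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, len(datagram)) + datagram)
    return total == 0xFFFF  # a valid datagram sums to all one bits


datagram = build_udp("140.252.13.35", "140.252.13.34", 1108, 7, b"123456789")
print(checksum_ok("140.252.13.35", "140.252.13.34", datagram))
```

Because the pseudo-header includes the addresses, verifying the same bytes against a different destination address fails, which is the double check the pseudo-header is meant to provide.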
tcpdump Output
It is hard to tell from ordinary tcpdump output whether a particular system has UDP checksums enabled, and an application normally cannot obtain the checksum field from a received UDP header. To get around this, the author added an option to the tcpdump program that prints the received UDP checksum. If the printed value is 0, the sender did not calculate the checksum.
Figure 11.4 shows the output to and from three different systems on our test network (see the inside front cover). We ran our sock program (Appendix C), sending a single UDP datagram containing 9 bytes of data to the standard echo server.
Figure 11.4 tcpdump output to see whether other hosts enable UDP checksums
We can see from this that two of the three systems have UDP checksums enabled.
Also notice that in this simple example the outgoing datagram has the same checksum as the incoming datagram (lines 3 and 4, lines 5 and 6). Looking at Figure 11.3 we see that the two IP addresses are swapped, as are the two port numbers. The other fields in the pseudo-header and the UDP header are the same, as is the data. This reiterates that UDP checksums (indeed, all the checksums in the TCP/IP protocol suite) are simple 16-bit sums. They cannot detect an error that swaps two of the 16-bit values.
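The insensitivity to word order is easy to demonstrate. The following Python sketch sums only the address and port words, using illustrative values (the addresses from the test network, port 7 for echo, and an arbitrary client port), and shows that exchanging source and destination leaves the checksum unchanged, as does exchanging any two 16-bit words. The helper names are hypothetical.

```python
def internet_checksum(data):
    """16-bit ones'-complement checksum over a byte string."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF


def swap_words(data, i, j):
    """Return a copy of data with 16-bit words i and j exchanged."""
    words = [data[k:k + 2] for k in range(0, len(data), 2)]
    words[i], words[j] = words[j], words[i]
    return b"".join(words)


client = bytes([140, 252, 13, 35])
server = bytes([140, 252, 13, 34])
client_port = (1108).to_bytes(2, "big")
echo_port = (7).to_bytes(2, "big")

# Addresses and ports as seen in the request and in the echoed reply.
request = client + server + client_port + echo_port
reply = server + client + echo_port + client_port

print(internet_checksum(request) == internet_checksum(reply))
print(internet_checksum(swap_words(request, 0, 3)) == internet_checksum(request))
```

Addition is commutative, so any reordering of aligned 16-bit words produces the same sum; only a change in the value of some word can alter the result.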
The author also directed a DNS query at each of the eight root name servers described in Section 14.2. DNS uses UDP primarily, and only two of the eight servers had UDP checksums enabled.
Some Statistics
[Mogul 1992] provides counts of various checksum errors on a busy NFS (Network File System) server that had been up for 40 days. Figure 11.5 summarizes these numbers.
Figure 11.5 Counts of corrupted packets detected by various checksums
The last column is only an approximate total for each row, since other protocols are in use at the Ethernet and IP layers. For example, not all Ethernet frames are IP datagrams, since at minimum ARP is also used on an Ethernet. Not all IP datagrams are UDP or TCP, since ICMP also uses IP.
Note that the rate of TCP checksum errors is much higher than the rate of UDP checksum errors. This is probably because the TCP connections on this system tended to be "long distance" (traversing many routers, bridges, and so on), while the UDP traffic was mostly local.
The bottom line is not to trust the data-link CRC (Ethernet, token ring, and so on) completely. You should always enable the end-to-end checksums. Moreover, if your data is valuable, do not trust the UDP or TCP checksum completely either, since these are simple checksums that cannot detect all possible errors.
11.4 A Simple Example
We use our sock program to generate some UDP datagrams that we can watch with tcpdump:
bsdi % sock -v -u -i -n4 svr4 discard
connected on 140.252.13.35.1108 to 140.252.13.34.9
bsdi % sock -v -u -i -n4 -w0 svr4 discard
connected on 140.252.13.35.1110 to 140.252.13.34.9
The first time we execute the program we specify the verbose mode (-v) to see the ephemeral port numbers, specify UDP (-u) instead of the default TCP, and use the source mode (-i) to send data instead of reading and writing standard input and output. The -n4 option says to write 4 datagrams (instead of the default of 1024), and the destination host is svr4. We described the discard service in Section 1.12. The output size of each write is the default, 1024 bytes.
The second time we run the program we specify -w0, which causes 0-length datagrams to be written. Figure 11.6 shows the tcpdump output for both commands.
Figure 11.6 tcpdump output when UDP datagrams are sent in one direction
The output shows the four 1024-byte datagrams, followed by the four 0-length datagrams. Each datagram follows the previous one by a few milliseconds. (It took 41 seconds to type in the second command.)
There is no communication between the sender and the receiver before the first datagram is sent. (We will see in Chapter 17 that TCP must establish a connection with the other end before the first byte of data can be sent.) Also, the receiver sends no acknowledgment when it receives the data. In this example the sender has no idea whether the other end received the datagrams.
Finally, note that the source UDP port number changes each time the program is run. First it is 1108 and then it is 1110. We mentioned in Section 1.9 that the ephemeral port numbers used by client programs are typically between 1024 and 5000, as we see here.
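The same behavior can be reproduced without the sock program. The sketch below is a rough stand-in written with the Python socket module, not the author's tool: it sends datagrams to a receiver on the loopback address (an assumption made so that it runs on a single machine) and reports the ephemeral source port chosen by the system. The function name is hypothetical.

```python
import socket


def send_datagrams(count, size, host="127.0.0.1"):
    """Send count UDP datagrams of size bytes to a local receiver.

    Returns the sender's ephemeral port and the sizes the receiver observed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((host, 0))   # port 0: let the system pick a free port
        receiver.settimeout(2.0)   # UDP gives no delivery guarantee
        destination = receiver.getsockname()
        sizes = []
        for _ in range(count):
            # No connection setup and no acknowledgment: each write simply
            # produces one datagram.
            sender.sendto(b"x" * size, destination)
            data, _peer = receiver.recvfrom(65535)
            sizes.append(len(data))
        return sender.getsockname()[1], sizes


print(send_datagrams(4, 1024))
print(send_datagrams(4, 0))   # 0-length datagrams are legal
```

Each call creates a new sending socket, so the reported source port normally differs between runs, much as it did between the two sock commands. The range from which the port is drawn depends on the operating system.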
11.5 IP Fragmentation
As we described in Section 2.8, the physical network layer normally imposes an upper limit on the size of the frame that can be transmitted. Whenever the IP layer receives an IP datagram to send, it determines which local interface the datagram is being sent on (routing) and queries that interface to obtain its MTU. IP compares the MTU with the datagram size and performs fragmentation if necessary. Fragmentation can take place either at the original sending host or at an intermediate router.
When an IP datagram is fragmented, it is not reassembled until it reaches its final destination. (This handling of reassembly differs from some other networking protocols, which require reassembly at the next hop rather than at the final destination.) The IP layer at the destination performs the reassembly. The goal is to make fragmentation and reassembly transparent to the transport layer (TCP and UDP), which it is, except for possible performance degradation. A fragment of a datagram can itself be fragmented again (possibly more than once). The information maintained in the IP header for fragmentation and reassembly is sufficient to handle this.
Recalling the IP header (Figure 3.1), the following fields are used in fragmentation. The identification field contains a unique value for each IP datagram that the sender transmits. This value is copied into each fragment of a particular datagram. (We now see the use for this field.) The flags field uses one bit as the "more fragments" bit. This bit is turned on for each fragment comprising a datagram except the final fragment. The fragment offset field contains the offset of this fragment from the beginning of the original datagram. Also, when a datagram is fragmented the total length field of each fragment is changed to be the size of that fragment.
Finally, one of the bits in the flags field is called the "don't fragment" bit. If this bit is turned on, IP will not fragment the datagram. Instead the datagram is discarded and an ICMP error ("fragmentation needed but don't fragment bit set", Figure 6.3) is sent to the originator.
We will see an example of this error in the next section.
When an IP datagram is fragmented, each fragment becomes its own packet, with its own IP header, and is routed independently of any other packets. This makes it possible for the fragments of a datagram to arrive at the final destination out of order, but there is enough information in the IP header to allow the receiver to reassemble the fragments correctly.
Although IP fragmentation looks transparent, there is one feature that makes it less than desirable: if even one fragment is lost, the entire datagram must be retransmitted. To understand why this happens, realize that IP itself has no timeout and retransmission; that is the responsibility of the higher layers. (TCP performs timeout and retransmission, UDP does not. Some UDP applications perform timeout and retransmission themselves.) When a fragment that came from a TCP segment is lost, TCP will time out and retransmit the entire TCP segment, which corresponds to one IP datagram. There is no way to resend only one fragment of a datagram. Indeed, if the fragmentation was done by an intermediate router, and not by the originating system, there is no way for the originating system to know how the datagram was fragmented. For this reason alone, fragmentation is often avoided. [Kent and Mogul 1987] provide arguments for avoiding fragmentation.
Using UDP it is easy to generate IP fragmentation. (We will see later that TCP tries to avoid fragmentation and that it is nearly impossible for an application to force TCP to send segments large enough to require fragmentation.) We can use our sock program and increase the size of the datagram until fragmentation occurs. On an Ethernet the maximum amount of data in a frame is 1500 bytes (Figure 2.1), which leaves 1472 bytes for our data, assuming 20 bytes for the IP header and 8 bytes for the UDP header. We run our sock program with data sizes of 1471, 1472, 1473, and 1474 bytes.
We expect the last two to cause fragmentation:
bsdi % sock -u -i -n1 -w1471 svr4 discard
bsdi % sock -u -i -n1 -w1472 svr4 discard
bsdi % sock -u -i -n1 -w1473 svr4 discard
bsdi % sock -u -i -n1 -w1474 svr4 discard
Figure 11.7 shows the corresponding tcpdump output.
Figure 11.7 Watching fragmentation of UDP datagrams
The first two UDP datagrams (lines 1 and 2) fit into Ethernet frames and are not fragmented. But the IP datagram corresponding to the write of 1473 bytes is 1501 bytes long, so it must be fragmented (lines 3 and 4). Similarly the datagram generated by the write of 1474 bytes is 1502 bytes long, and it is also fragmented (lines 5 and 6).
When the IP datagram is fragmented, tcpdump prints additional information. First, the output frag 26304 (lines 3 and 4) and frag 26313 (lines 5 and 6) gives the value of the identification field in the IP header.
The next number in the fragmentation information, the 1480 between the colon and the at sign in line 3, is the size of the fragment, excluding the IP header. The first fragment of both datagrams contains 1480 bytes of data: 8 bytes for the UDP header and 1472 bytes of user data. (Adding the 20-byte IP header makes the packet exactly 1500 bytes.) The second fragment of the first datagram (line 4) contains 1 byte of data, the remaining byte of user data. The second fragment of the second datagram (line 6) contains the remaining 2 bytes of user data.
Fragmentation requires that the data portion of every fragment other than the final one (that is, everything excluding the IP header) be a multiple of 8 bytes. In this example, 1480 is a multiple of 8.
The number following the at sign is the offset of the data in the fragment, measured from the start of the datagram. The first fragment of both datagrams starts at offset 0 (lines 3 and 5) and the second fragment of both datagrams starts at byte offset 1480 (lines 4 and 6). The plus sign following the offset corresponds to the "more fragments" bit in the 3-bit flags field of the IP header. The purpose of this bit is to let the receiver know when it has completed the reassembly of all the fragments of a datagram.
Finally, notice that lines 4 and 6 (fragments other than the first) omit the protocol name (UDP) and the source and destination port numbers. The protocol name could be printed, since it is in the IP header, which is copied into each fragment.
The port numbers, however, are in the UDP header, which appears only in the first fragment.
Figure 11.8 shows what happens to the third datagram that is sent (with 1473 bytes of user data). It reiterates that any transport layer header appears only in the first fragment.
Also note the terminology: an IP datagram is the unit of end-to-end transmission at the IP layer (before fragmentation and after reassembly), and a packet is the unit of data passed between the IP layer and the link layer. A packet can be a complete IP datagram or a fragment of an IP datagram.
Figure 11.8 Example of UDP fragmentation
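The sizes and offsets discussed in this section follow from two rules: a fragment's data must fit in the MTU after the IP header is subtracted, and every fragment except the last must carry a multiple of 8 bytes. The following Python sketch applies those rules. It is a simplified model (a fixed 20-byte IP header with no options, a hypothetical function name) and reports byte offsets in the style of the tcpdump output rather than the 8-byte units stored in the header field.

```python
def fragment(ip_data_len, mtu, ip_header_len=20):
    """Split ip_data_len bytes of IP payload for a link with the given MTU.

    Returns a list of (byte_offset, data_len, more_fragments) tuples.
    """
    # Largest data size per fragment, rounded down to a multiple of 8.
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while True:
        data_len = min(max_data, ip_data_len - offset)
        more = offset + data_len < ip_data_len
        fragments.append((offset, data_len, more))
        offset += data_len
        if not more:
            return fragments


UDP_HEADER_LEN = 8
for user_data in (1471, 1472, 1473, 1474):
    # The UDP header is part of the IP payload, so it counts toward the
    # data carried by the first fragment.
    print(user_data, fragment(user_data + UDP_HEADER_LEN, mtu=1500))
```

For the writes of 1473 and 1474 bytes this yields a first fragment of 1480 bytes at offset 0 with the "more fragments" flag set, followed by a final fragment of 1 or 2 bytes at offset 1480, which matches the description of lines 3 through 6 above.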
11.6 ICMP Unreachable Error (Fragmentation Required)
Another variation of the ICMP unreachable error occurs when a router receives a datagram that requires fragmentation, but the don't fragment (DF) flag is turned on in the IP header. This error can be used by a program that needs to determine the smallest MTU in the path to a destination, which is called the path MTU discovery mechanism (Section 2.9).
Figure 11.9 shows the format of the ICMP unreachable error for this case. It differs from Figure 6.10 because bits 16-31 of the second 32-bit word can provide the MTU of the next hop, instead of being 0.
Figure 11.9 ICMP unreachable error when fragmentation is required but the don't fragment bit is set
If a router does not provide this newer format of the ICMP error, the next-hop MTU is set to 0.
The newer Router Requirements RFC [Almquist 1993] states that a router must generate this newer format when it originates this ICMP unreachable error.
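A program that receives this error can read the next-hop MTU directly from the first 8 bytes of the ICMP message. The Python sketch below assumes the standard layout (type, code, checksum, 16 unused bits, then the 16-bit next-hop MTU) with type 3 and code 4 identifying "fragmentation needed but don't fragment bit set". The function name is hypothetical, and the sample messages leave the checksum field as 0 for brevity.

```python
import struct

ICMP_DEST_UNREACHABLE = 3   # ICMP type: destination unreachable
ICMP_CODE_FRAG_NEEDED = 4   # code: fragmentation needed but DF bit set


def next_hop_mtu(icmp_message):
    """Return the next-hop MTU carried in an ICMP 'fragmentation needed' error.

    Returns None for any other ICMP type or code, and 0 when the router used
    the older format, which leaves bits 16-31 of the second word as zero.
    """
    icmp_type, code, _checksum, _unused, mtu = struct.unpack_from(
        "!BBHHH", icmp_message)
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_CODE_FRAG_NEEDED:
        return None
    return mtu


# Sample headers only; a real message continues with the IP header and the
# first 8 bytes of the datagram that caused the error.
newer_format = struct.pack("!BBHHH", 3, 4, 0, 0, 296)
older_format = struct.pack("!BBHHH", 3, 4, 0, 0, 0)
print(next_hop_mtu(newer_format), next_hop_mtu(older_format))
```

A return value of 0 is the case that tcpdump reports as mtu=0: the sender of the error did not supply the MTU of its outgoing interface.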
Example
The author ran into a problem involving fragmentation and this ICMP error while trying to determine the MTU of the SLIP link from the router netb to the host sun. We know the MTU of this link in the direction from sun to netb: it was part of the SLIP configuration when SLIP was installed on the host sun, and we saw it with the netstat command in Section 3.9. We now want to determine the MTU in the other direction. (In Chapter 25 we will see how to determine this using SNMP.) On a point-to-point link, the MTU is not required to be the same in both directions.
The technique used was to run the ping program on the host solaris, to the host bsdi, increasing the size of the data packets until the incoming packets were seen to be fragmented. This is shown in Figure 11.10.
Figure 11.10 Systems used to determine the MTU of the SLIP link from netb to sun
tcpdump was run on the host sun, watching the SLIP link, to see when fragmentation occurred. No fragmentation was observed and everything was fine until the size of the data portion of the ping packet was increased from 500 to 600 bytes. The incoming echo requests were still seen (still without fragmentation), but the echo replies disappeared.
To track this down, tcpdump was also run on the host bsdi, to see what it was receiving and sending. Figure 11.11 shows the output.
Figure 11.11 tcpdump output for a ping of bsdi from solaris with a 600-byte IP datagram
First, the notation (DF) in each line means the don't fragment bit is turned on in the IP header. This means that Solaris 2.2 normally turns this bit on, as part of its implementation of the path MTU discovery mechanism.
Line 1 shows that the echo request got through the router netb to sun without being fragmented, and with the DF bit set, so we know that the SLIP MTU of netb has not been reached yet.
Next, notice in line 2 that the DF flag is copied into the echo reply. This is what causes the problem. The echo reply is the same size as the echo request (just over 600 bytes), but the MTU of sun's outgoing SLIP interface is 552. The echo reply needs to be fragmented, but the DF flag is set. This causes sun to generate the ICMP unreachable error and send it back to bsdi (where it is discarded).
This is why we never saw any echo replies on solaris. The replies never got past sun. Figure 11.12 shows the path of the packets.
Figure 11.12 Packet exchange in the example
Finally, the notation mtu=0 in lines 3 and 6 of Figure 11.11 indicates that sun does not return the MTU of the outgoing interface in the ICMP unreachable message, as shown in Figure 11.9. (In Section 25.9 we return to this problem and use SNMP to determine that the MTU of the SLIP interface on netb is 1500.)
11.7 Determining the Path MTU
Although most systems do not support the path MTU discovery feature, we can easily modify the traceroute program (Chapter 8) to determine the path MTU. What we do is send a packet with the "don't fragment" bit set. The size of the first packet we send equals the MTU of the outgoing interface, and whenever we receive an ICMP "can't fragment" error (which we described in the previous section) we reduce the size of the packet. If the router sending the ICMP error uses the newer format, which includes the MTU of the outgoing interface, we use that value; otherwise we try the next smaller MTU. As RFC 1191 [Mogul and Deering 1990] states, there are a limited number of MTU values, so our program has a table of the likely values and just moves to the next smaller one.
Let's first try it from our host sun to the host slip, knowing that the SLIP link has an MTU of 296. (See the first listing on p. 154 of the original book.)
In this example the router bsdi does not return the MTU of the outgoing interface in the ICMP error message, so we step through the likely values for the MTU. The first line of output for the TTL of 2 prints the hostname bsdi, but that is because bsdi is the router returning the ICMP error. The final line of output for the TTL of 2 is what we are looking for.
It is not hard to modify the ICMP code on bsdi to return the MTU of the outgoing interface. If we do that and rerun our program, we get the following output:
(See the second listing on p. 154 of the original book.)
This time we don't have to try eight different MTU values before finding the right one; the router returns the correct value.
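The search strategy just described can be simulated without sending any packets. The Python sketch below is an illustration only, not the modified traceroute program: the path is modeled as a list of link MTUs, the function name is hypothetical, and the table holds the plateau values from RFC 1191 rather than the table in the author's program, which the text does not list (it evidently contains more entries, since eight values were tried).

```python
# Plateau table from RFC 1191, largest first (an assumption for this sketch;
# the modified traceroute uses its own table of likely values).
PLATEAUS = [65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68]


def discover_path_mtu(outgoing_mtu, link_mtus, routers_report_mtu):
    """Simulate probing with the DF bit set.

    link_mtus lists the MTUs of the links beyond the first hop, in order.
    Returns the discovered path MTU and the list of packet sizes tried.
    """
    size = outgoing_mtu
    tried = [size]
    while True:
        # The first link that is too small generates the ICMP error.
        too_small = next((mtu for mtu in link_mtus if mtu < size), None)
        if too_small is None:
            return size, tried           # the packet got through
        if routers_report_mtu:
            size = too_small             # newer format: use the reported MTU
        else:
            smaller = [p for p in PLATEAUS if p < size]
            if not smaller:
                raise ValueError("no smaller MTU left to try")
            size = smaller[0]            # older format: next smaller plateau
        tried.append(size)


print(discover_path_mtu(1500, [296], routers_report_mtu=False))
print(discover_path_mtu(1500, [296], routers_report_mtu=True))
```

With a router that reports the next-hop MTU, the search ends after a single error, which is the improvement seen in the second run above.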
As an experiment, this modified version of traceroute was run numerous times to hosts all around the world. Fifteen countries (including Antarctica) were reached, over various transatlantic and transpacific links. Before doing this, however, the MTU of the SLIP link between our subnet and the router netb (Figure 11.12) was increased to 1500, the same as an Ethernet.
Out of 18 runs, only 2 found a path MTU of less than 1500. One of the transatlantic links had an MTU of 572 (a value not even listed in RFC 1191), and the router did return the newer format ICMP error. Another link, between two routers in Japan, could not handle 1500-byte frames, and the router did not return the newer format ICMP error. Setting the MTU down to 1006 did work.
The conclusion we can draw from this experiment is that many, but not all, wide area networks can handle packets larger than 512 bytes. Using the path MTU discovery mechanism allows applications to take advantage of these larger MTUs.
11.8 Path MTU Discovery with UDP
Let's examine the interaction between an application using UDP and the path MTU discovery mechanism. We want to see what happens when the application writes datagrams that are too big for some intermediate link.
Example
Since the only system we have been using that supports the path MTU discovery mechanism is Solaris 2.x, we use it as the source host to send 650-byte datagrams to slip. Since our slip host sits behind a SLIP link with an MTU of 296, any UDP datagram with more than 268 bytes of data (296 - 20 - 8) and with the "don't fragment" bit set should cause the router bsdi to generate the ICMP "can't fragment" error. Figure 11.13 shows the topology and the MTUs.
Figure 11.13 Using UDP for path MTU discovery
We can generate the 650-byte UDP datagrams with the following command, with a 5-second pause between each datagram:
solaris % sock -u -i -n10 -w650 -p5 slip discard
Figure 11.14 is the tcpdump output. When this example was run, the router bsdi was configured not to return the next-hop MTU in the ICMP "can't fragment" error.
The first datagram is sent with the DF bit set (line 1) and generates the expected error from the router bsdi (line 2). What is puzzling is that the next datagram is also sent with the DF bit set (line 3) and generates the same ICMP error (line 4). We would expect this datagram to be sent with the DF bit off.
On line 5 it appears that IP has finally learned that datagrams to this destination can't be sent with the DF bit set, so IP goes ahead and fragments the datagram at the source host. This is different from earlier examples, where IP sends the datagram that is passed to it by UDP and allows a router with a smaller MTU (bsdi in this case) to do the fragmentation. Since the ICMP "can't fragment" message did not specify the next-hop MTU, it appears that IP guesses an MTU of 576. The first fragment (line 5) contains 544 bytes of UDP data, the 8-byte UDP header, and the 20-byte IP header, for a total IP datagram size of 572 bytes. The second fragment (line 6) contains the remaining 106 bytes of UDP data and a 20-byte IP header.
Figure 11.14 Path MTU discovery using UDP
Unfortunately the next datagram, on line 7, has its DF bit set, so it is discarded by bsdi and the ICMP error is returned. What has happened here is that an IP timer has expired, telling IP to check whether the path MTU has increased by setting the DF bit again. We see this happen again on lines 19 and 20. Comparing the times on lines 7 and 19, it appears that IP turns on the DF bit every 30 seconds to see whether the path MTU has increased.
This 30-second timer value is too short. RFC 1191 recommends a value of 10 minutes. The value can be changed by modifying the parameter ip_ire_pathmtu_interval (Section E.4). Also, Solaris 2.2 provides no way to turn off path MTU discovery for a single UDP application or for all UDP applications. It can only be enabled or disabled on a systemwide basis by changing the parameter ip_path_mtu_discovery. As this example shows, enabling path MTU discovery when UDP applications write datagrams that will probably be fragmented can cause datagrams to be discarded.
The maximum datagram size assumed by the IP layer on solaris (576 bytes) is not right. In Figure 11.13 we see that the real MTU is 296 bytes. This means the fragments generated by solaris are fragmented again by bsdi. Figure 11.15 shows the tcpdump output collected on the destination host (slip) for the first datagram that arrives (lines 5 and 6 of Figure 11.14).
Figure 11.15 First datagram arriving at host slip from solaris
In this example the host solaris should not fragment the outgoing datagrams; it should turn off the DF bit and let the router with the smallest MTU do the fragmentation.
Now we run the same example, but modify the router bsdi to return the next-hop MTU in the ICMP "can't fragment" error. Figure 11.16 shows the first six lines of the tcpdump output.
Figure 11.16 Path MTU discovery using UDP
As in Figure 11.14, the first two datagrams are sent with the DF bit set. But having learned the next-hop MTU, the source now generates only three fragments, compared to the four fragments produced by way of the router bsdi in Figure 11.15.
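The fragment counts in this example can be checked with a little arithmetic. The Python sketch below is a simplified model, assuming 20-byte IP headers and a hypothetical helper name. It compares fragmenting a 650-byte write for a guessed MTU of 576 and then again at the 296-byte SLIP link against fragmenting it once for an MTU of 296.

```python
def fragment_sizes(ip_data_len, mtu, ip_header_len=20):
    """Data bytes carried by each fragment of an IP payload on a given MTU."""
    # Every fragment but the last carries a multiple of 8 bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    sizes = []
    while ip_data_len > max_data:
        sizes.append(max_data)
        ip_data_len -= max_data
    sizes.append(ip_data_len)
    return sizes


udp_datagram = 650 + 8   # user data plus the UDP header

# The source guesses an MTU of 576, then the router fragments each piece
# again for the 296-byte link.
guessed = fragment_sizes(udp_datagram, 576)
refragmented = [size for piece in guessed
                for size in fragment_sizes(piece, 296)]

# The source knows the next-hop MTU and fragments once.
direct = fragment_sizes(udp_datagram, 296)

print(guessed, refragmented, direct)
```

The first fragment under the 576-byte guess carries 552 bytes (the 8-byte UDP header plus 544 bytes of user data), which is why it must be split again on a link that carries at most 272 data bytes per fragment.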
11.9 Interaction Between UDP and ARP
Using UDP we can see an interesting (and often unmentioned) interaction between UDP and typical implementations of ARP.
We use our sock program to generate a single UDP datagram with 8192 bytes of data. We expect this to generate six fragments on an Ethernet (see Exercise 11.3). We also make sure that the ARP cache is empty before running the program, so that an ARP request and reply must be exchanged before the first fragment is sent.
bsdi % arp -a          (verify that the ARP cache is empty)
bsdi % sock -u -i -n1 -w8192 svr4 discard
We expect the first fragment to cause an ARP request to be sent. IP generates five more fragments, and this raises two questions that we need tcpdump to answer. Are the remaining fragments ready to be sent before the ARP reply is received? If so, what does ARP do with multiple packets to a given destination while it is waiting for an ARP reply? Figure 11.17 shows the tcpdump output.
Figure 11.17 Packet exchange when an 8192-byte UDP datagram is sent on an Ethernet
There are a few surprises in this output. First, six ARP requests are generated before the first ARP reply is returned. Our guess is that IP generates the six fragments rapidly, and each one causes an ARP request to be sent.
Second, when the first ARP reply is received (line 7), only the last fragment is sent (line 9)! It appears that the first five fragments have been discarded. In fact, this is the normal operation of ARP. Most implementations keep only the last packet sent to a given destination while waiting for an ARP reply.
The Host Requirements RFC requires an implementation to prevent this type of ARP flooding (repeatedly sending an ARP request for the same IP address at a high rate). The recommended maximum rate is one request per second. Here we see six ARP requests in 4.3 ms.
The Host Requirements RFC states that ARP should save at least one packet, and that this should be the latest packet. That is what we see here.
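The "keep only the latest packet" behavior can be modeled in a few lines. The Python sketch below is a toy model, not code from any ARP implementation: the class and function names are hypothetical, and a packet is represented only by the byte offset of the fragment it carries.

```python
class PendingArpEntry:
    """Holds at most one packet for an unresolved address (the latest wins)."""

    def __init__(self):
        self.held = None
        self.dropped = []

    def enqueue(self, packet):
        """Queue a packet while the ARP reply is outstanding."""
        if self.held is not None:
            self.dropped.append(self.held)   # the older packet is discarded
        self.held = packet

    def resolve(self):
        """The ARP reply arrived: release the single held packet."""
        packet, self.held = self.held, None
        return packet


def fragment_offsets(udp_data_len, mtu=1500, ip_header_len=20, udp_header_len=8):
    """Byte offsets of the fragments of one UDP datagram on the given MTU."""
    per_fragment = (mtu - ip_header_len) // 8 * 8
    return list(range(0, udp_data_len + udp_header_len, per_fragment))


entry = PendingArpEntry()
for offset in fragment_offsets(8192):
    entry.enqueue(offset)    # IP produces all fragments before the reply

print(entry.dropped, entry.resolve())
```

In this model the fragment at offset 0, the one carrying the UDP header, is the first to be discarded, so the destination can never complete reassembly of the datagram.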
Another unexplained anomaly in this output is that svr4 sends back seven ARP replies, not six.
The final point worth mentioning is that tcpdump was left running for 5 minutes after the final ARP reply was returned, to see whether svr4 would send back an ICMP "time exceeded during reassembly" error. No ICMP error was ever sent. (We showed the format of this message in Figure 8.2. A code of 1 indicates that the time was exceeded during the reassembly of a datagram.)
The IP layer must start a timer when the first fragment of a datagram appears. Here "first" means the first fragment of a given datagram to arrive, not the first fragment of the datagram (the one with a fragment offset of 0). A normal timeout value is 30 or 60 seconds. If all the fragments of the datagram have not arrived when the timer expires, all these fragments are discarded. If this were not done, fragments that never arrive (as we saw in this example) could eventually cause the receiver to run out of buffers.
There are two reasons we don't see the ICMP message here. First, most Berkeley-derived implementations never generate this error! These implementations do set a timer, and do discard the fragments when the timer expires, but they never generate the ICMP error. Second, the first fragment, the one containing the UDP header, was never received. (It was the first of the five packets discarded by ARP.) An implementation is not required to generate the ICMP error unless this first fragment has been received. The reason is that without the transport layer header, the receiver of the ICMP error could not tell which process sent the datagram that was discarded. It is assumed that the upper layer (either TCP or the application using UDP) will eventually time out and retransmit.
In this section we used IP fragments to see the interaction between UDP and ARP. We can also see this interaction if the sender quickly transmits multiple UDP datagrams.
We chose to use fragmentation because IP generates the fragments quickly, faster than a user process can generate multiple datagrams. As unlikely as this example might seem, it does happen. NFS sends UDP datagrams whose length just exceeds 8192 bytes. On an Ethernet these are fragmented as we have indicated, and if the appropriate ARP cache entry has timed out, you can see what we have shown here. NFS will time out and retransmit, but the first IP datagram may still be discarded because of ARP's limited queue.
11.10 Maximum UDP Datagram Size
Theoretically, the maximum size of an IP datagram is 65535 bytes, a limit imposed by the 16-bit total length field in the IP header (Figure 3.1). With a 20-byte IP header and an 8-byte UDP header, this leaves a maximum of 65507 bytes of user data in a UDP datagram. Most implementations, however, provide less than this maximum.
There are two limits we can encounter. First, the application may be limited by its programming interface. The sockets API provides a function that the application can call to set the size of the receive buffer and the send buffer. For a UDP socket, this size is directly related to the maximum size of UDP datagram that the application can read or write. Most systems today provide a default of just over 8192 bytes for the largest UDP datagram that can be read or written. (This default exists because 8192 is the amount of user data that NFS reads and writes by default.)
The second limit comes from the kernel's implementation of TCP/IP. There may be implementation features (or bugs) that limit the size of an IP datagram to less than 65535 bytes.
The author experimented with various UDP datagram sizes using the sock program. Using the loopback interface under SunOS 4.1.3, the maximum size IP datagram was 32767 bytes. All larger values failed. But going across an Ethernet from BSD/386 to SunOS 4.1.3, the maximum size IP datagram the Sun could accept was 32786 bytes (that is, 32758 bytes of user data). Using the loopback interface under Solaris 2.2, the maximum 65535-byte IP datagram could be sent and received. From Solaris 2.2 to AIX 3.2.2, the maximum 65535-byte IP datagram could also be transferred. Obviously this limit depends on the source and destination implementations.
We mentioned in Section 3.2 that a host is required to receive at least a 576-byte IP datagram. Many UDP applications are designed to restrict their application data to 512 bytes or less, to stay below this limit. We saw this in Section 10.4, for example, where the Routing Information Protocol always sent less than 512 bytes of data per datagram. We will encounter this same limit with other UDP applications: the DNS (Chapter 14), TFTP (Chapter 15), BOOTP (Chapter 16), and SNMP (Chapter 25).
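The size arithmetic above can be checked with a short Python sketch. The function names are ours, chosen for illustration, and the 20-byte IP header assumes that no IP options are present. The buffer sizes reported by the second function depend entirely on the system on which it runs.

```python
import socket

IP_MAX_TOTAL_LENGTH = 2**16 - 1   # 16-bit total length field: 65535
IP_HEADER_LEN = 20                # assumption: IP header without options
UDP_HEADER_LEN = 8


def max_udp_payload(ip_datagram_len=IP_MAX_TOTAL_LENGTH):
    """User data that fits in one UDP datagram carried in an IP datagram of this size."""
    return ip_datagram_len - IP_HEADER_LEN - UDP_HEADER_LEN


def udp_buffer_sizes():
    """Return this system's default (send, receive) buffer sizes for a UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv
```

Calling max_udp_payload() gives the theoretical 65507 bytes, max_udp_payload(32786) gives the 32758 bytes observed on SunOS 4.1.3, and max_udp_payload(576) shows that 548 bytes of user data fit in the smallest datagram every host must accept, which is why a 512-byte application limit is safe.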
Datagram Truncation
Just because IP is capable of sending and receiving a datagram of a given size does not mean that the receiving application is prepared to read that size. UDP programming interfaces allow the application to specify the maximum number of bytes to return each time. What happens if the received datagram exceeds the size the application is prepared to deal with? Unfortunately, the answer depends on the programming interface and the implementation.
The traditional Berkeley version of the sockets API truncates the datagram, discarding any excess data. Whether the application is notified depends on the version. (4.3BSD Reno and later can notify the application that the datagram was truncated.) The sockets API under SVR4 (including Solaris 2.x) does not truncate the datagram. Any excess data is returned in subsequent reads, and the application is not notified that multiple reads are being satisfied from a single UDP datagram. The TLI API does not discard the data either. Instead, a flag is returned indicating that more data is available, and subsequent reads by the application return the rest of the datagram.
When we discuss TCP we will see that it provides a continuous stream of bytes to the application, without any message boundaries. TCP passes data to the application in whatever size the application's reads ask for, so no data is ever lost across this interface.
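The truncation behavior can be observed with a small Python sketch over the loopback interface. This is only an illustration under stated assumptions: it needs a Unix-like system where recvmsg() is available, the function name is ours, and the result it shows (excess bytes discarded, MSG_TRUNC reported) is the Berkeley-style behavior. As the text explains, other interfaces behave differently.

```python
import socket


def truncation_demo():
    """Send two datagrams over loopback and read the first with a buffer that is too small."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))        # port 0: let the kernel pick a free port
        rx.settimeout(2.0)               # do not hang if a datagram is lost
        dest = rx.getsockname()
        tx.sendto(b"AAAAAAAAAA", dest)   # 10 bytes of user data
        tx.sendto(b"next", dest)

        # Ask for at most 4 bytes of the 10-byte datagram.
        data, _ancdata, flags, _addr = rx.recvmsg(4)
        truncated = bool(flags & socket.MSG_TRUNC)

        # The next read returns the next datagram, not the rest of the first one.
        following, _ = rx.recvfrom(64)
        return data, truncated, following
    finally:
        rx.close()
        tx.close()
```

On a system with Berkeley-style semantics the second read returns the second datagram, which shows that the remaining six bytes of the first one were thrown away rather than held for later reads.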
11.11 ICMP Source Quench Error
Using UDP we are also able to generate the ICMP "source quench" error. This is an error that may be generated by a system (router or host) when it receives datagrams at a rate that is too fast to be processed. Note the qualifier "may". A system is not required to send a source quench, even if it runs out of buffers and throws datagrams away.
Figure 11.18 shows the format of the ICMP source quench error. Our test network offers a good scenario for generating this error. We can send datagrams from bsdi across the Ethernet to the router sun, which must forward them across the dialup SLIP link. Since the SLIP link is about 1000 times slower than the Ethernet, we should easily be able to overrun its buffer space. The following command sends 100 1024-byte datagrams from host bsdi through the router sun to solaris. We send the datagrams to the standard discard service, where they will be ignored:
bsdi % sock -u -i -w1024 -n100 solaris discard
Figure 11.18 ICMP source quench error format
Figure 11.19 shows the tcpdump output corresponding to this command.
Figure 11.19 ICMP source quench from the router sun
We have removed many lines from this output, which simply follows a pattern. The first 26 datagrams are received without an error, and we show the output only for the first. Starting with our 27th datagram, however, every time we send a datagram we receive a source quench error. There are a total of 26 + (74 × 2) = 174 lines of output.
From our serial line throughput calculations in Section 2.10, it takes just over 1 second to transfer a 1024-byte datagram at 9600 bits/sec. (In our example it should take longer than this, since the 20 + 8 + 1024 byte datagram will be fragmented because the MTU of the SLIP link from sun to netb is 552 bytes.) But we can see from the timing in Figure 11.19 that the router sun processes all 100 datagrams in less than 1 second, before the first one is through the SLIP link. It is not surprising that we overran its buffers.
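The three calculations in this paragraph can be reproduced with a few lines of Python. This is a sketch with illustrative function names of our own. It assumes 10 bits on the wire per byte (8 data bits plus 1 start bit and 1 stop bit) and a 20-byte IP header without options.

```python
def output_lines(sent=100, clean=26):
    """tcpdump lines: one per clean datagram, two (datagram plus source quench) for the rest."""
    return clean + (sent - clean) * 2


def serial_seconds(nbytes, bits_per_sec=9600, bits_per_byte=10):
    """Transfer time on an asynchronous serial line (assumes 1 start bit and 1 stop bit)."""
    return nbytes * bits_per_byte / bits_per_sec


def fragment_lengths(ip_len, mtu, header=20):
    """Total length of each fragment; the data in every fragment but the last is a multiple of 8."""
    data = ip_len - header
    per_fragment = (mtu - header) // 8 * 8
    lengths = []
    while data > 0:
        chunk = min(per_fragment, data)
        lengths.append(chunk + header)
        data -= chunk
    return lengths
```

Here output_lines() gives 174, serial_seconds(1024) gives about 1.07 seconds, and fragment_lengths(20 + 8 + 1024, 552) gives two fragments of 548 and 524 bytes, so the second IP header adds a little to the time on the SLIP link.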
Although RFC 1009 [Braden and Postel 1987] requires a router to generate source quenches when it runs out of buffers, the new Router Requirements RFC [Almquist 1993] changes this and says that a router must not originate source quench errors. The current feeling is to deprecate the source quench error, since it consumes network bandwidth and is an ineffective and unfair fix for congestion.
Another point to make about this example is that our sock program either never received a notification that source quenches were arriving, or, if it did, it ignored them. It turns out that BSD implementations normally ignore received source quenches if the protocol is UDP. (TCP is notified, and slows down the data transfer on the connection that generated the source quench, as we discuss in Section 21.10.) Part of the problem is that the process that generated the data that caused the source quench may already have terminated when the source quench is received. Indeed, if we use the Unix time program to measure how long our sock program takes to run, it runs for only about 0.5 seconds. But we saw in Figure 11.19 that some of the source quenches are received 0.71 seconds after the first datagram was sent, after the process had terminated. What is happening is that our program writes 100 datagrams and terminates. But not all 100 datagrams have been sent by then: some are still queued for output.
This example reiterates that UDP is an unreliable protocol and illustrates the value of end-to-end flow control. Even though our sock program successfully wrote 100 datagrams to its network, only 26 were really delivered to the destination. The other 74 were probably discarded by the intermediate router. Unless we build some form of acknowledgment into the application, the sender has no idea whether the receiver really got the data.
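To make the idea of an application-level acknowledgment concrete, here is a minimal Python simulation of a stop-and-wait scheme. It is a sketch, not something the sock program does: the lossy channel is simulated with a random number generator rather than a real network, and the function name, loss rate, and retry limit are all assumptions chosen for illustration.

```python
import random


def send_reliably(messages, loss_rate=0.3, max_tries=50, seed=1):
    """Stop-and-wait over a simulated lossy channel; returns what the receiver accepted."""
    rng = random.Random(seed)
    delivered = []
    expected_seq = 0                      # receiver state: next sequence number it wants
    for seq, payload in enumerate(messages):
        for _attempt in range(max_tries):
            if rng.random() < loss_rate:
                continue                  # datagram lost: sender times out and retransmits
            if seq == expected_seq:       # new datagram: accept it
                delivered.append(payload)
                expected_seq += 1
            # A duplicate is not delivered again, but it is still acknowledged.
            if rng.random() < loss_rate:
                continue                  # acknowledgment lost: sender retransmits
            break                         # acknowledgment received, move on
        else:
            raise TimeoutError(f"no acknowledgment for datagram {seq}")
    return delivered
```

The sender moves to the next datagram only after the previous one has been acknowledged, so it both learns that the data arrived and paces itself to the rate the path can sustain, which is the end-to-end control that plain UDP lacks.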
11.12 UDP Server Design
Some implications of using UDP affect the design and implementation of a server. The design and implementation of clients is usually easier than that of servers, which is why we talk about server design and not client design. Servers typically interact with the operating system, and most servers need a way to handle multiple clients at the same time.
Normally a client starts, immediately communicates with a single server, and is done. Servers, on the other hand, start and then go to sleep, waiting for a client's request to arrive. In the case of UDP, the server wakes up when a client's datagram arrives, probably containing a request message of some form from the client.
Our interest here is not in the programming aspects of clients and servers ([Stevens 1990] covers all those details), but in the protocol features of UDP that affect the design and implementation of a server using UDP. (We examine the details of TCP server design in Section 18.11.) Although some of the features we describe depend on the implementation of UDP being used, the features are common to most implementations.
Client IP Address and Port Number
What arrives from the client is a UDP datagram. The IP header contains the source and destination IP addresses, and the UDP header contains the source and destination UDP port numbers. When an application receives a UDP datagram, the operating system must tell it who sent the message, that is, the source IP address and port number. This feature allows an iterative UDP server to handle multiple clients: each reply is sent back to the client that sent the request.
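With the sockets API, the sender's address is returned together with the data by recvfrom(). The following Python sketch shows one pass of an iterative server over the loopback interface. The function names and the "echo: " reply format are ours, chosen only for illustration.

```python
import socket


def serve_one(server):
    """Handle a single request: read it, then reply to whoever sent it."""
    data, client_addr = server.recvfrom(2048)    # client_addr is (source IP, source port)
    server.sendto(b"echo: " + data, client_addr)
    return client_addr


def demo():
    """Run one request and reply over loopback; returns what each side observed."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))             # port 0: kernel picks a free port
        server.settimeout(2.0)
        client.settimeout(2.0)
        client.sendto(b"hello", server.getsockname())
        seen_by_server = serve_one(server)
        reply, _ = client.recvfrom(2048)
        return seen_by_server, client.getsockname(), reply
    finally:
        server.close()
        client.close()
```

A real iterative server would simply call serve_one() in a loop. Because every datagram carries its own return address, the same loop serves any number of clients without per-client state.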
Destination IP Address
Some applications need to know who the datagram was sent to, that is, the destination IP address. For example, the Host Requirements RFC states that a TFTP server should ignore received datagrams that are sent to a broadcast address. (We describe broadcasting in Chapter 12 and TFTP in Chapter 15.) This requires the operating system to pass the destination IP address from the received UDP datagram to the application. Unfortunately, not all implementations provide this capability.
The sockets API provides this capability with the IP_RECVDSTADDR socket option. Of the systems used in the text, only BSD/386, 4.4BSD, and AIX 3.2.2 support this option. SVR4, SunOS 4.x, and Solaris 2.x do not support it.
UDP Input Queue
We said in Section 1.8 that most UDP servers are iterative servers. This means that a single server process handles all the client requests on a single UDP port (the server's well-known port). Normally there is a limited-size input queue associated with each UDP port that an application is using. This means that requests arriving at about the same time from different clients are automatically queued by UDP. The received UDP datagrams are passed to the application (when it asks for the next one) in the order in which they were received. It is possible, however, for this queue to overflow, causing the kernel's UDP module to discard incoming datagrams. We can see this with the following experiment. We start our sock program on the host bsdi, running as a UDP server:
bsdi % sock -s -u -v -E -R256 -P30 6666
from 140.252.13.33, to 140.252.13.63: 1111111111        (from sun, to broadcast address)
from 140.252.13.34, to 140.252.13.35: 4444444444444     (from svr4, to unicast address)
We specify the following flags: -s to run as a server, -u for UDP, -v to print the client's IP address, and -E to print the destination IP address (which this system supports). Additionally, we set the UDP receive buffer for this port to 256 bytes (-R), along with the size of each application read (-r). The -P30 flag tells the program to pause for 30 seconds after creating the UDP port, before reading the first datagram. This gives us time to start clients on two other hosts and send some datagrams, to see how the receive queue works.
Once the server is started and is in its 30-second pause, we start one client on the host sun and send three datagrams:
sun % sock -u -v 140.252.13.63 6666        (to Ethernet broadcast address)
connected on 140.252.13.33.1252 to 140.252.13.63.6666
1111111111           (11 bytes of data, including newline)
222222222            (10 bytes of data, including newline)
33333333333          (12 bytes of data, including newline)
The destination address is the broadcast address (140.252.13.63). We then start a second client on the host svr4 and send another three datagrams:
svr4 % sock -u -v bsdi 6666
connected on 0.0.0.0.042 to 140.252.13.35.6666
4444444444444        (14 bytes of data, including newline)
555555555555555      (16 bytes of data, including newline)
66666666             (9 bytes of data, including newline)
The first thing we notice in the output shown earlier on bsdi is that only two datagrams were received by the application: the first one from sun, containing all 1s, and the first one from svr4, containing all 4s. The other four datagrams appear to have been thrown away.
The tcpdump output in Figure 11.20 shows that all six datagrams were delivered to the destination host. The datagrams were typed on the two clients in alternating order: first from sun, then from svr4, and so on. We can also see that all six were delivered in about 12 seconds, within the 30-second period during which the server was sleeping.
Figure 11.20 tcpdump output for UDP datagrams sent by two clients
We can also see that the server's -E option lets it know the destination IP address of each datagram. If it wanted to, it could choose what to do with the first datagram it receives, which was sent to a broadcast address.
This example illustrates several points. First, the application is not told when its input queue overflows. The excess datagrams are simply discarded by UDP. Also, from the tcpdump output we see that nothing is sent back to the client to tell it that its datagram was discarded. Nothing like an ICMP source quench is sent back to the sender. Finally, it appears that the UDP input queue is FIFO (first-in, first-out), whereas we saw in Section 11.9 that the ARP input queue is LIFO (last-in, first-out).
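The behavior seen in this experiment can be modeled in a few lines of Python. This is a deliberately simplified sketch: a real kernel accounts for buffer space in bytes, including per-datagram overhead, whereas the hypothetical DatagramQueue class below just assumes that there is room for two datagrams, the number the server actually received.

```python
from collections import deque


class DatagramQueue:
    """Bounded FIFO that silently drops arrivals once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def enqueue(self, datagram):
        if len(self.items) >= self.capacity:
            self.dropped += 1        # no error to the application, nothing sent to the sender
            return
        self.items.append(datagram)

    def dequeue(self):
        return self.items.popleft()  # oldest datagram first (FIFO)


# Arrival order from Figure 11.20: the two clients alternate, sun first.
arrivals = ["1" * 10, "4" * 13, "2" * 9, "5" * 15, "3" * 11, "6" * 8]
queue = DatagramQueue(capacity=2)    # assumption: room for two datagrams
for datagram in arrivals:
    queue.enqueue(datagram)
received = [queue.dequeue() for _ in range(len(queue.items))]
```

After the loop, received holds the all-1s and all-4s datagrams in arrival order, and queue.dropped is 4. Neither the server nor either client is told about the loss.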
Restricting Local IP Address
Most UDP servers wildcard their local IP address when they create a UDP end point. This means that an incoming UDP datagram destined for the server's port will be accepted on any local interface. For example, we can start a UDP server on port 7777:
sun % sock -u -s 7777
Then we use the netstat command to observe the status of the endpoint:
sun % netstat -a -n -f inet
Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
udp        0      0  *.7777                 *.*
We have deleted many lines of output, keeping only the one we are interested in. The -a flag reports on all network end points. The -n flag prints IP addresses in dotted-decimal form instead of converting them to names, and prints numeric port numbers instead of service names. The -f inet option reports only TCP and UDP end points.
The local address is printed as *.7777, where the asterisk means that the local IP address has been wildcarded.
When the server creates its end point, it can specify one of the host's local IP addresses as the local IP address for the end point. Incoming UDP datagrams are then passed to this end point only if the destination IP address matches the specified local address. With our sock program, if we specify an IP address before the port number, that IP address becomes the local IP address for the end point. For example,
sun % sock -u -s 140.252.1.29 7777
restricts the server to datagrams arriving on the SLIP interface (140.252.1.29). The netstat output shows this:
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
udp        0      0  140.252.1.29.7777      *.*
If we try to send this server a datagram from the host bsdi on the Ethernet, an ICMP port unreachable error is returned. The server never sees the datagram. Figure 11.21 shows this scenario.
Figure 11.21 Rejection of UDP datagram caused by the server's local address binding
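In the sockets API the choice between a wildcarded and a specific local address is made in the call to bind(). The Python sketch below uses the loopback address 127.0.0.1 in place of the SLIP address from the text, since that address exists only on the author's host. The helper name is ours.

```python
import socket


def bound_udp_socket(local_ip="", port=0):
    """Create a UDP end point; an empty local_ip means the wildcard address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind((local_ip, port))     # port 0 asks the kernel for any free port
    return s


wildcard = bound_udp_socket()               # netstat would show this as *.port
specific = bound_udp_socket("127.0.0.1")    # accepts only datagrams sent to 127.0.0.1
wildcard_addr = wildcard.getsockname()
specific_addr = specific.getsockname()
wildcard.close()
specific.close()
```

getsockname() reports the wildcard as 0.0.0.0, which is what netstat prints as an asterisk.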
It is possible to start different servers on the same port, each with a different local IP address. Normally, however, the application must tell the system that it is OK to reuse the same port number.
With the sockets API, the SO_REUSEADDR socket option must be specified. Our sock program does this when the -A option is given.
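A minimal Python sketch of the same idea follows, again using the loopback address as a stand-in for the addresses of the author's host. The helper name is ours. The sketch sets SO_REUSEADDR on every socket, including the first, which is a conservative choice because some systems expect the option on all of the end points that share the port.

```python
import socket


def reusable_udp_socket(local_ip, port):
    """UDP end point that sets SO_REUSEADDR before binding."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((local_ip, port))
    return s


first = reusable_udp_socket("", 0)                # wildcard address, kernel-chosen port
port = first.getsockname()[1]
second = reusable_udp_socket("127.0.0.1", port)   # same port, specific local address
same_port = second.getsockname()[1] == port
first.close()
second.close()
```

The option must be set before bind() is called, since the address check happens at bind time.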
On our host sun we can start five different servers on the same UDP port (8888):
(See the original book P.165 2)
Except for the first, the servers must be started with the -A flag, telling the system that it is OK to reuse the same port number. The netstat output for the five servers is as follows:
(See the original book P.165 3)
In this scenario, the only datagrams that will go to the server with the wildcarded local IP address are those destined for 140.252.1.255, because the other four servers cover all other possible addresses.
A priority is implied when an end point with a wildcarded address exists. An end point with a specific IP address that matches the destination IP address is always chosen over a wildcard. The wildcarded end point is used only when no specific match is found.
Restricting Foreign IP Address
In all the netstat output shown earlier, the foreign IP address and foreign port number appear as *.*, which means that the end point will accept an incoming UDP datagram from any IP address and any port number. Most implementations allow a UDP end point to restrict the foreign address. This means that the end point will receive UDP datagrams only from that specific IP address and port number. Our sock program uses the -f option to specify the foreign IP address and port number:
sun % sock -u -s -f 140.252.13.35.4444 5555
This sets the foreign IP address to 140.252.13.35 (our host bsdi) and the foreign port number to 4444. The server's well-known port is 5555. If we run netstat, we see that the local IP address has also been set, even though we did not specify it:
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
udp        0      0  140.252.13.33.5555     140.252.13.35.4444
This is a side effect of specifying the foreign IP address and foreign port on Berkeley-derived systems: if the local address has not been chosen when the foreign address is specified, the local address is chosen automatically. Its value becomes the IP address of the interface that IP routing chooses to reach the specified foreign IP address. Indeed, in this example the IP address of sun on the Ethernet that connects it to the foreign address is 140.252.13.33.
Figure 11.22 summarizes the three types of address bindings that a UDP server can establish for itself.
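The side effect on the local address can be observed with a short Python sketch. With the sockets API, the foreign address of a UDP end point is set by calling connect(), which sends no packet. The sketch uses the loopback address as the foreign address and reuses port 4444 from the text purely as an illustrative value.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("", 0))                    # wildcard local address, kernel-chosen port
before = s.getsockname()           # local IP is still the wildcard, 0.0.0.0
s.connect(("127.0.0.1", 4444))     # sets the foreign address; nothing is transmitted
after = s.getsockname()            # local IP has now been chosen by routing
foreign = s.getpeername()
s.close()
```

Before the connect() the local address is the wildcard. Afterward it is the address of the interface used to reach the foreign address (here the loopback interface), while the local port is unchanged.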
Local Address     Foreign Address    Description
localIP.lport     foreignIP.fport    restricted to one client
localIP.lport     *.*                restricted to datagrams arriving on one local interface: localIP
*.lport           *.*                receives all datagrams sent to lport
Figure 11.22 Specification of local and foreign IP addresses and port numbers for a UDP server
In all cases, lport is the server's well-known port, and localIP must be the IP address of a local interface. The ordering of the three rows in the table is the order in which the UDP module tries to determine which local end point receives an incoming datagram. The most specific binding (the first row) is tried first, and the least specific (the last row, with both IP addresses wildcarded) is tried last.
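The matching order can be written down as a small Python model. This is a sketch of the rule described above, not of any particular kernel: the tuple layout, the function names, and the example end points are assumptions made for illustration.

```python
ANY = "*"


def match_rank(endpoint, dst_ip, dst_port, src_ip, src_port):
    """Rank of a binding for this datagram: 0 is most specific, None means no match."""
    local_ip, local_port, foreign_ip, foreign_port = endpoint
    if local_port != dst_port:
        return None
    if foreign_ip != ANY:
        # Row 1: localIP.lport with foreignIP.fport, restricted to one client.
        if (local_ip, foreign_ip, foreign_port) == (dst_ip, src_ip, src_port):
            return 0
        return None
    if local_ip == dst_ip:
        return 1                 # Row 2: localIP.lport with *.*
    if local_ip == ANY:
        return 2                 # Row 3: *.lport with *.*
    return None


def demultiplex(endpoints, dst_ip, dst_port, src_ip, src_port):
    """Return the most specific matching end point, or None (ICMP port unreachable)."""
    ranked = [(match_rank(e, dst_ip, dst_port, src_ip, src_port), e) for e in endpoints]
    ranked = [pair for pair in ranked if pair[0] is not None]
    return min(ranked, key=lambda pair: pair[0])[1] if ranked else None


# Hypothetical end points on sun, one for each row of Figure 11.22 (listed last row first).
endpoints = [
    (ANY, 5555, ANY, ANY),
    ("140.252.13.33", 5555, ANY, ANY),
    ("140.252.13.33", 5555, "140.252.13.35", 4444),
]
```

A datagram from 140.252.13.35 port 4444 goes to the fully specified end point, one from any other client on the Ethernet goes to the second, and one arriving for another local address such as 140.252.1.29 falls through to the wildcard.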
Multiple Recipients per Port
Although it is not specified in the RFCs, most implementations allow only one application end point at a time to be associated with any one local IP address and UDP port number. When a UDP datagram arrives at a host destined for that IP address and port number, one copy is delivered to that single end point. The IP address of the end point can be the wildcard, as shown earlier.
For example, under SunOS 4.1.3 we start one server on port 9999 with a wildcarded local IP address:
sun % sock -u -s 9999
If we then try to start another server with the same wildcarded local address and the same port, it does not work, even if we specify the -A option:
sun % sock -u -s 9999        (we expect this to fail)
can't bind local address: Address already in use
sun % sock -u -s -A 9999     (so we try the -A flag this time)
can't bind local address: Address already in use
On systems that support multicasting (Chapter 12), this changes. Multiple end points can use the same local IP address and UDP port number, although the application normally must tell the API that this is OK (for example, with our -A flag, which specifies the SO_REUSEADDR socket option).
4.4BSD, which supports multicasting, requires the application to set a different socket option (SO_REUSEPORT) to allow multiple end points to share the same port. Furthermore, each end point must specify this option, including the first one to use the port.
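On systems whose sockets API defines SO_REUSEPORT, the sharing described here can be sketched in Python as follows. The helper name is ours, and the hasattr() check is there because Python's socket module does not define the constant on every platform. Whether the second bind succeeds also depends on the operating system version.

```python
import socket


def shared_port_socket(port):
    """UDP end point that asks to share its port by setting SO_REUSEPORT before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)   # every sharer sets it
    s.bind(("", port))
    return s


if hasattr(socket, "SO_REUSEPORT"):        # not defined on every platform
    first = shared_port_socket(0)          # the first end point sets the option too
    port = first.getsockname()[1]
    second = shared_port_socket(port)      # same wildcard address, same port
    shared = second.getsockname() == first.getsockname()
    first.close()
    second.close()
else:
    shared = None                          # option unavailable on this system
```

Both end points end up with the same wildcarded local address and port, which the earlier SunOS 4.1.3 example showed to be impossible without such an option.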
When a UDP datagram arrives whose destination IP address is a broadcast or multicast address, and there are multiple end points at the destination IP address and port number, one copy of the incoming datagram is passed to each end point. (The end point's local IP address can be the wildcard, which matches any destination IP address.) But if a UDP datagram arrives whose destination IP address is a unicast address, only a single copy of the datagram is delivered, to one of the end points. Which end point receives the unicast datagram is implementation dependent.
11.13 Summary
UDP is a simple protocol. Its official specification, RFC 768 [Postel 1980], requires only three pages. The services it provides to a user process, above and beyond IP, are port numbers and an optional checksum. We used UDP to examine this checksum and to see how fragmentation is performed.
We then examined the ICMP unreachable error that is part of the new path MTU discovery feature (Section 2.9). We watched path MTU discovery using Traceroute and UDP. We also looked at the interaction between UDP and ARP: most ARP implementations retain only the most recently transmitted datagram while waiting for an ARP reply.
The ICMP source quench error can be sent by a system that is receiving IP datagrams faster than they can be processed. It is easy to generate these ICMP errors using UDP.
Exercises
11.1 In Section 11.5 we caused fragmentation on an Ethernet by writing a UDP datagram with 1473 bytes of user data. What is the smallest amount of user data that causes fragmentation on an Ethernet if IEEE 802 encapsulation is used instead?
11.2 Read RFC 791 [Postel 1981a] to determine why all fragments other than the last must have a length that is a multiple of 8 bytes.
11.3 Assume an Ethernet and a UDP datagram with 8192 bytes of user data. How many fragments are transmitted, and what are the offset and length of each fragment?
11.4 Continue the previous exercise, assuming that these fragments then pass across a SLIP link with an MTU of 552. Remember that the amount of data in each fragment (everything other than the IP header) must be a multiple of 8 bytes. How many fragments are transmitted, and what are the offset and length of each fragment?
11.5 An application using UDP sends a datagram that is fragmented into four pieces. Assume that fragments 1 and 2 reach the destination, while fragments 3 and 4 are lost. The application times out and retransmits the UDP datagram 10 seconds later, and this datagram is fragmented identically to the first transmission (same offsets and lengths). Assume that this time fragments 1 and 2 are lost but fragments 3 and 4 reach the destination. Also assume that the reassembly timer on the receiving host is 60 seconds, so that when fragments 3 and 4 of the retransmission arrive, fragments 1 and 2 of the first transmission have not yet been discarded. Can the receiver reassemble the IP datagram from the four fragments it now has?
11.6 How do we know that the fragments in Figure 11.15 really correspond to lines 5 and 6 in Figure 11.14?
11.7 After the host gemini had been up for 33 days, netstat showed that, of 48 million IP datagrams received, some had been dropped because of a bad header checksum. Of the 30 million TCP segments received, 20 were dropped because of a bad TCP checksum. Of the approximately 18 million UDP datagrams, however, not one was dropped because of a UDP checksum error. Give two reasons why. (Hint: See Figure 11.4.)
11.8 In our discussion of fragmentation we never said what happens to IP options in the IP header: are they copied as part of the IP header in each fragment, or left in the first fragment only? We have described the following IP options: record route (Section 7.3), timestamp (Section 7.4), and strict and loose source routing (Section 8.5). How would you expect fragmentation to handle these options? Check your answer against RFC 791.
11.9 In Figure 1.8 we said that incoming UDP datagrams are demultiplexed based on the destination UDP port number. Is that correct?