Chapter 10 Networks
Networking and Linux are closely related. In a very real sense, Linux is a product of the Internet and the World Wide Web (WWW). Its developers and users use the web to exchange information, ideas and program code, and Linux itself is often used to support the networking needs of organizations. This chapter describes how Linux supports the network protocols known collectively as TCP/IP.
The TCP/IP protocols were originally designed to support communication between computers on the ARPANET, a research network funded by the US government. The ARPANET pioneered networking concepts such as packet switching and protocol layering, in which one protocol uses the services of another. The ARPANET was retired in 1988, but its successors (NSF NET and the Internet) have grown even larger. What is now known as the World Wide Web grew out of the ARPANET and is itself supported by the TCP/IP protocols. Unix was used extensively on the ARPANET, and its first networking version was 4.3 BSD. Linux's networking implementation is modeled on 4.3 BSD in that it supports BSD sockets (with some extensions) and the full range of TCP/IP networking. This programming interface was chosen because it is popular and because it helps applications port between Linux and other Unix platforms.
10.1 An Introduction to TCP/IP Networking
This section gives an overview of the main principles of TCP/IP networking; it is not an exhaustive description. In an IP network every machine is assigned an IP address, a 32-bit number that uniquely identifies the machine. The WWW is a very large and rapidly growing IP network, and every machine connected to it must have a unique IP address. IP addresses are written as four numbers separated by dots, for example 16.42.0.9. The address is actually in two parts, the network address and the host address. The sizes of these parts vary (there are several classes of IP address), but taking 16.42.0.9 as an example, the network address is 16.42 and the host address is 0.9. The host address is further subdivided into a subnetwork address and a host address. Again taking 16.42.0.9 as an example, the subnetwork address is 16.42.0 and the host address is 16.42.0.9. This subdivision lets organizations split up their own networks. For example, if 16.42 is the network address of the ACME Computer Company, 16.42.0 could be subnet 0 and 16.42.1 could be subnet 1. These subnets can be set up separately, perhaps connected by leased telephone lines or even microwave links. IP addresses are assigned by the network administrator, and IP subnetworks are a good way of distributing the administration of the network: the administrator of an IP subnet is free to allocate IP addresses within that subnet.
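The split described above can be tried out with Python's standard ipaddress module. This is only a sketch: the prefix lengths (16 bits of network, 24 bits for the subnet) are assumptions chosen to match the 16.42 and 16.42.0 examples in the text.

```python
import ipaddress

# The example address from the text, which is really one 32-bit number.
addr = ipaddress.IPv4Address("16.42.0.9")
as_int = int(addr)

# Assumed split for illustration: 16.42 is the network, 16.42.0 is subnet 0.
network = ipaddress.IPv4Network("16.42.0.0/16")
subnet0 = ipaddress.IPv4Network("16.42.0.0/24")
subnet1 = ipaddress.IPv4Network("16.42.1.0/24")

# Masking off the network bits leaves the host part (0.9).
host_part = as_int & int(network.hostmask)

print(as_int, addr in network, addr in subnet0, addr in subnet1, host_part)
```

The address belongs to the 16.42 network and to subnet 0 but not to subnet 1, which is exactly the distinction a subnet administrator relies on.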
IP addresses are hard to remember; names are much easier. A host name is far easier to remember than 16.42.0.9, but there must be some mechanism to convert network names into IP addresses. Names can be defined statically in the /etc/hosts file, or Linux can ask a domain name server (DNS server) to resolve them. In that case the local host must know the IP address of one or more DNS servers, and these are specified in /etc/resolv.conf.
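As a rough illustration of the static case, the sketch below turns text in the usual /etc/hosts layout (an address followed by one or more names, with # starting a comment) into a lookup table. The sample contents and the host name host9.example are invented for the example.

```python
def parse_hosts(text):
    """Map every host name found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)    # the first entry for a name wins
    return table

# Hypothetical file contents.
sample = """
127.0.0.1   localhost
16.42.0.9   host9.example host9   # the example machine
"""

hosts = parse_hosts(sample)
print(hosts["host9"])
```

A name that is not in the table would then be passed on to one of the DNS servers listed in /etc/resolv.conf.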
Whenever you connect to another machine, for example to read a web page, its IP address is used to exchange data with that machine. The data is carried in IP packets, each of which has an IP header containing the IP addresses of the source and destination machines, a checksum and other useful information. The checksum lets the receiver tell whether an IP packet was corrupted in transit, for example by a noisy telephone line. The data an application transmits may be broken into smaller packets that are easier to handle. The size of IP packets varies with the transmission medium; Ethernet packets are generally larger than PPP packets. The destination host must reassemble the packets before handing the data to the receiving application. You can watch this fragmentation and reassembly happen if you load a web page containing many images over a fairly slow connection.
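To make the checksum idea concrete, here is a minimal sketch of the 16-bit ones' complement checksum that IPv4 applies to its header. The sample header bytes are an assumption used only for illustration; they are not taken from the text.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                        # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                         # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A 20-byte IPv4 header with its checksum field (bytes 10 and 11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)

# The receiver sums the header including the checksum; zero means "intact".
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
verdict = internet_checksum(filled)
print(hex(checksum), verdict)
```

Flipping any single bit of the filled-in header makes the receiver's result non-zero, which is how damage in transit is detected.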
Hosts on the same IP subnet can send IP packets directly to each other; all other IP packets are sent to a special host, a gateway. Gateways (or routers) are connected to more than one IP subnet, and they forward IP packets received on one subnet but destined for another. For example, if subnets 16.42.1.0 and 16.42.0.0 are connected by a gateway, any packet sent from subnet 0 to subnet 1 has to be directed to the gateway so that the gateway can route it. The local host builds up routing tables that allow it to route IP packets to the correct machine. For every IP destination there is an entry in the routing tables that tells Linux which host to send IP packets to so that they reach their destination. These routing tables are dynamic and change as the network topology changes.
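A routing decision of this kind can be sketched in a few lines. The table below is hypothetical: the gateway address 16.42.0.1 and the /24 prefixes are assumptions, and the lookup picks the most specific matching entry, which is the usual rule but is not spelled out in the text.

```python
import ipaddress

# Hypothetical routing table: (destination network, gateway).
# A gateway of None means the destination is on our own subnet.
ROUTES = [
    (ipaddress.ip_network("16.42.0.0/24"), None),          # subnet 0, direct
    (ipaddress.ip_network("16.42.1.0/24"), "16.42.0.1"),   # subnet 1, via gateway
    (ipaddress.ip_network("0.0.0.0/0"), "16.42.0.1"),      # everything else
]

def next_hop(destination: str) -> str:
    """Return the address that the packet should be handed to next."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in ROUTES if dest in net]
    net, gateway = max(matches, key=lambda m: m[0].prefixlen)  # most specific
    return destination if gateway is None else gateway

print(next_hop("16.42.0.9"), next_hop("16.42.1.7"))
```

A host on subnet 0 is reached directly, while a host on subnet 1 is reached by handing the packet to the gateway.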
Figure 10.1: TCP/IP Protocol Layers
The IP protocol is a carrier layer that other protocols use to transport their data. The Transmission Control Protocol (TCP) is a reliable end-to-end protocol that uses IP to transmit and receive its own packets. Just as IP packets have their own header, TCP has its own header. TCP is a connection-based protocol: two network applications are joined by a single virtual connection, even though there may be many subnetworks, gateways and routers between them. TCP reliably transmits and receives data between the two applications and guarantees that data will not be lost or duplicated. When TCP transmits a packet using IP, the data contained in the IP packet is the TCP packet itself. The IP layer on each communicating host is responsible for transmitting and receiving IP packets. The User Datagram Protocol (UDP) also uses the IP layer to transport its packets; unlike TCP, UDP is not a reliable protocol, but it offers a datagram service. Because several protocols use the IP layer, the receiving IP layer must know which upper protocol layer to give the data in each IP packet to. For this reason every IP header has a one-byte protocol identifier, and the receiving IP layer uses it to decide which layer to pass the data up to. When applications communicate via TCP/IP they must specify not only the target's IP address but also the port address of the application. A port address uniquely identifies an application, and standard network applications use standard port addresses; for example, web servers use port 80. The registered port addresses can be seen in /etc/services.
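The protocol identifier byte can be seen by unpacking an IPv4 header by hand. The sketch below assumes a plain 20-byte header without options; the sample bytes are invented for the example, and the numbers 6 (TCP) and 17 (UDP) are the standard assigned protocol numbers.

```python
import struct

PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}   # assigned IP protocol numbers

def upper_protocol(ip_header: bytes) -> str:
    """Read the one-byte protocol identifier from a 20-byte IPv4 header."""
    fields = struct.unpack("!BBHHHBBH4s4s", ip_header[:20])
    protocol = fields[6]                        # tenth byte of the header
    return PROTOCOLS.get(protocol, f"unknown({protocol})")

# Sample header: its protocol byte is 0x11, so the payload is a UDP packet.
header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
layer = upper_protocol(header)
print(layer)
```

The receiving IP layer makes the same decision for every packet before passing the payload upwards.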
This layering of protocols does not stop with TCP, UDP and IP. The IP protocol layer itself uses many different physical media to carry IP packets from one host to another. These media may add their own protocol headers. The Ethernet layer is one example; PPP and SLIP are others. An Ethernet network allows many hosts to be connected to a single physical cable at the same time. Every transmitted Ethernet frame can be seen by all connected hosts, so every Ethernet device has a unique address. An Ethernet frame sent to that address is received by the addressed device and ignored by all the other hosts on the network. The unique address is built into each Ethernet device when it is manufactured, and is usually kept in an SROM on the network card. Ethernet addresses are 6 bytes long, for example 08-00-2B-00-49-A4. Some Ethernet addresses are reserved for multicast, and Ethernet frames sent to these addresses are received by all hosts on the network. Because Ethernet frames can carry many different protocols (as data), they, like IP packets, contain a protocol identifier in their headers. This lets the Ethernet layer receive IP packets correctly and pass them to the IP layer.
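The Ethernet header can be picked apart in the same way. This sketch assumes the common 14-byte header layout (destination, source, then a 16-bit protocol identifier); the source address and payload are made up, and 0x0800 is the identifier used for IP.

```python
import struct

ETHERTYPES = {0x0800: "IP", 0x0806: "ARP"}

def format_mac(raw: bytes) -> str:
    return "-".join(f"{b:02X}" for b in raw)

def parse_ethernet(frame: bytes):
    """Split a frame into destination, source, protocol name and payload."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    name = ETHERTYPES.get(ethertype, hex(ethertype))
    return format_mac(dst), format_mac(src), name, frame[14:]

# Hypothetical frame addressed to the example address from the text.
frame = bytes.fromhex("08002b0049a4" "08002b001122" "0800") + b"ip-packet"
dst, src, proto, payload = parse_ethernet(frame)
print(dst, src, proto, payload)
```

A device whose own address is not the destination (and which is not receiving a multicast address) simply ignores the frame.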
In order to send an IP packet over a multi-connection protocol such as Ethernet, the IP layer must find the Ethernet address of the IP host. This is because IP addresses are simply an addressing concept, whereas Ethernet devices have their own physical addresses. IP addresses can be assigned and reassigned by network administrators at will, but the network hardware responds only to Ethernet frames carrying its own physical address or a multicast address. Linux uses the Address Resolution Protocol (ARP) to let machines translate IP addresses into real hardware addresses such as Ethernet addresses. A host that wants to know the hardware address for an IP address sends an ARP request packet containing that IP address to all nodes, using a multicast address. The target host that owns the IP address responds with an ARP reply containing its physical hardware address. ARP is not restricted to Ethernet devices; it can resolve IP addresses on other physical media, for example FDDI. Network devices that cannot use ARP are marked so that Linux does not attempt to ARP on them. There is also the reverse function, the Reverse Address Resolution Protocol (RARP), which translates physical network addresses into IP addresses. It is used by gateways, which respond to ARP requests on behalf of IP addresses in the remote network.
10.2 The Linux TCP/IP Networking Layers
Figure 10.2: Linux Networking Layers
Just like the network protocols themselves, Linux implements the Internet protocol address family as a series of connected layers of software, as Figure 10.2 shows. BSD sockets are supported by generic socket management software concerned only with BSD sockets. Supporting this is the INET socket layer, which manages the communication end points for the IP-based protocols TCP and UDP. UDP (User Datagram Protocol) is a connectionless protocol, whereas TCP (Transmission Control Protocol) is a reliable end-to-end protocol. When UDP packets are transmitted, Linux neither knows nor cares whether they arrive safely at their destination. TCP packets are numbered, and both ends of the TCP connection make sure that transmitted data is received correctly. The IP layer contains the code that implements the Internet Protocol. This code prepends IP headers to transmitted data and knows how to route incoming IP packets to either the TCP or the UDP layer. Underneath the IP layer, supporting all of Linux's networking, are the network devices, for example PPP and Ethernet. Network devices are not always physical devices; some, like the loopback device, are purely software devices. Unlike standard Linux devices, which are created with the mknod command, network devices appear only once the underlying software has found and initialized them. You will see an eth0 device only when you have built a kernel with the appropriate Ethernet device driver in it. The ARP protocol sits between the IP layer and the protocols that support ARP.
10.3 BSD Socket Interface
This is a general interface that not only supports various forms of networking but is also an inter-process communication mechanism. A socket describes one end of a communication link; two communicating processes each have a socket describing their own end of the link between them. Sockets can be thought of as a special case of pipes but, unlike pipes, sockets have no limit on the amount of data they can contain. Linux supports several classes of socket, known as address families, because each class has its own method of addressing its communications. Linux supports the following socket address families or domains:
UNIX
Unix domain sockets
INET
The Internet address family, which supports communication via the TCP/IP protocols
AX25
Amateur radio X25
IPX
Novell IPX
APPLETALK
Appletalk DDP
X25
X25
There are several socket types, and each represents the type of service that supports the connection. Not all address families support all types of service. Linux BSD sockets support the following socket types:
Stream
These sockets provide reliable, two-way, sequenced data streams, with a guarantee that data cannot be lost, corrupted or duplicated in transit. Stream sockets are supported by the TCP protocol of the Internet (INET) address family.
Datagram
These sockets also provide two-way data transfer but, unlike stream sockets, there is no guarantee that messages will arrive. Even if they do arrive, there is no guarantee that they will arrive in order, or that they will not be duplicated or corrupted. This type of socket is supported by the UDP protocol of the Internet address family.
Raw
This allows processes direct (hence "raw") access to the underlying protocols. It is possible, for example, to open a raw socket to an Ethernet device and see raw IP data traffic.
Reliable Delivered Messages
These are very like datagram sockets, but the data is guaranteed to arrive.
Sequenced Packets
These are like stream sockets, except that the data packet sizes are fixed.
Packet
This is not a standard BSD socket type; it is a Linux-specific extension that allows processes to access packets directly at the device level.
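From user space, an address family and a socket type are simply the first two arguments given when a socket is created. The sketch below uses Python's socket module, which wraps the BSD socket calls; it creates only the two INET types that need no special privileges.

```python
import socket

# A stream socket of the Internet address family: carried by TCP.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# A datagram socket of the Internet address family: carried by UDP.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

tcp_type, udp_type = tcp.type, udp.type
print(tcp.family, tcp_type)
print(udp.family, udp_type)

# The other types in the list have constants too; creating such sockets may
# need privileges or an address family that supports them.
other_types = [socket.SOCK_RAW, socket.SOCK_RDM, socket.SOCK_SEQPACKET]

tcp.close()
udp.close()
```

Asking for a type that the chosen address family does not support fails at creation time, which reflects the remark that not all families support all types of service.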
Processes that communicate using sockets follow a client-server model. A server provides a service and clients make use of it. One example is a web server, which provides web pages, and a web client, or browser, which reads those pages. A server using sockets first creates a socket and then binds a name to it. The format of this name depends on the socket's address family, and it is in effect the local address of the server. The socket's name or address is specified using the sockaddr data structure. An INET socket also has a port address bound to it. The registered port numbers can be seen in /etc/services; for example, the port number for a web server is 80. Having bound an address to the socket, the server then listens for incoming connection requests that specify the bound address. The originator of the request, the client, creates a socket and makes a connection request on it, specifying the target address of the server. For an INET socket the address of the server is its IP address and its port number. These incoming requests must find their way up through the various protocol layers and then wait on the server's listening socket. Once the server has received an incoming request it either accepts or rejects it. If the request is to be accepted, the server must create a new socket to accept it on. Once a socket has been used to listen for incoming connection requests, it cannot be used to support a connection. With the connection established, both ends are free to send and receive data. Finally, when the connection is no longer needed, it can be shut down. Care is taken to ensure that data packets in transit are dealt with correctly.
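The whole sequence (create, bind, listen, connect, accept, exchange, close) can be walked through in one process over the loopback interface. This is a sketch only: the address 127.0.0.1, the use of port 0 to ask for any free port, and the message contents are choices made for the example.

```python
import socket

# Server side: create a socket, bind a name (address and port), then listen.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the system pick a free port
server.listen(1)
server.settimeout(5)
address = server.getsockname()

# Client side: create a socket and request a connection to the server address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect(address)

# Accepting yields a new socket for the connection; the listener stays a listener.
connection, peer = server.accept()
connection.settimeout(5)

client.sendall(b"GET /")
request = connection.recv(1024)
connection.sendall(b"page")
reply = client.recv(1024)

for s in (client, connection, server):
    s.close()
print(request, reply)
```

Note that the server ends up with two sockets: the original one, which only ever listens, and the one returned by accept, which carries the data.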
The exact meaning of operations on a BSD socket depends on its underlying address family. Setting up a TCP/IP connection is very different from setting up an amateur radio X.25 connection. Like the virtual file system, Linux abstracts the socket interface: the BSD socket layer is concerned with the BSD socket interface presented to applications, and it is in turn supported by independent, address-family-specific software. When the kernel is initialized, the address families built into the kernel register themselves with the BSD socket interface. Later, as applications create and use BSD sockets, an association is made between each BSD socket and its supporting address family. This association is made through cross-linked data structures and tables of address-family-specific support routines. For example, there is an address-family-specific socket creation routine that the BSD socket interface uses whenever an application creates a new socket.
When the kernel is configured, a number of address families and protocols are built into the protocols vector. Each is represented by its name, for example "INET", and the address of its initialization routine. When the socket interface is initialized, each protocol's initialization routine is called. For the socket address families, this results in them registering a set of protocol operations. This is a set of routines, each of which performs a particular operation specific to that address family. The registered protocol operations are kept in the pops vector, a vector of pointers to proto_ops data structures.
The proto_ops data structure consists of the address family type and a set of pointers to socket operation routines specific to a particular address family. The pops vector is indexed by the address family identifier, for example the Internet address family identifier (AF_INET is 2).
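The registration scheme can be modelled in a few lines. This is a toy model, not kernel code: the class and function names (ProtoOps, register_family, create_socket, inet_create) and the table size are assumptions made for the sketch; only the idea of a vector indexed by address family identifier, and the value 2 for AF_INET, come from the text.

```python
AF_UNIX, AF_INET = 1, 2            # AF_INET is 2, as stated in the text
NPROTO = 16                        # assumed size of the vector

class ProtoOps:
    """Stand-in for proto_ops: a family identifier plus operation routines."""
    def __init__(self, family, create):
        self.family = family
        self.create = create

pops = [None] * NPROTO             # indexed by address family identifier

def register_family(ops):
    pops[ops.family] = ops

def inet_create(sock_type):        # hypothetical family-specific routine
    return {"family": "INET", "type": sock_type}

def create_socket(family, sock_type):
    ops = pops[family]
    if ops is None:
        raise OSError("address family not registered")
    return ops.create(sock_type)   # dispatch through the registered operations

register_family(ProtoOps(AF_INET, inet_create))
print(create_socket(AF_INET, "stream"))
```

The generic layer never needs to know what an INET socket is; it only indexes the vector and calls whatever routine was registered.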
Figure 10.3: Linux BSD Socket Data Structures
10.4 The INET Socket Layer
The INET socket layer supports the Internet address family, which contains the TCP/IP protocols. As discussed above, these protocols are layered, one protocol using the services of another. Linux's TCP/IP code and data structures reflect this layering. Its interface with the BSD socket layer is the set of Internet address family socket operations, which it registers with the BSD socket layer when the network is initialized. These are kept in the pops vector along with the other registered address families. The BSD socket layer calls the INET layer socket support routines from the registered INET proto_ops data structure to perform work for it. For example, a BSD socket create request that gives the address family as INET uses the underlying INET socket create function. In each of these operations, the BSD socket layer passes the socket data structure that represents the BSD socket to the INET layer. Rather than clutter the BSD socket with TCP/IP-specific information, the INET socket layer uses its own data structure, the sock, which it links to the BSD socket data structure. This linkage can be seen in Figure 10.3. The sock data structure is linked to the BSD socket data structure through the data pointer in the BSD socket, so that subsequent INET socket calls can easily retrieve the sock data structure. The sock data structure's protocol operations pointer is also set up at creation time, and it depends on the protocol requested. If TCP is requested, the sock data structure's protocol operations pointer will point to the set of TCP protocol operations needed for a TCP connection.
10.4.1 Creating a BSD Socket
The request to create a new socket passes identifiers for its address family, socket type and protocol.
First, the requested address family is used to search the pops vector for a matching address family. A particular address family may be implemented as a kernel module, in which case the kernel daemon must load the module before work can continue. A new socket data structure is then allocated to represent the BSD socket. In fact the socket data structure is physically part of the VFS inode data structure, so allocating a socket really means allocating a VFS inode. This may seem strange unless you consider that sockets can be operated on in just the same way as ordinary files. As all files are represented by a VFS inode data structure, BSD sockets must also be represented by a VFS inode in order to support file operations.
The newly created BSD socket data structure contains a pointer to the address-family-specific socket routines, and this is set to the proto_ops data structure retrieved from the pops vector. Its type is set to the socket type requested: SOCK_STREAM, SOCK_DGRAM and so on. The address-family-specific creation routine is then called using the address kept in the proto_ops data structure.
A free file descriptor is allocated from the current process's fd vector, and the file data structure it points to is initialized. This includes setting the file operations pointer to point to the set of BSD socket file operations supported by the BSD socket interface. Any future operations are directed to the socket interface, which in turn passes them to the supporting address family by calling its address family operation routines.
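The file-like nature of a socket is visible from user space. The sketch below assumes a Unix-like system: it creates a socket, obtains its file descriptor, and asks the file system layer what kind of object the descriptor refers to.

```python
import os
import socket
import stat

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = s.fileno()                      # the file descriptor allocated for the socket
mode = os.fstat(fd).st_mode          # file status, fetched through the descriptor
is_socket = stat.S_ISSOCK(mode)      # the file type recorded for it
print(fd, is_socket)
s.close()
```

The descriptor is an ordinary small integer, and generic file calls such as fstat work on it, which is what the inode-based representation makes possible.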
10.4.2 Binding an Address to an INET BSD Socket
In order to listen for incoming Internet connection requests, each server must create an INET BSD socket and bind its address to it. The bind operation is mostly handled within the INET socket layer, with some support from the underlying TCP and UDP protocol layers. A socket that is having an address bound to it cannot be in use for any other communication; that is, the socket's state must be TCP_CLOSE. The sockaddr passed to the bind operation contains the IP address to be bound to and, optionally, a port number. Normally the IP address bound to is one that has been assigned to a network device that supports the INET address family and whose interface is up and usable. You can check which network interfaces are currently active in the system. The IP address may also be the IP broadcast address of either all 1's or all 0's. These are special addresses that mean "send to everybody". If the machine is acting as a transparent proxy or firewall, the IP address can be specified as any IP address, but only processes with superuser privileges can bind to any IP address. The IP address bound to is saved in the recv_addr and saddr fields of the sock data structure. The port number is optional; if it is not specified, the supporting network is asked for a free one. By convention, port numbers below 1024 cannot be used by processes without superuser privileges. If the underlying network does allocate a port number, it always allocates one greater than 1024.
As packets are received by the underlying network devices, they must be routed to the correct INET and BSD sockets so that they can be processed. For this reason UDP and TCP maintain hash tables that are used to look up the addresses in incoming IP messages and direct them to the correct socket/sock pair. TCP is a connection-oriented protocol, so processing TCP packets involves more information than processing UDP packets.
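Two of the points above can be checked from user space: leaving the port unspecified (port 0 in the BSD socket interface) makes the system choose one, and a port that is already bound is refused to a second socket. The loopback address is an assumption made for the example.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 0))                  # port 0 means "no port specified"
bound_ip, bound_port = s.getsockname()    # the address and port actually bound

# A second socket asking for the same address and port is refused.
t = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    t.bind((bound_ip, bound_port))
    conflict = False
except OSError:
    conflict = True
finally:
    t.close()
    s.close()

print(bound_port, conflict)
```

The automatically chosen port lies outside the range reserved for privileged processes.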
UDP maintains a hash table of allocated UDP ports, the udp_hash table. It consists of pointers to sock data structures, indexed by a hash function based on the port number. Because the UDP hash table is much smaller than the number of permissible port numbers (udp_hash is only 128, or UDP_HTABLE_SIZE, entries long), some entries in the table point to a chain of sock data structures linked together through each sock's next pointer.
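A chained hash table of this shape can be sketched as follows. The table size of 128 and the next pointer come from the text; the hash function (the low bits of the port number) and the function names are assumptions, since the text says only that the hash is based on the port number.

```python
UDP_HTABLE_SIZE = 128

class Sock:
    def __init__(self, port):
        self.port = port
        self.next = None                     # next sock in the same hash chain

udp_hash = [None] * UDP_HTABLE_SIZE

def bucket(port):
    return port & (UDP_HTABLE_SIZE - 1)      # assumed hash function

def udp_insert(sk):
    index = bucket(sk.port)
    sk.next = udp_hash[index]                # push onto the front of the chain
    udp_hash[index] = sk

def udp_lookup(port):
    sk = udp_hash[bucket(port)]
    while sk is not None and sk.port != port:
        sk = sk.next                         # walk the chain on a collision
    return sk

for p in (53, 181, 2049):
    udp_insert(Sock(p))
print(bucket(53), bucket(181), udp_lookup(181).port)
```

Ports 53 and 181 fall into the same entry under this hash, so the second lookup has to follow the chain.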
TCP is much more complex, as it maintains several hash tables. However, TCP does not actually add the sock data structure being bound to its hash tables during the bind operation; it merely checks that the requested port number is not currently in use. The sock data structure is added to TCP's hash tables during the listen operation.
Review note: What about the route entered?
10.4.3 Making a Connection on an INET BSD Socket
Once a socket has been created, and provided it has not been used to listen for inbound connection requests, it can be used to make outbound connection requests. For connectionless protocols such as UDP this socket operation does not do very much, but for connection-oriented protocols such as TCP it involves building a virtual connection between two applications.
An outbound connection can only be made on an INET BSD socket that is in the right state; that is, one that does not already have a connection established and is not being used to listen for inbound connections. This means that the BSD socket data structure must be in state SS_UNCONNECTED. The UDP protocol does not establish virtual connections between applications. Any message sent is a datagram, which may or may not reach its destination. UDP does, however, support the connect BSD socket operation. A connect operation on a UDP INET BSD socket simply sets up the address of the remote application: its IP address and its IP port number. It also sets up a cache of the routing table entry, so that UDP packets sent on this BSD socket do not need to consult the routing database again (unless the route becomes invalid). The cached routing information is pointed at by the ip_route_cache pointer in the INET sock data structure. If no addressing information is given when a message is sent, this cached routing and IP addressing information is used automatically. UDP moves the sock's state to TCP_ESTABLISHED.
For a connect operation on a TCP BSD socket, TCP must build a TCP message containing the connection information and send it to the IP destination given. The TCP message contains information about the connection: a unique starting message sequence number, the maximum message size that the initiating host can manage, the transmit and receive window size, and so on. Within TCP all messages are numbered, and the initial sequence number is used as the first message number. Linux chooses a reasonably random value to guard against malicious protocol attacks. Every message transmitted by one end of the TCP connection and successfully received by the other is acknowledged, to say that it arrived correctly. Unacknowledged messages are retransmitted. The transmit and receive window size is the number of messages that may be outstanding without an acknowledgement having been sent. The maximum message size is based on the network device in use at the initiating end of the request. If the receiving end's network device supports a smaller maximum message size, the connection uses the smaller of the two. The application making the outbound TCP connection request must now wait for the target application to accept or reject the connection request. As the TCP sock is now expecting incoming messages, it is added to the tcp_listening_hash so that incoming TCP messages can be directed to this sock data structure. TCP also starts timers, so that the outbound connection request can be timed out if the target application does not respond.
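The UDP case described in this section is easy to observe. In the sketch below, which assumes the loopback address and a system-chosen port, connect on a datagram socket sends nothing; it only records the remote address, after which messages can be sent without naming a destination each time.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5)
target = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.connect(target)               # no packet is sent; the peer is only recorded
peer = sender.getpeername()

sender.send(b"datagram")             # no destination address needed on each send
data, source = receiver.recvfrom(1024)

sender.close()
receiver.close()
print(peer, data)
```

A TCP connect at the same point would instead exchange messages with the remote host and wait for its answer.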
10.4.4 Listening on an INET BSD Socket
Once a socket has had an address bound to it, it may listen for incoming connection requests that specify the bound address. A network application can listen on a socket without first binding an address to it; in this case the INET socket layer finds an unused port number (for this protocol) and automatically binds it to the socket. The listen socket function moves the socket into state TCP_LISTEN and does any network-specific work needed to allow incoming connections.
For UDP sockets, changing the socket's state is enough, but TCP now adds the socket's sock data structure to two hash tables, since it is now active. These are the tcp_bound_hash table and the tcp_listening_hash table. Both are indexed by a hash function based on the IP port number.
Whenever an incoming TCP connection request is received for an active listening socket, TCP builds a new sock data structure to represent it. This sock data structure becomes the bottom half of the TCP connection when it is eventually accepted. TCP also clones the incoming sk_buff containing the connection request and queues it on the receive_queue of the listening sock data structure. The cloned sk_buff contains a pointer to the newly created sock data structure.
10.4.5 Accepting Connection Requests
UDP does not support the concept of connections, so accepting INET socket connection requests applies only to the TCP protocol. An accept operation on a listening socket causes a new socket data structure to be cloned from the original listening socket. The accept operation is then passed to the supporting protocol layer, in this case INET, to accept any incoming connection requests. The INET protocol layer fails the accept operation if the underlying protocol, say UDP, does not support connections. Otherwise the accept operation is passed through to the real protocol, in this case TCP. The accept operation can be either blocking or non-blocking. In the non-blocking case, if there are no incoming connections to accept, the accept operation fails and the newly created socket data structure is thrown away. In the blocking case, the network application performing the accept operation is added to a wait queue and suspended until a TCP connection request is received. Once a connection request has been received, the sk_buff containing the request is discarded and the sock data structure is returned to the INET socket layer, where it is linked to the new socket data structure created earlier. The file descriptor (fd) of the new socket is returned to the network application, which can then use that file descriptor in socket operations on the newly created INET BSD socket.
10.5 The IP Layer
10.5.1 Socket Buffers
One of the problems of having many layers of network protocols, each using the services of another, is that each protocol needs to add protocol headers and tails to data as it is transmitted and to remove them as it processes received data. This makes passing data buffers between the protocols difficult, because each layer needs to find where its particular protocol headers and tails are. One solution is to copy buffers at each layer, but that would be inefficient. Instead, Linux uses socket buffers, or sk_buffs, to pass data between the protocol layers and the network device drivers. sk_buffs contain pointer and length fields that allow each protocol layer to manipulate the application data via standard functions or "methods".
Figure 10.4: The Socket Buffer (sk_buff)
Figure 10.4 shows the sk_buff data structure; each sk_buff has a block of data associated with it. The sk_buff has four data pointers, which are used to manipulate and manage the socket buffer's data:
head
Points to the start of the data area in memory. This pointer is fixed when the sk_buff and its associated data block are allocated.
data
Points to the current start of the protocol data. This pointer varies depending on which protocol layer currently owns the sk_buff.
tail
Points to the current end of the protocol data. Again, this pointer varies depending on which protocol layer currently owns the sk_buff.
end
Points to the end of the data area in memory. This pointer is fixed when the sk_buff and its associated data block are allocated.
There are also two length fields, len and truesize, which describe the length of the current protocol packet and the total size of the data buffer respectively. The sk_buff handling code provides standard operations for adding and removing protocol headers and tails to and from the application data. These safely manipulate the data, tail and len fields in the sk_buff:
push
Moves the data pointer towards the start of the data area and increments len. It is used to add a protocol header to the start of the data to be transmitted.
pull
Moves the data pointer away from the start of the data area, towards its end, and decrements len. It is used to remove a protocol header from the start of data that has been received.
put
Moves the tail pointer towards the end of the data area and increments len. It is used to add data or protocol information to the end of the data to be transmitted.
trim
Moves the tail pointer towards the start of the data area and decrements len. It is used to remove data or protocol information from the end of data that has been received.
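The four pointers and four operations can be modelled with offsets into a single byte array. This is a toy model rather than the kernel's implementation: the class name, the headroom parameter and the header strings are assumptions made for the sketch, and offsets stand in for pointers.

```python
class SkBuff:
    """Toy socket buffer: offsets into one bytearray instead of pointers."""

    def __init__(self, size, headroom):
        self.buf = bytearray(size)
        self.head, self.end = 0, size         # fixed at allocation
        self.data = self.tail = headroom      # empty, with room left for headers
        self.len = 0

    def put(self, payload):                   # add bytes at the tail
        if self.tail + len(payload) > self.end:
            raise ValueError("no room at the tail")
        self.buf[self.tail:self.tail + len(payload)] = payload
        self.tail += len(payload)
        self.len += len(payload)

    def push(self, header):                   # add a header in front of data
        if self.data - len(header) < self.head:
            raise ValueError("no room at the head")
        self.data -= len(header)
        self.buf[self.data:self.data + len(header)] = header
        self.len += len(header)

    def pull(self, n):                        # strip n bytes from the front
        if n > self.len:
            raise ValueError("not enough data")
        removed = bytes(self.buf[self.data:self.data + n])
        self.data += n
        self.len -= n
        return removed

    def trim(self, new_len):                  # cut the packet down to new_len
        if new_len > self.len:
            raise ValueError("cannot grow with trim")
        self.tail = self.data + new_len
        self.len = new_len

    def contents(self):
        return bytes(self.buf[self.data:self.tail])

skb = SkBuff(size=64, headroom=16)
skb.put(b"payload")        # the application data
skb.push(b"TCP:")          # each layer prepends its header on the way down
skb.push(b"IP:")
sent = skb.contents()
print(sent, skb.len)
```

On the receiving side each layer would call pull to step past its own header, so the same block of memory is handed from layer to layer without being copied.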
The sk_buff data structure also contains pointers that are used when it is stored in doubly linked circular lists of sk_buffs during processing. There are generic sk_buff routines for adding sk_buffs to the front and back of these lists and for removing them.
10.5.2 Receiving IP Packets
The chapter on device drivers describes how Linux's network drivers are built into the kernel and initialized. This results in a series of device data structures linked together in the dev_base list. Each device data structure describes its device and provides a set of callback routines that the network protocol layers call when they need the network driver to perform work. These functions are mostly concerned with transmitting data and with the network device's addresses. When a network device receives packets from its network, it must convert the received data into sk_buff data structures. These sk_buffs are added to the backlog queue by the network driver as they are received. If the backlog queue grows too long, the received sk_buffs are discarded. The network bottom half is then flagged as ready to run, because there is work to do.
When the network bottom half handler is run by the scheduler, it first processes any network packets waiting to be transmitted and then works through the backlog queue of sk_buffs, determining which protocol layer to pass each received packet to.
When the Linux networking layers were initialized, each protocol registered itself by adding a packet_type data structure to either the ptype_all list or the ptype_base hash table. The packet_type data structure contains the protocol type, a pointer to a network device, a pointer to the protocol's receive data processing routine and, finally, a pointer to the next packet_type data structure in the list or hash chain. The ptype_all chain is used to snoop all packets received from any network device and is not normally used. The ptype_base hash table is hashed by protocol identifier and is used to decide which protocol should receive an incoming network packet. The network bottom half matches the protocol type of each incoming sk_buff against one or more of the packet_type entries in either table. The protocol may match more than one entry, for example when all network traffic is being snooped, and in this case the sk_buff is cloned. The sk_buff is then passed to the matching protocol's handling routine.
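The registration and matching just described can be modelled as follows. This is a sketch only: the function names, the hash table size and the hash function are assumptions, and copying the bytes stands in for cloning an sk_buff.

```python
ETH_P_IP, ETH_P_ARP = 0x0800, 0x0806
PTYPE_HASH_SIZE = 16                      # assumed size of the hash table

ptype_all = []                            # handlers that see every packet
ptype_base = [[] for _ in range(PTYPE_HASH_SIZE)]

def register_packet_type(handler, protocol=None):
    """Register a receive routine; protocol=None means 'all packets'."""
    if protocol is None:
        ptype_all.append(handler)
    else:
        ptype_base[protocol % PTYPE_HASH_SIZE].append((protocol, handler))

def deliver(protocol, packet):
    """Hand one received packet to every matching handler."""
    for handler in ptype_all:
        handler(bytes(packet))            # an extra match gets its own copy
    for proto, handler in ptype_base[protocol % PTYPE_HASH_SIZE]:
        if proto == protocol:
            handler(packet)

seen = []
register_packet_type(lambda p: seen.append(("ip", p)), ETH_P_IP)
register_packet_type(lambda p: seen.append(("arp", p)), ETH_P_ARP)
register_packet_type(lambda p: seen.append(("snoop", p)))
deliver(ETH_P_IP, b"packet")
print(seen)
```

The IP packet reaches both the snooping handler and the IP handler, but never the ARP handler.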
10.5.3 Sending IP Packets
Packets are transmitted by applications exchanging data, or else they are generated by the network protocols as they establish connections or support established ones. Whichever way the data is generated, an sk_buff is built to contain it, and the protocol layers add their various headers as it passes through them.
The sk_buff must be passed to a network device to be transmitted. First, though, the protocol (IP, for example) needs to decide which network device to use. This depends on the best route for the packet. For a computer connected to a single network by modem, say via the PPP protocol, the routing choice is easy: the packet is sent either to the local host through the loopback device or to the gateway at the other end of the PPP modem connection. For a computer on a network shared with many other machines, the choice is harder.
For every IP packet transmitted, IP uses the routing tables to resolve the route for the destination IP address. Each destination successfully looked up in the routing tables returns an rtable data structure describing the route to use. This includes the source IP address to use, the address of the network device data structure and, sometimes, a prebuilt hardware header. The hardware header is specific to the network device and contains the source and destination physical addresses and other media-specific information. If the network device is an Ethernet device, the hardware header is as shown in Figure 10.1, and the source and destination addresses are physical Ethernet addresses. The hardware header is cached with the route because it must be added to every IP packet transmitted on that route. The hardware header may contain physical addresses that have to be resolved using the ARP protocol. In that case the outgoing packet is held until the address has been resolved. Once the address has been resolved, the hardware header is cached so that later IP packets sent through this interface do not need to use ARP.
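As an illustration of what a prebuilt Ethernet hardware header contains and why caching it is cheap, the following sketch builds the 14-byte header (destination address, source address, protocol type) and keeps it in a dictionary. The function names and the dictionary cache are assumptions made for this example; they are not the kernel's own data structures.

```python
import struct

ETH_P_IP = 0x0800  # EtherType value for IPv4

_header_cache = {}  # hypothetical cache: (dst, src, type) -> header bytes


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError("an Ethernet address has exactly 6 bytes")
    return raw


def build_ethernet_header(dst_mac, src_mac, ethertype=ETH_P_IP):
    """Return the 14-byte header: destination, source, protocol type."""
    return struct.pack("!6s6sH", mac_to_bytes(dst_mac),
                       mac_to_bytes(src_mac), ethertype)


def cached_header(dst_mac, src_mac, ethertype=ETH_P_IP):
    """Build the header once, then reuse it for later packets."""
    key = (dst_mac, src_mac, ethertype)
    if key not in _header_cache:
        _header_cache[key] = build_ethernet_header(dst_mac, src_mac, ethertype)
    return _header_cache[key]
```

In this model the destination address is assumed to be known already; in practice it is the value that ARP has to supply before the header can be completed.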
10.5.4 Data Fragmentation
Every network device has a maximum packet size and cannot transmit or receive a packet larger than this. The IP protocol allows for this by fragmenting data into smaller units that the network device can handle. The IP header includes a fragment field, which contains a flag and the fragment offset.
When an IP packet is ready to be transmitted, IP finds the network device on which to send it. This device is found from the IP routing tables. Each device data structure has an mtu field that gives its maximum transfer unit in bytes. If the device's mtu is smaller than the size of the IP packet waiting to be transmitted, the packet must be broken into smaller, mtu-sized fragments. Each fragment is represented by an sk_buff; its IP header is marked to show that it is a fragment and to record the offset of its data within the original packet. The last packet is marked as the last IP fragment. If IP cannot allocate an sk_buff during fragmentation, the transmission fails.
Receiving IP fragments is a little harder than sending them, because the fragments can arrive in any order and all of them must be received before they can be reassembled. Each IP packet received is checked to see whether it is an IP fragment. When the first fragment of a message arrives, IP creates a new ipq data structure and links it into the ipqueue list of IP fragments awaiting reassembly. As more fragments are received, the correct ipq structure is found and a new ipfrag data structure is created to describe each fragment. Each ipq structure uniquely identifies a fragmented IP frame by its source and destination IP addresses, the upper-layer protocol identifier and the identifier of the IP frame. When all of the fragments have been received, they are combined into a single sk_buff and passed up to the next protocol layer for processing. Each ipq contains a timer that is restarted each time a valid fragment is received. If the timer expires, the ipq structure and its ipfrag structures are discarded and the message is presumed to have been lost in transit. It is then up to the higher-level protocols to retransmit the message.
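The fragmentation and reassembly logic can be sketched in a few lines of user-space code. The sketch below assumes a 20-byte IP header without options and represents each fragment as a tuple of (offset in 8-byte units, more-fragments flag, data), which mirrors the fragment field of the IPv4 header. The function names are invented for this example, and the timer and the ipq/ipfrag bookkeeping are left out.

```python
def fragment(payload, mtu, ip_header_len=20):
    """Split payload into (offset_units, more_fragments, data) tuples.

    Offsets are counted in 8-byte units, as in the IPv4 header, so every
    fragment except the last carries a multiple of 8 data bytes.
    """
    max_data = (mtu - ip_header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + max_data < len(payload)
        fragments.append((start // 8, more, chunk))
    return fragments


def reassemble(fragments):
    """Rebuild the payload from fragments received in any order.

    Returns None while any fragment is still missing.
    """
    total = None
    pieces = {}
    for offset_units, more, chunk in fragments:
        pieces[offset_units * 8] = chunk
        if not more:  # the last fragment fixes the total length
            total = offset_units * 8 + len(chunk)
    if total is None:
        return None
    data = bytearray()
    while len(data) < total:
        chunk = pieces.get(len(data))
        if not chunk:  # a gap: keep waiting for more fragments
            return None
        data += chunk
    return bytes(data)
```

With a 1500-byte MTU each fragment carries at most 1480 data bytes, so a 4000-byte payload is split into three fragments.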
10.6 The Address Resolution Protocol (ARP)
The role of the Address Resolution Protocol is to translate IP addresses into physical hardware addresses such as Ethernet addresses. IP needs this translation just before it passes the data (in the form of an sk_buff) to the device driver for transmission.
IP performs various checks to see whether the device needs a hardware header and, if it does, whether the hardware header for the packet needs to be rebuilt. Linux caches hardware headers so that it can avoid rebuilding them frequently. If the hardware header does need rebuilding, IP calls the device-specific hardware header rebuilding routine. All Ethernet devices use the same header rebuilding routine, which translates the destination IP address into a physical address.
The ARP protocol itself is very simple and consists of two message types, the ARP request and the ARP reply. The ARP request contains the IP address that needs to be translated, and the reply (hopefully) contains the translation of that IP address, the hardware address. The ARP request is broadcast to all hosts connected to the network, so on an Ethernet, for example, every machine connected to the Ethernet sees the ARP request. The machine that owns the IP address in the request responds with an ARP reply containing its own physical address.
The ARP protocol layer in Linux is built around a table of arp_table data structures, each of which describes one translation from an IP address to a physical address. These entries are created when IP addresses need to be translated and are removed as they become stale. Each arp_table structure has the following fields:
last used: the time that this ARP entry was last used
last updated: the time that this ARP entry was last updated
flags: the state of this entry, for example whether it is complete
IP address: the IP address that this entry describes
hardware address: the translated hardware address
hardware header: a pointer to a cached hardware header
timer: a timer_list entry used to time out ARP requests that get no response
retries: the number of times that this ARP request has been retried
sk_buff queue: a list of sk_buff entries waiting for this IP address to be resolved
The ARP table consists of a table of pointers (the arp_tables vector) to chains of arp_table entries. The entries are cached to speed up access to them: each entry is found by using the last two bytes of its IP address to generate an index into the table and then following the chain of entries until the correct one is found. Linux also caches prebuilt hardware headers for the arp_table entries in the form of hh_cache data structures.
When an IP address translation is requested and there is no corresponding arp_table entry, ARP must send an ARP request. It creates a new arp_table entry in the table and queues the sk_buff containing the network packet that needs the address translation on the entry's sk_buff queue. It then sends the ARP request and starts the ARP timer. If there is no response, ARP retries the request several times; if there is still no response, ARP removes the arp_table entry. Any sk_buffs queued waiting for the IP address to be resolved are notified, and it is up to the protocol layer that is transmitting them to handle the failure. UDP does not care about lost packets, but TCP will attempt to retransmit on an established TCP connection. If the owner of the IP address responds with its hardware address, the arp_table entry is marked as complete, and any queued sk_buffs are removed from the queue and go on to be transmitted. The hardware address is written into the hardware header of each sk_buff.
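The life cycle of an ARP entry described above (create, queue, retry, then complete or fail) can be modelled as follows. This is a simplified sketch: the table size, the retry limit, and the names ArpTable, resolve, reply_received and timer_expired are assumptions chosen for illustration, and the timer is driven by calling timer_expired by hand.

```python
import socket

ARP_TABLE_SIZE = 16  # assumed number of hash chains
MAX_RETRIES = 3      # assumed retry limit


class ArpEntry:
    def __init__(self, ip):
        self.ip = ip
        self.hw_addr = None  # filled in when the reply arrives
        self.complete = False
        self.retries = 0
        self.pending = []    # packets waiting for the address


class ArpTable:
    def __init__(self, send_request):
        self.chains = [[] for _ in range(ARP_TABLE_SIZE)]
        self.send_request = send_request  # callback that broadcasts a request

    def _chain(self, ip):
        # Index by the last two bytes of the IP address.
        last_two = int.from_bytes(socket.inet_aton(ip)[2:], "big")
        return self.chains[last_two % ARP_TABLE_SIZE]

    def _find(self, ip):
        return next((e for e in self._chain(ip) if e.ip == ip), None)

    def resolve(self, ip, packet):
        """Return the hardware address, or queue the packet and return None."""
        entry = self._find(ip)
        if entry is None:
            entry = ArpEntry(ip)
            self._chain(ip).append(entry)
            self.send_request(ip)
        if entry.complete:
            return entry.hw_addr
        entry.pending.append(packet)
        return None

    def reply_received(self, ip, hw_addr):
        """Mark the entry complete and release the queued packets."""
        entry = self._find(ip)
        if entry is None:
            return []
        entry.hw_addr, entry.complete = hw_addr, True
        released, entry.pending = entry.pending, []
        return released

    def timer_expired(self, ip):
        """Retry the request, or drop the entry and return the failed packets."""
        entry = self._find(ip)
        if entry is None or entry.complete:
            return []
        entry.retries += 1
        if entry.retries < MAX_RETRIES:
            self.send_request(ip)
            return []
        self._chain(ip).remove(entry)
        return entry.pending
```

The packets returned by timer_expired are the ones whose transmission has failed; what happens to them is left to the caller, just as it is left to the upper protocol layer in the text.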
The ARP protocol layer must also respond to ARP requests for its own IP address. It registers its protocol type (ETH_P_ARP), generating a packet_type data structure. This means that it is passed every ARP packet received by the network devices. As well as ARP replies, this includes ARP requests. It generates an ARP reply using the hardware address stored in the device data structure of the receiving device.
Network topologies can change over time, and IP addresses can be reassigned to different hardware addresses. For example, some dial-up services assign an IP address each time a connection is established. To keep the ARP table up to date, ARP runs a periodic timer that looks through all of the arp_table entries to see which have timed out. It is careful not to remove entries that contain one or more cached hardware headers. Removing these entries is dangerous because other data structures rely on them. Some arp_table entries are marked as permanent and are never deallocated. The ARP table cannot be allowed to grow too large, since each arp_table entry consumes some kernel memory. Whenever a new entry needs to be allocated and the ARP table has reached its maximum size, the table is pruned by finding and removing the oldest entries.
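The two housekeeping rules above, timing out stale entries and pruning the oldest ones when the table is full, might look like this in outline. The Entry class and both function names are hypothetical. The sketch also assumes that pruning spares permanent entries and entries with cached hardware headers in the same way that the periodic timer does, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    ip: str
    last_used: float
    permanent: bool = False
    cached_headers: int = 0  # number of cached hardware headers


def _protected(entry):
    """Permanent entries and entries with cached headers are never removed."""
    return entry.permanent or entry.cached_headers > 0


def expire(entries, now, timeout):
    """Drop entries unused for longer than timeout, sparing protected ones."""
    return [e for e in entries
            if _protected(e) or now - e.last_used <= timeout]


def make_room(entries, max_entries):
    """Evict the oldest removable entries until one more entry fits."""
    entries = list(entries)
    removable = sorted((e for e in entries if not _protected(e)),
                       key=lambda e: e.last_used)
    while len(entries) >= max_entries and removable:
        entries.remove(removable.pop(0))  # oldest first
    return entries
```

If every remaining entry is protected, make_room stops even though the table is still full, so a caller would have to handle that case separately.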
10.7 IP Routing
The IP routing function determines where to send IP packets destined for a particular IP address. There are many choices to make when transmitting IP packets. Can the destination be reached at all? If it can, which network device should be used? If more than one network device could be used, which one is better? The IP routing databases hold the information that answers these questions. There are two databases, the more important being the Forwarding Information Database. This is an exhaustive list of known IP destinations and their best routes. The second, the route cache, is used for quick lookups of routes to IP destinations. Like all caches, it contains only the frequently used routes; its contents are derived from the Forwarding Information Database.
Routes are added and deleted through IOCTL requests to the BSD socket interface. These requests are passed on to the protocol for processing. The INET protocol layer allows only processes with superuser privileges to add and delete IP routes. Routes can be fixed, or they can be dynamic and change over time. Most systems use fixed routes; routers, by contrast, run routing protocols that constantly check the availability of routes to all known IP destinations. Systems that are not routers are known as end systems. The routing protocols are implemented as daemons, such as GATED, which also add and delete routes through IOCTL requests to the BSD socket interface.
10.7.1 The Route Cache
Whenever an IP route is looked up, the route cache is checked first for a matching route. If there is no matching route in the route cache, the Forwarding Information Database is searched. If no route can be found there either, the IP packet cannot be sent and the application is notified. If a route is in the Forwarding Information Database but not in the route cache, a new entry is generated for it and added to the route cache. The route cache is a table (ip_rt_hash_table) that contains pointers to chains of rtable data structures. The index into the table is a hash function based on the least significant two bytes of the IP address. These are the two bytes most likely to differ between destinations, so they give the best spread of hash values. Each rtable entry contains information about the route: the destination IP address, the network device used to reach that address, the maximum message size that can be used, and so on. It also has a reference count, a usage count and a timestamp of the last time it was used (in jiffies). The reference count is incremented each time the route is used, to show the number of network connections using the route, and is decremented when an application stops using it. The usage count is incremented each time the route is looked up and is used to order the rtable entry within its hash chain. The last-used timestamp of every entry in the route cache is checked periodically to see whether the rtable is too old. If a route has not been used recently, it is discarded from the route cache. Because the routes in the route cache are kept in order, the most frequently used routes are at the front of the hash chains, which makes them quicker to find.
10.7.2 The Forwarding Information Database
Figure 10.5: The Forwarding Information Database
The Forwarding Information Database (shown in Figure 10.5) contains IP's view of the routes available to this system at the current time. It is quite a complicated data structure and, although it is arranged reasonably efficiently, it is not a quick database to consult. In particular, it would be very slow to look up the destination in this database for every IP packet transmitted. This is the reason the route cache exists: to speed up the transmission of IP packets by using known good routes. The route cache is derived from the Forwarding Information Database and represents its commonly used entries.
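The interplay between the fast route cache and the slower Forwarding Information Database can be sketched as below. The class names, the number of hash chains and the fib_lookup callback are assumptions made for this illustration; the callback stands in for a search of the Forwarding Information Database and returns a device name or None.

```python
import socket

HASH_SIZE = 256  # assumed number of hash chains


class RouteEntry:
    def __init__(self, dest, device):
        self.dest = dest
        self.device = device
        self.use_count = 0
        self.last_used = 0


class RouteCache:
    def __init__(self, fib_lookup):
        self.table = [[] for _ in range(HASH_SIZE)]
        self.fib_lookup = fib_lookup  # slow path: dest -> device or None

    def _chain(self, dest):
        # Hash on the least significant two bytes of the address.
        low_two = int.from_bytes(socket.inet_aton(dest)[2:], "big")
        return self.table[low_two % HASH_SIZE]

    def lookup(self, dest, now):
        """Return the route entry for dest, or None if there is no route."""
        chain = self._chain(dest)
        entry = next((e for e in chain if e.dest == dest), None)
        if entry is None:
            device = self.fib_lookup(dest)  # cache miss: consult the FIB
            if device is None:
                return None                 # no route: the send fails
            entry = RouteEntry(dest, device)
            chain.append(entry)
        entry.use_count += 1
        entry.last_used = now
        # Keep the most used routes at the front of the chain.
        chain.sort(key=lambda e: e.use_count, reverse=True)
        return entry

    def expire(self, now, max_age):
        """Discard routes that have not been used recently."""
        for chain in self.table:
            chain[:] = [e for e in chain if now - e.last_used <= max_age]
```

Sorting the whole chain on every lookup is a simplification that keeps the example short; it is not meant to suggest how the ordering is actually maintained.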
Each IP subnet is represented by a fib_zone data structure. All of these are pointed to from the fib_zones hash table. The hash index is derived from the IP subnet mask. All routes to the same subnet are described by pairs of fib_node and fib_info data structures, which are queued on the fz_list of each fib_zone structure. If the number of routes in a subnet grows large, a hash table is generated to make the fib_node structures easier to find.
Several routes may exist to the same IP subnet, and these routes may pass through one of several gateways. The IP routing layer does not allow more than one route to a subnet through the same gateway. In other words, if there are several routes to a subnet, each route is guaranteed to use a different gateway. Each route has an associated metric, which measures how advantageous the route is. A route's metric is essentially the number of IP subnets that must be crossed before the destination subnet is reached. The higher the metric, the worse the route.
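The rules in this section (routes grouped by subnet mask, at most one route per gateway for a subnet, and the lowest metric winning) can be illustrated with a small model. The Fib class below is hypothetical and uses plain dictionaries in place of the fib_zone, fib_node and fib_info structures. Searching the zones from the longest mask to the shortest is an assumption of this sketch.

```python
import ipaddress


class Fib:
    """Toy forwarding database: one zone per subnet mask length."""

    def __init__(self):
        # prefix length -> {network: {gateway: (metric, device)}}
        self.zones = {}

    def add_route(self, network, gateway, metric, device):
        net = ipaddress.ip_network(network)
        routes = self.zones.setdefault(net.prefixlen, {}).setdefault(net, {})
        # Only one route per gateway: adding again replaces the old route.
        routes[gateway] = (metric, device)

    def lookup(self, dest):
        """Return (gateway, metric, device) for the best route, or None."""
        addr = ipaddress.ip_address(dest)
        for prefixlen in sorted(self.zones, reverse=True):  # most specific first
            for net, routes in self.zones[prefixlen].items():
                if addr in net and routes:
                    gateway, (metric, device) = min(
                        routes.items(), key=lambda item: item[1][0])
                    return gateway, metric, device
        return None
```

For example, if two gateways both lead to 16.42.0.0/16 with metrics 3 and 1, a lookup of 16.42.0.9 returns the route with metric 1.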