Socket programming foundation under Linux



Author: Northeastern University Software Technology R & D Center in sums King evaluation: the station Date: 2002-05-22 Description: Source:

1. Introduction

The rise of Linux can fairly be called a miracle created by the Internet. Linux is free software whose source code is completely open. It is a multitasking operating system with a complex kernel, and it is compatible with a variety of UNIX standards (POSIX, UNIX System V, BSD UNIX, and others). In China, as the Internet has become popular, a community of Linux enthusiasts has grown up, made up mainly of university students and ISP technicians, and more and more programmers have come to love this excellent free software. This article describes the basic concepts and function calls of socket programming under Linux.

2. What is a socket

A socket is a way of communicating with other programs through standard UNIX file descriptors. A socket is described by a half-association, {protocol, local address, local port}. A complete connection is described by a full association, {protocol, local address, local port, remote address, remote port}. Each socket has a local socket number that is allocated by the operating system.

3. The three types of socket

(1) Stream socket (SOCK_STREAM): provides a reliable, connection-oriented communication stream. It uses the TCP protocol, which guarantees that data arrives correctly and in order.
(2) Datagram socket (SOCK_DGRAM): defines a connectionless service. Data is transmitted as independent messages, with no guarantee of ordering, reliability, or freedom from errors. It uses the datagram protocol UDP.
(3) Raw socket (SOCK_RAW): allows direct access to lower-level protocols such as IP or ICMP. Raw sockets are powerful but less convenient to use, and serve mainly for developing protocols.

4. Sending data with sockets

(1) For a stream socket, use the send() system call to send data.
(2) For a datagram socket, you first need to add a message header yourself, and then call the sendto() function to send the data.
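To make the datagram case concrete, here is a minimal sketch; it is not code from the original article. The function name udp_loopback_send is hypothetical, and the sketch assumes the loopback interface (127.0.0.1) is usable. It relies on socket(), bind(), and the address structures described later in this article, and sends one message between two UDP sockets with sendto(), which takes the destination address together with the data.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends msg from one UDP socket to another over
 * the loopback interface. Returns the number of bytes received into
 * buf, or -1 on error. */
ssize_t udp_loopback_send(const char *msg, char *buf, size_t buflen)
{
    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* receiving socket */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* sending socket */
    ssize_t n = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (rx < 0 || tx < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                  /* let the system pick a free port */

    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    /* Find out which port the system actually assigned. */
    if (getsockname(rx, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* No connection is set up: the destination travels with each call. */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    n = recvfrom(rx, buf, buflen, 0, NULL, NULL);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Calling udp_loopback_send("hello", buf, sizeof(buf)) is expected to return 5 and leave the five bytes of the message in buf. A stream socket would call send() on an already connected descriptor instead and would not pass an address.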
5. Socket data structures

(1) struct sockaddr, used to store a socket address:

    struct sockaddr {
        unsigned short sa_family;   // address family
        char sa_data[14];           // 14 bytes of protocol address
    };

(2) struct sockaddr_in ("in" stands for Internet):

    struct sockaddr_in {
        short int sin_family;          // Internet address family
        unsigned short int sin_port;   // port number, must be in network byte order
        struct in_addr sin_addr;       // Internet address, must be in network byte order
        unsigned char sin_zero[8];     // zero padding, so the size matches struct sockaddr
    };

(3) struct in_addr:

    struct in_addr {
        unsigned long s_addr;   // 32-bit IP address
    };

s_addr holds a 32-bit value. Current Linux headers declare it as in_addr_t (a 32-bit unsigned integer) rather than unsigned long.

6. Network byte order and its conversion functions

(1) Network byte order

Different machines store the bytes of a multi-byte variable in different orders, whereas data transmitted over the network must use one agreed order, which is big-endian (most significant byte first). When a machine's internal byte order differs from network byte order, the data must therefore be converted. For the sake of portability, the conversion functions should be called before sending data even if the machine's internal byte order happens to be the same as network byte order, so that the program still works correctly after being ported to another machine. Whether any real conversion takes place is decided by the system's functions.
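The point about always calling the conversion functions can be illustrated with a short sketch. The helper names host_is_little_endian and network_bytes are hypothetical; htonl() is one of the standard conversion functions this section goes on to list. Whatever the host's own byte order, the value produced by htonl() is laid out in memory most significant byte first.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this host stores the least
 * significant byte of an integer first, 0 otherwise. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Hypothetical helper: copies the in-memory bytes of htonl(value) to out. */
void network_bytes(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value); /* a no-op on big-endian hosts */

    memcpy(out, &net, sizeof(net));
}
```

On any host, network_bytes(0x01020304, out) fills out with the bytes 01 02 03 04 in that order, which is why code that always calls the conversion functions stays correct when it is ported.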

(2) Conversion functions

* unsigned short int htons(unsigned short int hostshort): converts from host byte order to network byte order; operates on a 2-byte (16-bit) unsigned short value.
* unsigned long int htonl(unsigned long int hostlong): converts from host byte order to network byte order; operates on a 4-byte (32-bit) unsigned long value.
* unsigned short int ntohs(unsigned short int netshort): converts from network byte order to host byte order; operates on a 2-byte (16-bit) unsigned short value.
* unsigned long int ntohl(unsigned long int netlong): converts from network byte order to host byte order; operates on a 4-byte (32-bit) unsigned long value.

Note: the prototypes of the above functions are defined in netinet/in.h.

7. IP address conversion

The following functions convert between an IP address string in dotted-decimal notation and a 32-bit IP address in network byte order.

(1) unsigned long int inet_addr(const char *cp): converts an IP address string in dotted-decimal notation into an unsigned long, for example:

    struct sockaddr_in ina;
    ina.sin_addr.s_addr = inet_addr("202.206.17.101");

On success the function returns the conversion result. On failure it returns the constant INADDR_NONE, whose value is -1. As an unsigned binary integer, -1 is the same as 255.255.255.255, which is the broadcast address, so a program that calls inet_addr() must handle the failure case explicitly. Because this function cannot handle the broadcast address, programs should use inet_aton() instead.

(2) int inet_aton(const char *cp, struct in_addr *inp): converts an IP address string into binary form. It returns 1 on success and 0 otherwise. The converted IP address is stored in the parameter inp.

(3) char *inet_ntoa(struct in_addr in): converts a 32-bit binary IP address into an IP address string in dotted-decimal notation. The result is the function's return value, a pointer to the string.

8. Byte manipulation functions

A socket address is multi-byte data that is not terminated by a null character, which makes it different from a C string. Linux provides two groups of functions for handling multi-byte data: one group, whose names start with b (byte), is compatible with BSD systems; the other, whose names start with mem (memory), is provided by ANSI C.

The functions starting with b are:

(1) void bzero(void *s, int n): sets the first n bytes of the memory pointed to by s to 0. It is usually used to clear a socket address.

(2) void bcopy(const void *src, void *dest, int n): copies n bytes from the memory area pointed to by src to the memory area pointed to by dest.
(3) int bcmp(const void *s1, const void *s2, int n): compares the first n bytes of the memory areas pointed to by s1 and s2; returns 0 if they are identical and non-zero otherwise.

Note: the prototypes of the above functions are defined in strings.h.

The functions starting with mem are:

(1) void *memset(void *s, int c, size_t n): sets the first n bytes of the memory area pointed to by s to the value of c.
(2) void *memcpy(void *dest, const void *src, size_t n): does the same job as bcopy(), with one difference: bcopy() can handle the case where the areas pointed to by src and dest overlap, and memcpy() cannot. Note also that the order of the source and destination arguments is reversed.
(3) int memcmp(const void *s1, const void *s2, size_t n): compares the first n bytes of the areas pointed to by s1 and s2; returns 0 if they are identical and non-zero otherwise.

Note: the prototypes of the above functions are defined in string.h.

9. Basic socket functions

(1) The function socket()

    #include <sys/types.h>
    #include <sys/socket.h>
    int socket(int domain, int type, int protocol);

The parameter domain specifies the protocol family of the socket to be created. It can take values such as:

    AF_UNIX   // UNIX domain protocol family
    AF_INET   // Internet protocol family

The parameter type specifies the socket type. It can take the following values:

    SOCK_STREAM   // stream socket: connection-oriented, reliable communication
    SOCK_DGRAM    // datagram socket: connectionless, unreliable communication
    SOCK_RAW      // raw socket: valid only for the Internet protocols, gives direct access to the IP protocol

The parameter protocol is usually set to 0, which selects the default protocol: for the Internet protocol family, a stream socket uses TCP and a datagram socket uses UDP. When the socket is a raw socket, protocol must be specified, because raw sockets are valid for several protocols, such as ICMP and IGMP.

Creating a socket on Linux mainly involves the following: the system creates a socket data structure in the kernel and returns a socket descriptor that identifies this structure. The socket data structure holds the information about the connection, such as the peer address, the TCP state, and the send and receive buffers. TCP controls the connection according to the contents of this structure.
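The pieces described so far (clearing an address with memset(), converting the port with htons(), converting the IP string with inet_aton(), and creating the socket) can be put together roughly as follows. This is only a sketch: the helper names make_inet_addr and open_stream_socket are hypothetical, and AF_INET with SOCK_STREAM is assumed because the rest of the article deals with TCP.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fills *addr from a dotted-decimal IP string and
 * a port given in host byte order. Returns 0 on success, -1 if the
 * string is not a valid IP address. */
int make_inet_addr(struct sockaddr_in *addr, const char *ip,
                   unsigned short port)
{
    memset(addr, 0, sizeof(*addr));      /* also clears sin_zero */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);        /* port must be in network byte order */

    /* inet_aton() returns 0 on failure, so 255.255.255.255 is not
     * mistaken for an error as it would be with inet_addr(). */
    if (inet_aton(ip, &addr->sin_addr) == 0)
        return -1;
    return 0;
}

/* Hypothetical helper: creates a TCP socket. Returns the descriptor,
 * or -1 on failure. */
int open_stream_socket(void)
{
    /* protocol 0 selects the default for a stream socket, i.e. TCP */
    return socket(AF_INET, SOCK_STREAM, 0);
}
```

The descriptor returned by open_stream_socket() is an ordinary file descriptor, so it should eventually be released with close().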

(2) The function connect()

    #include <sys/types.h>
    #include <sys/socket.h>
    int connect(int sockfd, struct sockaddr *servaddr, int addrlen);

The parameter sockfd is the socket descriptor returned by the function socket. The parameter servaddr specifies the socket address of the remote server, including the server's IP address and port number. The parameter addrlen specifies the length of the socket address. The function returns 0 on success; otherwise it returns -1 and sets the global variable errno to one of the following error types: ETIMEDOUT, ECONNREFUSED, EHOSTUNREACH, or ENETUNREACH.

Before calling connect, the client must specify the socket address of the server process. The client generally does not need to specify its own socket address (IP address and port number): the system automatically picks an unused port number and fills the socket address with that port number and the local IP address. The source gives the range for this choice as 1024 to 5000; on current Linux kernels the range is configurable through net.ipv4.ip_local_port_range.

The client calls connect to establish a connection actively. The function starts TCP's three-way handshake and returns once the connection is established or an error occurs. The errors that can occur while connecting are:

(1) If the client's TCP receives no acknowledgment of its SYN segment, the function returns with the error ETIMEDOUT. Typically TCP retransmits the SYN segment several times, and the function returns the error only after all of the transmissions have timed out.

Note: SYN (synchronize) bit: requests a connection. TCP uses this segment to establish a connection with the peer TCP. In the segment, TCP tells the peer the initial sequence number it has chosen and negotiates the maximum segment size with it. The sequence number of the SYN segment is the initial sequence number, and it can be acknowledged. When TCP receives the acknowledgment of this segment, the TCP connection is established.

(2) If the remote TCP returns an RST segment, the function returns immediately with the error ECONNREFUSED. When no server process is waiting on the destination port named in the SYN segment, the remote machine's TCP sends an RST segment to report this error to the client. Once the client's TCP receives the RST segment it stops sending SYN segments, and the function returns immediately.

Note: RST (reset) bit: requests that the connection be reset. When TCP receives a segment that it cannot process, it sends an RST segment to the peer TCP, indicating that the connection identified by the segment is in error and asking the peer to clear the connection. Three situations can cause TCP to send an RST segment:
a. a SYN segment names a destination port on which no server process is waiting;
b. TCP wants to abort an existing connection;
c. TCP receives a segment, but the connection it identifies does not exist.
A TCP that receives an RST segment tears the connection down immediately, without the normal close sequence, and reports an error to the application.

(3) If the client's SYN segment causes a router to generate an ICMP message of the "destination unreachable" type, the function returns with the error EHOSTUNREACH or ENETUNREACH. Usually TCP records the ICMP message when it arrives and then goes on sending the SYN segment several more times. After all of the transmissions have failed, TCP checks the recorded ICMP message and the function returns with the error.

Note: ICMP: Internet Control Message Protocol. The operation of the Internet is controlled mainly by its routers, which send and receive IP packets. If an error occurs while a packet is being sent, the router uses ICMP to report it. An ICMP message is encapsulated in the data portion of an IP packet and has the following format:

    0        8        16       24     31
    +--------+--------+-----------------+
    |  Type  |  Code  |    Checksum     |
    +--------+--------+-----------------+
    |               Data                |
    +-----------------------------------+

Type: indicates the type of the ICMP message.
Code: gives further information about the ICMP message.
Checksum: a checksum over the contents of the entire ICMP message.

ICMP messages have the following types:
(1) Destination unreachable: a. the destination host is not running; b. the destination address does not exist; c. the routing table has no entry for the destination, so the router cannot find a route to the host.
(2) Time exceeded: the router decrements the time-to-live (TTL) field of each IP packet it receives by 1. If the field becomes 0, the router discards the packet and sends this ICMP message.
(3) Parameter problem: sent when a field in the IP packet is invalid.
(4) Redirect: notifies the host of a new route.
(5) Echo request, echo reply: these two messages test whether a host is reachable. The requester sends an echo request to the destination host, and the destination host answers with an echo reply.
(6) Timestamp request, timestamp reply: ICMP uses these two messages to obtain the current time of another machine's clock.

During the call to connect, when the client's TCP sends the SYN segment, the TCP state changes from CLOSED to SYN_SENT. After the acknowledgment of the SYN segment is received, the TCP state changes to ESTABLISHED and the function returns successfully. If the call to connect fails, the socket descriptor should be closed with close; it cannot be used to call connect again.
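The error handling described for connect, including the rule that a descriptor is closed rather than reused after a failure, might look like the sketch below. The names connect_to and connect_error_text are hypothetical, and the short messages are this example's own wording for the four errno values discussed above.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket and connects it to *addr.
 * Returns the connected descriptor, or -1 with errno set by the
 * failing call. */
int connect_to(const struct sockaddr_in *addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0) {
        int saved = errno;   /* close() may change errno */

        close(fd);           /* a failed descriptor is not reused for connect */
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: maps the connect errors discussed above to a
 * short description. */
const char *connect_error_text(int err)
{
    switch (err) {
    case ETIMEDOUT:    return "no acknowledgment of the SYN segment";
    case ECONNREFUSED: return "RST received: no server on that port";
    case EHOSTUNREACH: return "ICMP report: host unreachable";
    case ENETUNREACH:  return "ICMP report: network unreachable";
    default:           return strerror(err);
    }
}
```

A caller that wants to try again after a failure simply calls connect_to() again, which creates a fresh socket each time.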

Note: TCP state transition diagram. [Figure not recoverable from the source. Its arc labels, in the form event (action), were: passive open (create TCB); active open (create TCB, send SYN); send (send SYN); receive SYN (send SYN, ACK); receive ACK of SYN (no action); receive SYN, ACK (send ACK); close (send FIN); receive FIN (send ACK); receive ACK of FIN (no action); 2MSL timeout (delete TCB); close (delete TCB).]

(3) The function bind()

The function bind binds a local address to a socket. It is defined as follows:

    #include <sys/types.h>
    #include <sys/socket.h>
    int bind(int sockfd, struct sockaddr *myaddr, int addrlen);

The parameter sockfd is the socket descriptor returned by the function socket; the parameter myaddr is the local address; the parameter addrlen is the length of the socket address structure. The function returns 0 on success; otherwise it returns -1 and sets the global variable errno to the error type EADDRINUSE.

Both servers and clients can call bind to bind a socket address, but usually it is the server that calls bind, to bind its own well-known port number. The binding generally takes one of the following forms:

Table 1
    Program type   IP address         Port number      Description
    Server         INADDR_ANY         non-zero value   specifies the server's well-known port number
    Server         local IP address   non-zero value   specifies the server's IP address and well-known port number
    Client         INADDR_ANY         non-zero value   specifies the client's connection port number
    Client         local IP address   non-zero value   specifies the client's IP address and connection port number
    Client         local IP address   zero             specifies the client's IP address

These are explained in turn below:

(1) The server specifies the well-known port number of the socket address but not the IP address: when it calls bind, the server sets the socket's IP address to the special value INADDR_ANY, indicating that it is willing to accept client connections arriving on any network interface. This is the most common way for a server to bind.
(2) The server specifies both the well-known port number and the IP address of the socket address: if the socket's IP address is set to a local IP address when the server calls bind, the machine accepts only client connections that arrive on the specific network interface holding that IP address. When the server has several network cards, this can be used to restrict where the server accepts connections from.
(3) The client specifies the connection port number of the socket address: normally the client does not specify the port number of its own socket address when it calls connect. The system automatically picks an unused port number and fills in the local IP address. Sometimes, however, the client needs to use a specific port number (for example a reserved port), and the system never assigns a reserved port to a client automatically, so the client must call bind to bind an unused reserved port.

(4) The client specifies both its IP address and its connection port number: the client communicates using the specified network interface and port number.
(5) The client specifies only its IP address: the client communicates using the specified network interface, and the system automatically picks an unused port number. This is generally used only when the host has several network interfaces.

We generally do not use a fixed port number on the client unless it is unavoidable. Using a fixed port number on the client has the following drawbacks:

(1) The server performs the active close: the server ends up in the TIME_WAIT state. When the client connects to this server again it still uses the same client port number, so the new connection has exactly the same address pair as the previous one. Because the previous connection is still in TIME_WAIT and has not yet disappeared, the connection request is rejected and connect returns with the error ECONNREFUSED.
(2) The client performs the active close: the client ends up in the TIME_WAIT state. When the client program is run again it tries to bind the same fixed port number, but because the previous connection is still in TIME_WAIT and has not yet disappeared, the system finds the port number still in use. The bind therefore fails and returns the error EADDRINUSE.

(4) The function listen()

The function listen turns a socket into a listening socket. It is defined as follows:

    #include <sys/socket.h>
    int listen(int sockfd, int backlog);

The parameter sockfd specifies the socket descriptor to be converted; the parameter backlog sets the maximum length of the request queue. The function returns 0 on success and -1 otherwise.

The function listen does two things:
(1) It converts an unconnected active socket (a socket created by the function socket can be used to initiate connections but not to accept connection requests) into a passive socket. After listen has executed, the server's TCP state changes from CLOSED to LISTEN.
(2) TCP queues the connection requests that arrive, and the second parameter of listen specifies the maximum length of this queue.

Note: the role of the parameter backlog. TCP maintains two queues for each listening socket:
(1) Incomplete connection queue: each TCP connection that has not yet completed the three-way handshake has an entry in this queue. When TCP receives a client's SYN segment, it creates a new entry in this queue, sends an acknowledgment of the client's SYN together with its own SYN (a SYN, ACK segment), and waits for the client to acknowledge its SYN. At this point the socket is in the SYN_RCVD state. The entry stays in this queue until the client returns the acknowledgment or the connection times out.
(2) Completed connection queue: each TCP connection that has completed the three-way handshake but has not yet been accepted by the application (through a call to accept) has an entry in this queue. When a connection in the incomplete connection queue receives the acknowledgment of its SYN segment, the three-way handshake is complete and TCP moves the entry from the incomplete connection queue to the completed connection queue. At this point the socket is in the ESTABLISHED state. The entry stays in this queue until the application calls accept to receive it.
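The most common server binding from Table 1 (INADDR_ANY plus a port number), followed by the call to listen, can be sketched as follows. The helper name open_listener is hypothetical. Passing port 0 asks the system to choose a free port, which is convenient for experiments; a real server would pass its well-known port.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket, binds it to INADDR_ANY
 * and the given port (host byte order), and turns it into a listening
 * socket. Returns the descriptor, or -1 with errno set (for example
 * EADDRINUSE when the port is already taken). */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* accept on any interface */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        int saved = errno;   /* close() may change errno */

        close(fd);
        errno = saved;
        return -1;
    }
    return fd;               /* the socket is now in the LISTEN state */
}
```

Calling open_listener() a second time with a port that an existing listener already holds is expected to fail with EADDRINUSE, the bind error mentioned above.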

The parameter backlog specifies the maximum length of the completed connection queue of a listening socket, that is, the maximum number of not yet accepted connections that the socket can hold. If the completed connection queue of the listening socket is full when a client's SYN segment arrives, TCP ignores the SYN segment. For a SYN segment that cannot be accepted, TCP does not send an RST segment; it simply drops the SYN, which the client's TCP will retransmit.

(5) The function accept()

The function accept takes an established TCP connection from the completed connection queue of the listening socket. If the completed connection queue is empty, the process sleeps.

    #include <sys/socket.h>
    int accept(int sockfd, struct sockaddr *addr, int *addrlen);

The parameter sockfd specifies the listening socket descriptor; the parameter addr is a pointer to an Internet socket address structure; the parameter addrlen is a pointer to an integer variable (current headers declare it as socklen_t *). On success the function returns three results: the return value is a new socket descriptor that identifies the accepted connection; the client's address is stored in the structure pointed to by addr; and the length of the client's address is stored in the integer variable pointed to by addrlen. On failure the function returns -1.

The listening socket is dedicated to receiving client connection requests, so TCP cannot use the listening socket descriptor to identify the connection. Instead, TCP creates a new socket to identify the accepted connection and returns its descriptor to the application. There are now two sockets: the listening socket passed to accept, and the connected socket returned by accept. A server usually needs to create only one listening socket. It uses that socket for the whole lifetime of the server process to receive all client connection requests, and closes it before the server process terminates. For each accepted connection, TCP creates a new connected socket to identify it. The server uses this connected socket to communicate with the client, and closes it when it has finished handling that client's request.

If the process catches a signal while accept is blocked, the function returns with the error EINTR. The usual way to handle this error is to call accept again.

(6) The function close()

The function close closes a socket descriptor. It is defined as follows:

    #include <unistd.h>
    int close(int sockfd);

The function returns 0 on success and -1 otherwise. Like close on a file descriptor, the function decrements the reference count of the socket descriptor. If the reference count is still greater than 0, other processes still reference the descriptor and close simply returns. If it is 0, the cleanup of the socket begins, and close also returns immediately. After calling close, the process can no longer access the socket, but TCP continues to use it: it delivers any data not yet sent to the peer, then sends a FIN segment and carries out the close sequence. Only when the TCP connection has been completely closed does TCP delete the socket.

(7) The functions read() and write()

These functions are used to read data from a socket and to write data to a socket.
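The server-side sequence described above (accept a connection, retry on EINTR, exchange data with read and write, then close only the connected socket) can be sketched as follows. The names accept_retry and echo_once are hypothetical, and echoing the data back is just a stand-in for whatever request handling a real server performs.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: accepts one connection from listen_fd, calling
 * accept() again if it was interrupted by a signal (EINTR). Returns
 * the connected socket, or -1 on any other error. */
int accept_retry(int listen_fd, struct sockaddr_in *peer)
{
    for (;;) {
        socklen_t len = sizeof(*peer);   /* reset before every call */
        int fd = accept(listen_fd, (struct sockaddr *)peer, &len);

        if (fd >= 0 || errno != EINTR)
            return fd;
    }
}

/* Hypothetical helper: serves a single client by reading one chunk of
 * data and writing it straight back. Returns the number of bytes
 * echoed, 0 if the client closed first, or -1 on error. */
ssize_t echo_once(int listen_fd)
{
    struct sockaddr_in peer;
    char buf[256];
    ssize_t n;
    int conn = accept_retry(listen_fd, &peer);

    if (conn < 0)
        return -1;

    n = read(conn, buf, sizeof(buf));
    if (n > 0 && write(conn, buf, (size_t)n) != n)
        n = -1;

    close(conn);   /* closes the connected socket; listen_fd stays open */
    return n;
}
```

A server would typically call echo_once() in a loop on the same listening socket, which matches the pattern of one long-lived listening socket and one short-lived connected socket per client.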

