PROXY source code analysis


Proxy source code analysis: how to learn Linux network programming
Source: http://www.china-pub.com  Author: Li Peiyuan (2001-08-10 12:00:00)

Linux is a very reliable operating system, but anyone who has used it will feel that it is less easy to operate than a "foolproof" operating system such as Windows (no disparagement of Windows is intended here; on the contrary, this is one of its advantages). So why do so many people love Linux? Freedom is of course its most attractive point, and its powerful features are another very important reason, especially its powerful networking. WAP services, online banking and e-commerce, all of which are booming today, increasingly rely on Linux-based solutions. Linux network programming is therefore very important, and once we come into contact with it we find it very interesting, because things about network communication that used to be hazy suddenly become clear after reading a piece of code. Learning to program is always a little uncomfortable at first, but after reading a few programs we begin to feel the fun of it. Below I start from the source code of a proxy and talk about how to do Linux network programming.

First, this source code was not written by me. Let us thank the expert named Carl Harris, who wrote it and published it on the Internet for everyone to study and discuss. Although the code implements only the simplest proxy operation, it is a classic: it clearly illustrates the client/server concept and touches almost every aspect of Linux network programming, which makes it ideal for beginners.

The proxy program lets us log in to a service port on another host through the proxy. If compiling produces an executable named proxy, the command is ./proxy followed by three parameters: proxy_port, remote_host and service_port. The parameter proxy_port is the port of the proxy server that you specify.
The parameter remote_host is the host name of the remote host we want to connect to; an IP address is equally valid. The host name should be unique on the network; if you are not sure of it, run uname -n on the remote host to check. The parameter service_port is the name of a service that the remote host provides, or the port number corresponding to that service. The command binds the proxy_port port of the proxy server to the service_port port of remote_host, after which we can reach remote_host through the proxy_port port of the proxy server. For example, suppose a computer has the network host name legends and the IP address 10.10.8.221. If I run the following on my computer:

[root@lee /root]# ./proxy 8000 legends telnet

then the following command reaches the Telnet port of legends:

-----------------------------------------------------------------
[root@lee /root]# telnet legends 8000
Trying 10.10.8.221...
Connected to legends (10.10.8.221).
Escape character is '^]'.

Red Hat Linux release 6.2 (Zoot)
Kernel 2.2.14-5.0 on an i686
login:
-----------------------------------------------------------------

The binding above can also be set up with the following command:

[root@lee /root]# ./proxy 8000 10.10.8.221 23

23 is the standard port number of the Telnet service; the port numbers of other services can be looked up in /etc/services. Below I discuss my rough understanding of Linux network programming on the basis of this code; corrections are welcome.

◆ main() function
-----------------------------------------------------------------
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#define TCP_PROTO "tcp"

int proxy_port;               /* port to listen for proxy connections on */
struct sockaddr_in hostaddr;  /* host addr assembled from gethostbyname() */
extern int errno;             /* defined by libc.a */
extern char *sys_myerrlist[];

void parse_args(int argc, char **argv);
void daemonize(int servfd);
void do_proxy(int usersockfd);
void reap_status(void);
void errorout(char *msg);

/* this is my modification; I'll explain why it is needed later */
typedef void sigfunc(int);

/****************************************************************
function:     main
description:  Main level driver. After daemonizing the process,
              a socket is opened to listen for connections on
              the proxy port, connections are accepted and
              children are spawned to handle each new connection.
arguments:    argc, argv  you know what those are.
return value: none.
calls:        parse_args, do_proxy.
globals:      reads proxy_port.
****************************************************************/
main(argc, argv)
int argc;
char **argv;
{
    int clilen;
    int childpid;
    int sockfd, newsockfd;
    struct sockaddr_in servaddr, cliaddr;

    parse_args(argc, argv);

    /* prepare an address structure */
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = proxy_port;

    /* get a socket... */
    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        fputs("failed to create server socket\r\n", stderr);
        exit(1);
    }

    /* ...and bind our address and port to it */
    if (bind(sockfd, (struct sockaddr_in *) &servaddr, sizeof(servaddr)) < 0) {
        fputs("failed to bind server socket to specified port\r\n", stderr);
        exit(1);
    }

    /* get ready to accept with at most 5 clients waiting to connect */
    listen(sockfd, 5);

    /* turn ourselves into a daemon */
    daemonize(sockfd);

    /* fall into a loop to accept new connections and spawn children */
    while (1) {
        /* accept the next connection */
        clilen = sizeof(cliaddr);
        newsockfd = accept(sockfd, (struct sockaddr_in *) &cliaddr, &clilen);
        if (newsockfd < 0 && errno == EINTR)
            continue;    /* a signal might interrupt our accept() call */
        else if (newsockfd < 0)
            /* something quite amiss -- kill the server */
            errorout("failed to accept connection");

        /* fork a child to handle this connection */
        if ((childpid = fork()) == 0) {
            close(sockfd);
            do_proxy(newsockfd);
            exit(0);
        }

        /* if fork() failed, the connection is silently dropped -- oops! */
        close(newsockfd);
    }
}
-----------------------------------------------------------------
The above is the main program of the proxy source code. You may have seen this code elsewhere online, but you will find that two places in the listing above differ from it, both in the precompilation section.

The first is the declaration of an external array of character pointers: I changed extern char *sys_errlist[]; in the original code to extern char *sys_myerrlist[];. The reason is that in my Linux environment stdio.h already declares sys_errlist[] as follows: extern __const char *__const sys_errlist[]; Perhaps it was not declared there when Carl Harris wrote the code in 1994, but now, without this change, the compiler reports a conflicting declaration of sys_errlist. In addition, I added a function type definition: typedef void sigfunc(int); which I will explain later.

Sockets and socket address structures

The main program is a typical server program. The most important part of network communication is the use of sockets. The socket descriptors sockfd and newsockfd are defined at the beginning of the program, followed by the socket address structures servaddr and cliaddr, which store the communication information of the server and the client. The program then calls parse_args(argc, argv) to process the command-line parameters; parse_args() is introduced later.

Creating a communication socket

What follows is the detailed process of setting up the server. The first thing a server program does is create a socket, by calling socket(), which is declared as follows:
-----------------------------------------------------------------
#include
#include
int socket(int domain, int type, int protocol);
-----------------------------------------------------------------
The parameter domain specifies the protocol family used for communication: AF_INET means the TCP/IP protocol, AF_UNIX the UNIX-domain protocol, and AF_ISO the ISO protocol.
The parameter type specifies the socket type. For connection-oriented communication (such as TCP) it is SOCK_STREAM; for a datagram socket it is SOCK_DGRAM; and for a raw socket that accesses the IP protocol directly it is SOCK_RAW. The parameter protocol is usually set to 0, which selects the default protocol. On success, socket() returns a descriptor for the new socket; on error it returns -1 and sets errno to the corresponding error code.

Setting the server socket address structure

Normally we first clear the socket address structure that holds the server information and then fill in the appropriate values, ready to accept connection requests sent by clients. The clearing can be done with one of the byte-handling functions (the family includes bzero(), bcopy(), memset() and memcpy()). The two whose names begin with "b" come from BSD; the other two are provided by ANSI C. This code uses bzero(), declared as void bzero(void *s, int n); it clears the first n bytes of the memory pointed to by s. memset() is also common; it is declared as void *memset(void *s, int c, size_t n); and sets each of the first n bytes of the memory area pointed to by s to the value c.

The next step is to fill in the cleared server socket address structure. The Linux socket is a general-purpose network programming interface that must support many network communication protocols, and each protocol uses a socket address structure of its own (for TCP/IP it is struct sockaddr_in). To keep the parameters of the socket functions uniform, however, Linux also defines a generic socket address structure:
-----------------------------------------------------------------
struct sockaddr {
    unsigned short sa_family;   /* address type */
    char sa_data[14];           /* protocol address */
};
-----------------------------------------------------------------
sa_family is the address type of the protocol the socket uses; for our TCP/IP network its value is AF_INET. sa_data holds the protocol-specific address, whose format differs from protocol to protocol. Instances of this generic structure are rarely defined; it is mostly used as the target of casts from protocol-specific socket address structures.
For example, we often see usage such as: bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr))

The socket address structure for the TCP/IP protocol is sockaddr_in, defined as:
-----------------------------------------------------------------
struct in_addr {
    __u32 s_addr;
};

struct sockaddr_in {
    short int sin_family;
    unsigned short int sin_port;
    struct in_addr sin_addr;
    /* This part has not been taken into use yet */
    unsigned char __pad[__SOCK_SIZE__ - sizeof(short int) -
                        sizeof(unsigned short int) - sizeof(struct in_addr)];
};
#define sin_zero __pad
-----------------------------------------------------------------
The member sin_zero is not used; it exists only to pad sockaddr_in to the same size as the generic struct sockaddr. In programs it is normally zeroed with bzero() or memset(). The other members are usually set as follows:

servaddr.sin_family = AF_INET; says that the socket uses the TCP/IP protocol.

servaddr.sin_addr.s_addr = htonl(INADDR_ANY); sets the IP address of the server socket to the special value INADDR_ANY, meaning that the server is willing to accept client connections arriving on any network interface. htonl() converts a long integer from host byte order to network byte order.

servaddr.sin_port = htons(port); sets the port number, where port is a value we have defined. In this example the statement is servaddr.sin_port = proxy_port; because proxy_port is the global variable that parse_args() has already filled in, in network byte order.
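The clearing and filling steps described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper name make_listen_addr is hypothetical and does not appear in the proxy source, and it uses memset() where the proxy uses bzero().

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: build the address a server binds to, meaning
 * "any local interface" plus the given port.  port is passed in host
 * byte order; the structure stores it in network byte order. */
void make_listen_addr(struct sockaddr_in *addr, unsigned short port)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));             /* also zeroes sin_zero */
    addr->sin_family = AF_INET;                 /* TCP/IP */
    addr->sin_addr.s_addr = htonl(INADDR_ANY);  /* any interface */
    addr->sin_port = htons(port);               /* host to network order */
}
```

A caller would declare a struct sockaddr_in, pass its address together with a port such as 8000, and then hand the structure to bind() with a cast to struct sockaddr *.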

In addition, the precompilation section of this example does not include the two header files that define these structures directly, because they are already included by two of the headers that do appear there. Those two are platform-independent, so they are the ones normally used in network programs.

Publishing the server address

Before a server can accept connection requests from clients, it must make its address known on the network. Once the server's socket address structure has been filled in, this is done by calling bind(), which binds the server address to the socket. bind() is declared as:
-----------------------------------------------------------------
#include
#include
int bind(int sockfd, struct sockaddr *addr, int addrlen);
-----------------------------------------------------------------
The parameter sockfd is the socket descriptor created by socket(), addr is the local address, and addrlen is the length of the socket address structure. bind() returns 0 on success; otherwise it returns -1 and sets errno, for example to EADDRINUSE when the address is already in use.

If the server sets the socket's IP address to one particular local IP address before calling bind(), it accepts only connection requests that arrive at that address. In general, though, the IP address is set to INADDR_ANY so that connection requests arriving on every network interface are accepted.

A client does not normally call bind(), because it does not need to choose its own address and port when connecting: the system automatically selects an unused port number for it and fills the local IP address into the corresponding field of the client's socket address structure. In some specific cases, however, the client needs to use a specific port number.
For example, the rlogin command in Linux requires a reserved port number, which the system will not assign to a client automatically, so the client must call bind() to bind a reserved port itself. In some situations, however, binding a specific port number has negative effects. For instance, after an HTTP server has entered the TIME_WAIT state, it rejects the connection request of a client that tries to connect to it again from the same port. And if it is the client that ends up in TIME_WAIT, its next call to bind() fails immediately, because the system considers that two connections would be bound to the same port.

Converting to a listening socket

Next, the server must turn the socket that has just been bound to an IP address and port number into a listening socket. Only server programs need this step. It is done by calling listen(), declared as:
-----------------------------------------------------------------
#include
int listen(int sockfd, int backlog);
-----------------------------------------------------------------
The parameter sockfd is the descriptor of the socket to convert, and backlog sets the maximum length of the request queue. listen() does two main things.

The first is to convert the socket into a listening socket. A socket created by socket() is an active socket, which a client uses to connect to a server by calling connect(). The server's situation is exactly the opposite: it must receive clients' connection requests through the socket, which calls for a "passive" socket. listen() converts an active socket into such a passive, listening socket. After listen() has been executed, the server's TCP state changes from CLOSED to LISTEN.

The second is to set the maximum length of the connection request queue. The backlog parameter is simple to use, being just an integer, but understanding the request queue is important for understanding how TCP communicates. TCP actually maintains two queues for each listening socket. One is the incomplete connection queue, whose members have not yet completed the three-way handshake. The other is the completed connection queue, whose members have completed the three-way handshake but have not yet been accepted by the server with accept(). The backlog parameter specifies the maximum length of the completed connection queue of the listening socket. In this example, listen(sockfd, 5); sets that maximum length to 5.

Accepting connections

Next, the main program calls daemonize() to create a daemon; daemonize() and the concepts behind daemons are discussed later. The server then enters an unconditional loop in which it waits for and accepts clients' connection requests. When a client requests a connection by calling connect(), accept() takes a connection request from the completed connection queue of the listening socket. If that queue is empty, the process sleeps.
accept() is declared as:
-----------------------------------------------------------------
#include
int accept(int sockfd, struct sockaddr *addr, int *addrlen);
-----------------------------------------------------------------
The parameter sockfd is the descriptor of the listening socket, addr is a pointer to a socket address structure, and addrlen is a pointer to an integer. When accept() succeeds it delivers three results. The return value is a new socket descriptor through which the server communicates with the client. Information about the client is stored in the socket address structure pointed to by addr, and the integer pointed to by addrlen holds the length of that structure. A server is often not interested in the client information, so in some source code the last two arguments of accept() are set to NULL. This proxy does need the client information, however, so it executes newsockfd = accept(sockfd, (struct sockaddr_in *) &cliaddr, &clilen); which stores the client's details in the address structure cliaddr, and the proxy then communicates with the client through the socket newsockfd.

Note that the returned socket descriptor is different from the listening socket. A server program can use a single listening socket throughout to receive connection requests from many clients, but each actual connection with a client requires a new socket returned by accept().
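As a rough sketch of the socket(), bind(), listen() and accept() sequence described above, the following C fragment sets up a listening socket and accepts one client while recording its address. The helper names open_listener and accept_client are hypothetical and not part of the proxy source. The sketch also makes two assumptions of its own: it binds to the loopback address instead of INADDR_ANY so that it can be tried safely on one machine, and it uses socklen_t for the length argument, which is the type current systems declare for accept().

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: create a TCP listening socket on the loopback
 * interface.  Passing port 0 asks the kernel to choose a free port.
 * Returns the listening descriptor, or -1 on failure. */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept one connection and store the client's
 * address in *peer.  Returns the new connected descriptor, which is
 * distinct from listenfd, or -1 on failure. */
int accept_client(int listenfd, struct sockaddr_in *peer)
{
    socklen_t len = sizeof(*peer);

    assert(peer != NULL);
    return accept(listenfd, (struct sockaddr *) peer, &len);
}
```

A server would call open_listener() once and then call accept_client() in a loop, closing each returned descriptor when the client has been served.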

When the server has finished handling a client's request, it must close the corresponding socket; and when the whole server program ends, it must close the listening socket. If accept() fails it returns -1. If the process catches a signal while accept() is blocked waiting for a client to call connect(), accept() also returns -1 but sets errno to EINTR. This is not the same as a real failure of accept(), which is why the code contains these statements:
-----------------------------------------------------------------
if (newsockfd < 0 && errno == EINTR)
    continue;    /* a signal might interrupt our accept() call */
else if (newsockfd < 0)
    /* something quite amiss -- kill the server */
    errorout("failed to accept connection");
-----------------------------------------------------------------
The two cases are handled quite differently even though accept() returns -1 in both: if errno == EINTR the program calls accept() again to accept a connection request; otherwise the server process ends.

Handling client requests

Once the server has a connection to the client, it can handle the client's request. A server program normally creates a child process to handle each client request, while the parent process goes on listening, ready to accept connection requests from other clients. Our proxy is no exception: it creates the child process that handles the client by calling fork(). The importance of fork() in Linux/UNIX programming needs no emphasis. In a large server program the child process usually calls a different handler program for each kind of client request through the exec() family of functions, which is another important topic in Linux/UNIX programming.
Our proxy, however, aims only to illustrate some basic concepts of Linux network programming, so the child process simply calls do_proxy(), the function that performs the proxying, with the socket descriptor newsockfd returned by accept() as its argument. One more point deserves attention: because the child process inherits every file descriptor open in the parent, the child must close the listening socket (close(sockfd); in the child's part of the code) and the parent must close the socket descriptor returned by accept() (close(newsockfd); in the parent's part of the code).
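The EINTR retry and the descriptor housekeeping around fork() can be sketched as follows. This is an illustrative fragment, not the proxy's own code: the names accept_retry, spawn_handler and send_greeting are hypothetical, send_greeting merely stands in for do_proxy(), and the child ends with _exit() rather than exit(), which is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call accept() again whenever a signal
 * interrupted it; any other failure is returned to the caller as -1. */
int accept_retry(int listenfd)
{
    int fd;

    do {
        fd = accept(listenfd, NULL, NULL);
    } while (fd < 0 && errno == EINTR);
    return fd;
}

/* Hypothetical helper: fork a child that serves connfd with handler.
 * The child closes its copy of the listening socket (pass -1 if there
 * is none); the parent closes its copy of the connected socket.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_handler(int listenfd, int connfd, void (*handler)(int))
{
    pid_t pid;

    assert(handler != NULL);
    pid = fork();
    if (pid == 0) {               /* child */
        if (listenfd >= 0)
            close(listenfd);
        handler(connfd);
        close(connfd);
        _exit(0);
    }
    close(connfd);                /* parent, also when fork() failed */
    return pid;
}

/* Stand-in for do_proxy(): write one line to the client. */
void send_greeting(int fd)
{
    const char *msg = "hello\n";

    if (write(fd, msg, strlen(msg)) < 0)
        _exit(1);
}
```

In a server loop, the descriptor returned by accept_retry() would be passed straight to spawn_handler(), after which the parent goes back to waiting for the next connection.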

◆ parse_args() function

This function is declared as void parse_args(int argc, char **argv);
-----------------------------------------------------------------
/****************************************************************
function:     parse_args
description:  parse the command line args.
arguments:    argc, argv  you know what these are.
return value: none.
calls:        none.
globals:      writes proxy_port, writes hostaddr.
****************************************************************/
void parse_args(argc, argv)
int argc;
char **argv;
{
    int i;
    struct hostent *hostp;
    struct servent *servp;
    unsigned long inaddr;
    struct {
        char proxy_port[16];
        char isolated_host[64];
        char service_name[32];
    } pargs;

    if (argc < 4) {
        printf("usage: %s\r\n", argv[0]);
        exit(1);
    }

    strcpy(pargs.proxy_port, argv[1]);
    strcpy(pargs.isolated_host, argv[2]);
    strcpy(pargs.service_name, argv[3]);

    for (i = 0; i ...
        /* ... */
        bcopy(hostp->h_addr, &hostaddr.sin_addr, hostp->h_length);
    else {
        printf("%s: unknown host\r\n", pargs.isolated_host);
        exit(1);
    }

    if ((servp = getservbyname(pargs.service_name, TCP_PROTO)) != NULL)
        hostaddr.sin_port = servp->s_port;
    else if (atoi(pargs.service_name) > 0)
        hostaddr.sin_port = htons(atoi(pargs.service_name));
    else {
        printf("%s: invalid/unknown service name or port number\r\n",
               pargs.service_name);
        exit(1);
    }
}
-----------------------------------------------------------------
This function passes on the command-line parameters. They are passed through two global variables, int proxy_port and struct sockaddr_in hostaddr: proxy_port carries the proxy port on which connection requests are awaited, and hostaddr carries the network information of the host to be bound.

After the local variable definitions, the function first checks whether the command line meets the program's requirements, that is, whether the command is followed by the proxy server port, the remote host name and the service port number. If not, the proxy server program ends. If so, the three command-line parameters are stored in our own pargs structure. Note that the three members of pargs hold the parameter information as strings; we must call functions later to convert it into numeric form.

Passing the parameters

Next, the three command-line parameters must be assigned to the global variables proxy_port and hostaddr for the other functions to use. The proxy server port pargs.proxy_port is passed first; here the program calls the library function isdigit() to verify that the port number the user entered is valid.
isdigit() is declared as:
-----------------------------------------------------------------
#include
int isdigit(int c);
-----------------------------------------------------------------
isdigit() tests whether its argument c is one of the digits 0 to 9. If it is, the function returns a nonzero value; otherwise it returns 0. The program uses it to screen the user's input like this: if (!isdigit(*(pargs.proxy_port + i))) break;

Before the valid port number is passed to the global variable proxy_port, it must be converted to network byte order. This is because the many different kinds of machine on a network do not all store the bytes of a value in the same order. Suppose the 16-bit integer 0xFF11 is stored at memory address 0x1000. Some machines put FF at the starting address 0x1000 and 11 at 0x1001; this is called big-endian order. Others do the opposite, which is little-endian order. The storage order of a particular host is called its host byte order. So that hosts of different types can communicate, network protocols specify a single network byte order, which is big-endian.
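The byte-order discussion can be checked directly with a small C sketch. The helper names bytes_of and host_is_little_endian are hypothetical. The fragment reuses the 0xFF11 value from the text and relies on the fact that network byte order places the most significant byte first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: copy the two bytes of a 16-bit value, exactly
 * as they sit in memory, into out[0] (lower address) and out[1]. */
void bytes_of(uint16_t value, unsigned char out[2])
{
    assert(out != NULL);
    memcpy(out, &value, 2);
}

/* Hypothetical helper: returns 1 on a little-endian host and 0 on a
 * big-endian one, by looking at which byte of 0xFF11 comes first. */
int host_is_little_endian(void)
{
    unsigned char b[2];

    bytes_of(0xFF11, b);
    return b[0] == 0x11;
}
```

Whatever the host's own order, passing htons(0xFF11) to bytes_of() yields FF followed by 11, and ntohs() applied to that value gives back 0xFF11.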

The network byte order and the host byte order of a value may therefore differ, so a communication program must take care to convert between them. That is why the program contains the statement proxy_port = htons(atoi(pargs.proxy_port)); htons() converts a value from host byte order to network byte order. It is declared as:
-----------------------------------------------------------------
#include
unsigned short int htons(unsigned short int data);
-----------------------------------------------------------------
The functions similar to htons() are htonl(), ntohs() and ntohl(); all of them convert between network and host byte order. If the names are easy to confuse, remember that h stands for host, n for network, s for unsigned short and l for unsigned long. So "hton" means "host to network": it converts host byte order to network byte order. The "network to host" functions are applied to received data. In our routine a port number is a 16-bit value, so the unsigned short function htons() is the one to use.

Note that the argument of htons() in the routine is the return value of another function, atoi(), declared as:
-----------------------------------------------------------------
#include
int atoi(const char *nptr);
-----------------------------------------------------------------
It converts the string pointed to by nptr into the corresponding integer and returns it. The effect is almost exactly that of strtol(nptr, (char **) NULL, 10); the only difference is that atoi() does not report errors. The function is needed because the system reads every command-line parameter as a string, so we must convert it to integer form.
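Because atoi() cannot report errors, a program that wants to reject bad input can call strtol() directly. The following is a minimal sketch of that idea, not something the proxy source does: the function name parse_port and its convention of returning -1 on failure are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal TCP port number from s.
 * Returns the port (1..65535) in host byte order, or -1 if s is
 * empty, has trailing characters, or is out of range. */
int parse_port(const char *s)
{
    char *end;
    long v;

    assert(s != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 1 || v > 65535)
        return -1;
    return (int) v;
}
```

With this helper, parse_port("8000") gives 8000 while parse_port("80x") gives -1; atoi("80x"), by contrast, quietly returns 80. The caller would still apply htons() to the result before storing it in sin_port.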
Next, the routine clears all members of the global variable hostaddr and sets the member hostaddr.sin_family to AF_INET, the flag for the TCP/IP protocol. The other two command-line parameters, the remote host and the service, are passed to the members hostaddr.sin_addr and hostaddr.sin_port respectively. Two local variables, struct hostent *hostp and struct servent *servp, are used to carry the parameter information.

struct hostent is defined as:
-----------------------------------------------------------------
struct hostent {
    char *h_name;
    char **h_aliases;
    int h_addrtype;
    int h_length;
    char **h_addr_list;
};
#define h_addr h_addr_list[0]
-----------------------------------------------------------------
The members of hostent have the following meanings: h_name is the official name of the host on the network; h_aliases is a list of all the host's aliases; h_addrtype is the address type of the host, normally AF_INET for the TCP/IP protocol; h_length is the length of the host address, normally 4 bytes; and h_addr_list is a list of the host's IP addresses.

We use this structure to pass on the name or IP address of the remote host we want to bind. The host parameter of the command line is already stored in pargs.isolated_host, so we call inet_addr() on it to convert the host's IP address to binary form in network byte order. inet_addr() is declared as:
-----------------------------------------------------------------
#include
#include
#include
unsigned long int inet_addr(const char *cp);
-----------------------------------------------------------------
inet_addr() converts the Internet host address pointed to by cp from numbers-and-dots notation into binary form in network byte order, and returns the result of the conversion directly. If the IP address pointed to by cp is invalid, the function returns INADDR_NONE, that is, -1. Although Carl Harris used inet_addr() when he wrote this program, I suggest using another function, inet_aton(), in your own programs. The reason is that inet_addr() returns -1 for an invalid IP address, yet 255.255.255.255 is certainly a valid address and its binary value is also -1, so inet_addr() cannot handle that address.
inet_aton() reports errors in a better way. It is declared as:
-----------------------------------------------------------------
#include
#include
#include
int inet_aton(const char *cp, struct in_addr *inp);
-----------------------------------------------------------------
When the function succeeds it returns nonzero and stores the result of the conversion in the in_addr structure pointed to by inp, whose definition was given earlier in this article. If the IP address pointed to by cp is invalid, it returns 0. This avoids the problem with inet_addr().

If the user types the IP address of the remote host on the command line, inet_addr() alone completes the task. But what if the user types the host's domain name? That is why the routine contains these statements:
-----------------------------------------------------------------
if ((inaddr = inet_addr(pargs.isolated_host)) != INADDR_NONE)
    bcopy(&inaddr, &hostaddr.sin_addr, sizeof(inaddr));
else if ((hostp = gethostbyname(pargs.isolated_host)) != NULL)
    bcopy(hostp->h_addr, &hostaddr.sin_addr, hostp->h_length);
else {
    printf("%s: unknown host\r\n", pargs.isolated_host);
    exit(1);
}
-----------------------------------------------------------------
Here gethostbyname() converts the host's domain name. It is declared as:
-----------------------------------------------------------------
#include
struct hostent *gethostbyname(const char *hostname);
-----------------------------------------------------------------
The parameter hostname points to the domain name to convert. If the function succeeds it returns a pointer to a hostent structure holding the result; otherwise it returns the null pointer NULL. By calling inet_addr() and gethostbyname(), the routine passes the host domain name or host IP address given on the command line to the member sin_addr of the global variable hostaddr, for use by the proxying function do_proxy().

The service name or service port number is passed next.
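The "numeric address first, then name lookup" logic for the host parameter can be sketched as a standalone function. This is an illustration under assumptions rather than the proxy's code: the name resolve_host is hypothetical, and it uses inet_aton() and memcpy() where the proxy uses inet_addr() and bcopy().

```c
#define _DEFAULT_SOURCE   /* glibc: declare BSD extensions such as inet_aton() */
#include <assert.h>
#include <string.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: turn a dotted-decimal address or a host name
 * into a binary IPv4 address in network byte order.
 * Returns 0 on success, -1 if the host is unknown. */
int resolve_host(const char *name, struct in_addr *out)
{
    struct hostent *hp;

    assert(name != NULL && out != NULL);
    if (inet_aton(name, out) != 0)        /* already a numeric address */
        return 0;
    hp = gethostbyname(name);             /* otherwise look the name up */
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_length != (int) sizeof(*out))
        return -1;
    memcpy(out, hp->h_addr_list[0], sizeof(*out));  /* first address */
    return 0;
}
```

The result can be copied into the sin_addr member of a struct sockaddr_in, in the same way the routine fills hostaddr.sin_addr.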
Here the servent structure serves as the intermediary. The definition of struct servent is:
-----------------------------------------------------------------
struct servent {
    char  *s_name;
    char **s_aliases;
    int    s_port;
    char  *s_proto;
};
-----------------------------------------------------------------
Its members mean the following: s_name is the official name of the service, such as ftp or http; s_aliases is the list of aliases of the service; s_port is the port number of the service (for example, the FTP port is normally 21 and the HTTP port is 80), and note that this port number is stored in network byte order; s_proto is the name of the protocol the service uses.
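As a sketch of how the s_port member might be consumed, the following accepts either a decimal port number or a service name and produces a port in network byte order. resolve_port() is a hypothetical helper, not the routine's own code, and the "tcp" protocol name is an assumption that mirrors the routine's use of TCP.

```c
#include <ctype.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdlib.h>

/* Hypothetical helper: turns "80" or "http" into a port in network byte
 * order. Returns 0 on success, -1 if the text is neither a number nor a
 * known TCP service name. */
int resolve_port(const char *text, unsigned short *port_net)
{
    const char *p;
    struct servent *servp;

    /* Find the first character that is not a digit. */
    for (p = text; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            break;

    if (*text != '\0' && *p == '\0') {
        /* All digits: convert from host to network byte order. */
        *port_net = htons((unsigned short)atoi(text));
        return 0;
    }

    /* Otherwise look the name up in the services database. */
    servp = getservbyname(text, "tcp");
    if (servp == NULL)
        return -1;
    *port_net = (unsigned short)servp->s_port;  /* already network order */
    return 0;
}
```

Note the asymmetry: the numeric branch needs htons(), while s_port must be used as it is because the library already stores it in network byte order.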

The getservbyname() function converts the service name given on the command line. Its declaration is:
-----------------------------------------------------------------
#include <netdb.h>

struct servent *getservbyname(const char *servname, const char *protoname);
-----------------------------------------------------------------
It converts the service name pointed to by servname into the corresponding port number, expressed as an integer. The parameter protoname names the protocol used by the service; in the routine its value is TCP_PROTO, which means the TCP protocol. On success the function returns a pointer to a struct servent whose s_port member is the service port number we care about. If the user typed a port number rather than a service name, the following statement handles it:

hostaddr.sin_port = htons(atoi(pargs.service_name));

At this point the command-line parameters have been converted into the byte order and numeric types that network communication requires and stored in three global variables, waiting for the do_proxy() function to use them.

◆ The daemonize() function: creating a daemon

When introducing the main() function, I mentioned that a typical server program creates a daemon before it starts accepting client connection requests. The daemon is a very important concept in Linux/UNIX programming, so discussing it in detail is very helpful for beginners.

Here is the daemonize() function in the routine:
-----------------------------------------------------------------
/****************************************************************************
function:     daemonize
description:  detach the server process from the current context,
              creating a pristine, predictable environment in which it
              will execute.
arguments:    servfd  file descriptor in use by server.
return value: none.
calls:        none.
globals:      none.
****************************************************************************/

void daemonize(servfd)
int servfd;
{
    int childpid, fd, fdtablesize;

    /* ignore terminal I/O, stop signals */
    signal(SIGTTOU, SIG_IGN);
    signal(SIGTTIN, SIG_IGN);
    signal(SIGTSTP, SIG_IGN);

    /* fork to put us in the background (whether or not the user
       specified '&' on the command line) */
    if ((childpid = fork()) < 0) {
        fputs("failed to fork first child\r\n", stderr);
        exit(1);
    }
    else if (childpid > 0)
        exit(0);    /* terminate parent, continue in child */

    /* dissociate from process group */
    if (setpgrp(0, getpid()) < 0) {
        fputs("failed to become process group leader\r\n", stderr);
        exit(1);
    }

    /* lose controlling terminal */
    if ((fd = open("/dev/tty", O_RDWR)) >= 0) {
        ioctl(fd, TIOCNOTTY, NULL);
        close(fd);
    }

    /* close any open file descriptors */
    for (fd = 0, fdtablesize = getdtablesize(); fd

}
-----------------------------------------------------------------
The role of this function is to create a daemon. On a Linux system, turning an ordinary process into a daemon requires the following steps:

1. Call fork() to create a child process, then terminate the parent process and let the child continue to run. The parent is terminated because, when a process is started in the foreground from the shell, the child automatically becomes a background process once the parent terminates. In addition, the next step creates a new session, which requires that the process creating the session is not the leader of a process group. Terminating the parent and continuing in the child guarantees that the process group ID is not equal to the process ID of the child. The declaration of fork() is:
-----------------------------------------------------------------
#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);
-----------------------------------------------------------------
This function is called once but returns twice. The difference between the two returns is that the return value in the child process is "0", while the return value in the parent process is the process ID of the child. On error it returns "-1".

2. Ensure that the process cannot acquire a controlling terminal. The usual approach is to call setsid() to create a new session.
The declaration of setsid() is:
-----------------------------------------------------------------
#include <sys/types.h>
#include <unistd.h>

pid_t setsid(void);
-----------------------------------------------------------------
Step 1 has guaranteed that the calling process is not a process group leader, so this function creates a new session, with these results. First, the process becomes the session leader of the new session (the session leader is the process that creates the session), and it is the only process in that session. Second, the process becomes the leader of a new process group whose process group ID is the process ID of this process. Finally, the process has no controlling terminal; even if it had one before, that association is broken once the session is created. If the calling process is already a process group leader, the function returns "-1" to indicate an error. Of course, there are other ways to keep the process from having a controlling terminal, as the routine does:
-----------------------------------------------------------------
if ((fd = open("/dev/tty", O_RDWR)) >= 0) {
    ioctl(fd, TIOCNOTTY, NULL);
    close(fd);
}
-----------------------------------------------------------------
/dev/tty is a stream device that maps to our controlling terminal. The ioctl() call with TIOCNOTTY gives up the controlling terminal, and close() then closes the descriptor.

3. Signal handling. Typically some signals are ignored. This involves the concept of signals.

A signal is in effect a software interrupt, and the signal mechanism under Linux/UNIX provides a way to handle asynchronous events. When the terminal user types an interrupt key, or the system detects an abnormal condition, the signal mechanism is what terminates one or more running programs. Different situations raise different signals. Every signal has its own name, all names begin with "SIG", and the rest differs, so the name tells us what happened in the system. When a signal occurs, we can ask the system to do one of three things:

◇ Ignore the signal. Most signals can be handled this way, and we see it used in the routine. Note the two exceptions: the SIGKILL and SIGSTOP signals cannot be ignored.

◇ Catch the signal. This is the most flexible option. It means that when a given signal occurs, a function of ours is called to deal with it. The most common case is catching SIGCHLD, which indicates that a child process has terminated; the handler can then call waitpid() to obtain the process ID of the child and its termination status. Our routine contains an example of this usage. Likewise, if the process creates temporary files, a handler for the termination signal SIGTERM can remove them.

◇ Perform the system default action. For most signals the default action is to terminate the process.

There are many kinds of signals under Linux and I will not introduce them all here. To learn about them in detail, look at the header file <signal.h>, where the signals are defined as positive integers, that is, their signal numbers.
Handling signals requires the function signal(), whose declaration is:
-----------------------------------------------------------------
#include <signal.h>

void (*signal(int signo, void (*func)(int)))(int);
-----------------------------------------------------------------
The parameter signo is the signal name. The value of the parameter func is one of the following, depending on what we need: (1) the constant SIG_DFL, meaning perform the system default action; (2) the constant SIG_IGN, meaning ignore the signal; (3) the address of the handler to call when the signal is received. The handler takes one integer parameter and has no return value. signal() itself returns a pointer to a function with no return value (void); it points to the previous handler for the signal.

Now back to our daemonize() function. It ignores three signals when creating the daemon:

signal(SIGTTOU, SIG_IGN);
signal(SIGTTIN, SIG_IGN);
signal(SIGTSTP, SIG_IGN);

These three signals mean the following: SIGTTOU indicates that a background process tried to write to the controlling terminal, SIGTTIN indicates that a background process tried to read from the controlling terminal, and SIGTSTP is the terminal suspend signal.

4. Close the file descriptors that are no longer needed, and open new file descriptors for standard input, standard output and standard error (inheriting the parent's standard input, standard output and standard error descriptors is also possible; this operation is optional).

In our routine, daemonize() runs after listen(), so the successfully created listening socket must be kept open. That is why we see this statement:

if (fd != servfd) close(fd);

5. Call chdir("/") to change the current working directory to the root directory. This ensures that our process does not keep any directory in use. Otherwise the daemon would occupy some directory for its whole lifetime, which could prevent the superuser from unmounting a file system.

6. Call umask(0) to set the file mode creation mask to "0". The inherited file mode creation mask may deny certain permissions. For example, suppose our daemon needs to create a file that is group-readable and group-writable, and the mask inherited from the parent process happens to block those two permissions; the group read and write permissions of the newly created file would then not take effect. So the mask is set to "0".

At the end of the daemonize() function we see this signal-catching statement:

signal(SIGCLD, (sigfunc *) reap_status);

This is not one of the steps for creating a daemon. Its role is to have our custom reap_status() function deal with zombie processes.
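Steps 1, 2, 5 and 6 can be sketched with setsid() as follows. This is not the routine's own code, which uses setpgrp() and TIOCNOTTY instead. The names detach_from_session() and child_detaches_ok() are made up for illustration, and the parent here waits for the child only so that the result can be checked; a real daemon's parent would simply exit.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Steps 2, 5 and 6, meant to run in a child that is not a process
 * group leader. Returns 0 on success, -1 on failure. */
int detach_from_session(void)
{
    if (setsid() < 0)       /* new session, no controlling terminal */
        return -1;
    if (chdir("/") < 0)     /* do not pin any mounted file system */
        return -1;
    umask(0);               /* clear the inherited creation mask */
    return 0;
}

/* Hypothetical self-check: fork a child (step 1), detach it, and let it
 * report through its exit status whether it became a session leader.
 * Returns 1 if it did, 0 if not, -1 on error. */
int child_detaches_ok(void)
{
    int status;
    pid_t pid = fork();     /* the new child is never a group leader */

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = detach_from_session() == 0 && getsid(0) == getpid();
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling setsid() directly in a process that is already a group leader would fail, which is why the fork in step 1 comes first.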
reap_status() is defined in the routine as:
-----------------------------------------------------------------
/****************************************************************************
function:     reap_status
description:  handle a SIGCLD signal by ...
****************************************************************************/

void reap_status()
{
    int pid;
    union wait status;

    while ((pid = wait3(&status, WNOHANG, NULL)) > 0)
        ;  /* loop while there are more dead children */
}
-----------------------------------------------------------------
In the original source the signal-catching statement reads:

signal(SIGCLD, reap_status);

As noted above, the second parameter of signal() must be a function that takes one integer parameter and has no return value. reap_status() takes no parameters, so the original statement did not compile. I therefore added a type definition for sigfunc in the preprocessor section and use it here to cast reap_status. Also, BSD systems usually use the SIGCHLD signal to report child termination, whereas SIGCLD is the signal name defined by System V. If the disposition of SIGCLD is set to "catch", the kernel immediately checks whether any child process has already terminated and is waiting to be reaped; if so, the signal handler is called at once. Calling wait(), waitpid(), wait3() or wait4() in the signal handler returns the termination status of the child.

The difference between these "wait" functions is as follows. When the child being waited for has not yet terminated, wait() blocks its caller, whereas waitpid() has an option that keeps the caller from blocking. wait() does not let us say which child to wait for; it waits for whichever child terminates first. waitpid() takes the process ID of the child to wait for as a parameter. wait3() and wait4() take one more parameter than wait() and waitpid(): "rusage". reap_status() in the routine calls wait3(), which BSD systems support. We list its declaration together with that of wait4():
-----------------------------------------------------------------
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <sys/resource.h>

pid_t wait3(int *statloc, int options, struct rusage *rusage);
pid_t wait4(pid_t pid, int *statloc, int options, struct rusage *rusage);
-----------------------------------------------------------------
If the pointer statloc is not "NULL", it points to where the termination status of the child is returned. The parameter pid is the process ID of the child we are waiting for. The parameter options selects the behavior and is usually WNOHANG or WUNTRACED. The routine uses WNOHANG, which means that if the termination status of a child is not immediately available (for example, because the child has not finished), the wait function does not block and returns "0". WUNTRACED means that, if the system supports job control, the status of a child that has stopped and whose status has not been reported since it stopped is returned as well.
If the parameter rusage is not "NULL", it points to where the kernel returns a resource summary for the terminated process, including the total user CPU time, the number of page faults and the number of signals received.

◆ The proxy service function do_proxy()

At the end of the routine's main() function we saw that, after the server accepts a client's connection request, it creates a child process and runs the proxy service function do_proxy() in the child.

-----------------------------------------------------------------
/****************************************************************************
function:     do_proxy
description:  does the actual work of virtually connecting a client to
              the telnet service on the isolated host.
arguments:    usersockfd  socket to which the client is connected.
return value: none.
calls:        none.
globals:      reads hostaddr.
****************************************************************************/

void do_proxy(usersockfd)
int usersockfd;
{
    int isosockfd;
    fd_set rdfdset;
    int connstat;
    int iolen;
    char buf[2048];

    /* open a socket to connect to the isolated host */
    if ((isosockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        errorout("failed to create socket to host");

    /* attempt a connection */
    connstat = connect(isosockfd, (struct sockaddr *) &hostaddr, sizeof(hostaddr));
    switch (connstat) {
    case 0:
        break;
    case ETIMEDOUT:
    case ECONNREFUSED:
    case ENETUNREACH:
        strcpy(buf, sys_myerrlist[errno]);
        strcat(buf, "\r\n");
        write(usersockfd, buf, strlen(buf));
        close(usersockfd);
        exit(1);    /* die peacefully if we can't establish a connection */
        break;
    default:
        errorout("failed to connect to host");
    }

    /* now we're connected, serve fall into the data echo loop */
    while (1) {
        /* select for readability on either of our two sockets */
        FD_ZERO(&rdfdset);
        FD_SET(usersockfd, &rdfdset);
        FD_SET(isosockfd, &rdfdset);
        if (select(FD_SETSIZE, &rdfdset, NULL, NULL, NULL) < 0)
            errorout("select failed");

        /* is the client sending data? */
        if (FD_ISSET(usersockfd, &rdfdset)) {
            if ((iolen = read(usersockfd, buf, sizeof(buf))) <= 0)
                break;    /* zero length means the client disconnected */
            write(isosockfd, buf, iolen);    /* copy to host - blocking semantics */
        }

        /* is the host sending data? */
        if (FD_ISSET(isosockfd, &rdfdset)) {
            if ((iolen = read(isosockfd, buf, sizeof(buf))) <= 0)
                break;    /* zero length means the host disconnected */
            write(usersockfd, buf, iolen);    /* copy to client - blocking semantics */
        }
    }

    /* we're done with the sockets */
    close(isosockfd);
    close(usersockfd);
}
-----------------------------------------------------------------
In our proxy server routine, the part that really connects the user host and the remote host is done by this do_proxy() function. Recall the usage of this proxy program introduced earlier: first we bind our proxy to the remote host, then the user connects to the remote host through the proxy's bound port. In main() the proxy has already established a connection with the user host. In do_proxy() the proxy establishes a connection with the corresponding service port of the remote host (specified by the user on the command line) and takes charge of passing the data exchanged between the user host and the remote host. Because it has to connect to the remote host, the first half of do_proxy() is in effect a standard client program: it first creates a new socket descriptor isosockfd, then calls connect() to establish a connection with the remote host.
The declaration of connect() is:
-----------------------------------------------------------------
#include <sys/types.h>
#include <sys/socket.h>

int connect(int sockfd, struct sockaddr *servaddr, int addrlen);
-----------------------------------------------------------------
The parameter sockfd is the socket descriptor returned by socket(), servaddr points to the socket address structure of the remote server, and addrlen gives the length of that structure. connect() returns "0" on success; on failure it returns "-1" and sets the global variable errno to the corresponding error type. The switch statement in the routine handles three error types: ETIMEDOUT, ECONNREFUSED and ENETUNREACH. ETIMEDOUT means the connection attempt timed out; there are many possible causes, the most common being that the server is too busy to answer the client's connection request. ECONNREFUSED means the connection was refused, that is, the server has no listening socket ready, or is not monitoring its listening socket. ENETUNREACH means the network is unreachable.
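Since connect() signals failure with "-1" and puts the reason in errno, one way to apply the description above is to test errno after a failed call. The sketch below is an illustration under that reading, not the routine's code; is_reportable_connect_error() and connect_checked() are hypothetical names.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 for the three errors the routine reports back to the user,
 * 0 for anything else. */
int is_reportable_connect_error(int err)
{
    switch (err) {
    case ETIMEDOUT:
    case ECONNREFUSED:
    case ENETUNREACH:
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical wrapper: returns 0 on success. On failure it returns -1
 * and, for the three errors above, stores a "\r\n"-terminated error
 * text in msg; for other errors msg is left empty. */
int connect_checked(int sockfd, const struct sockaddr *addr, socklen_t len,
                    char *msg, size_t msglen)
{
    int err;

    if (msglen > 0)
        msg[0] = '\0';
    if (connect(sockfd, addr, len) == 0)
        return 0;

    err = errno;                /* connect() returned -1; errno says why */
    if (msglen > 0 && is_reportable_connect_error(err))
        snprintf(msg, msglen, "%s\r\n", strerror(err));
    return -1;
}
```

A caller in the style of do_proxy() could write msg to the user's socket when it is non-empty and treat every other failure as fatal.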

In this routine, the second parameter servaddr of connect() is the global variable hostaddr, which holds the values that parse_args() converted from the command-line parameters. If establishing the connection fails, the routine calls our custom function errorout() to output the message "failed to connect to host". errorout() is defined as:
-----------------------------------------------------------------
/****************************************************************************
function:     errorout
description:  displays an error message on the console and kills the
              current process.
arguments:    msg - message to be displayed.
return value: none - does not return.
calls:        none.
globals:      none.
****************************************************************************/

void errorout(msg)
char *msg;
{
    FILE *console;

    console = fopen("/dev/console", "a");
    fprintf(console, "proxyd: %s\r\n", msg);
    fclose(console);
    exit(1);
}
-----------------------------------------------------------------
The second half of do_proxy() carries the communication between the user host and the remote host through the proxy. We have both the connection between proxy and user host (the do_proxy() parameter usersockfd) and the connection between proxy and remote host (isosockfd), so the simplest and most direct way to pass data is to read from one socket and write straight to the other, like this:
-----------------------------------------------------------------
int n;
char buf[2048];

while ((n = read(usersockfd, buf, sizeof(buf))) > 0)
    if (write(isosockfd, buf, n) != n)
        err_sys("write error\n");
-----------------------------------------------------------------
This form of blocking I/O works very well when data flows in one direction only. In our proxy, however, the user host and the remote host communicate in both directions, so both socket descriptors must be read as well as written. With blocking I/O the process could easily stay blocked on one descriptor for a long time. The routine therefore calls select(), which gives us I/O multiplexing. Concretely, select() builds a table of all the file descriptors we are interested in and then watches their state; it returns when one of the descriptors we specified is ready for I/O and tells the process which descriptor that is. This avoids long blocking.

There is also a function poll() that implements I/O multiplexing. Since the routine calls select(), we only describe select() in detail. The declarations of select() and its companion macros are:
-----------------------------------------------------------------
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

FD_CLR(int fd, fd_set *set);
FD_ISSET(int fd, fd_set *set);
FD_SET(int fd, fd_set *set);
FD_ZERO(fd_set *set);
-----------------------------------------------------------------
select() builds a table of the file descriptors we care about. Its parameters tell the kernel which conditions interest us for these descriptors, for example whether they are readable, whether they are writable and whether an exception is pending, and how long at most we are willing to wait. When select() succeeds it returns the number of descriptors that are currently ready, and the kernel tells us the state of each descriptor. If the wait times out it returns "0"; on error it returns "-1" and sets errno accordingly. The last parameter of select(), timeout, sets the waiting time. The structure timeval is declared in the system header files as:
-----------------------------------------------------------------
struct timeval {
    __time_t tv_sec;   /* seconds */
    __time_t tv_usec;  /* microseconds */
};
-----------------------------------------------------------------
There are three cases for the timeout parameter. In the routine, timeout == NULL, which means wait indefinitely until one of the descriptors we specified is ready or a signal is caught. If the return is caused by a caught signal, select() returns "-1" and errno is set to EINTR.
If timeout->tv_sec == 0 && timeout->tv_usec == 0, select() does not wait at all: it tests all the specified descriptors and returns immediately. This is a polling approach that obtains the state of several descriptors without blocking in select(). If timeout->tv_sec != 0 || timeout->tv_usec != 0, the two members give the time we want the function to wait, with tv_sec in seconds and tv_usec in microseconds. If none of the descriptors we specified is ready when the time expires, select() returns "0".
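The timeout cases can be tried out with a small helper like the one below. wait_readable() is a hypothetical name, and watching a single descriptor is a simplification chosen for the example; passing 0 for both time values gives the polling behavior described above.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to sec seconds plus usec microseconds
 * for fd to become readable. Returns 1 if it is readable, 0 on timeout
 * and -1 on error, exactly as select() reports it for one descriptor. */
int wait_readable(int fd, long sec, long usec)
{
    fd_set rdfdset;
    struct timeval timeout;

    FD_ZERO(&rdfdset);          /* start from an empty set */
    FD_SET(fd, &rdfdset);       /* watch only this descriptor */
    timeout.tv_sec = sec;
    timeout.tv_usec = usec;

    /* Only descriptors 0..fd need to be scanned, hence fd + 1. */
    return select(fd + 1, &rdfdset, NULL, NULL, &timeout);
}
```

For example, on the read end of an empty pipe wait_readable(fd, 0, 0) returns 0 at once, and after a byte is written to the pipe it returns 1.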

The three middle parameters have the data type fd_set, which denotes a set of file descriptors. readfds, writefds and exceptfds are pointers to descriptor sets, describing the descriptors we want to watch for readability, writability and exceptional conditions respectively. When we say that select() builds a "table" of file descriptors, the table consists of the data structures these three parameters point to. The structure is shown in Figure 1. Each fd_set keeps one bit for every file descriptor we care about, so monitoring the state of a descriptor means querying its bit in these fd_set structures. The first parameter n says how many descriptor bits have to be scanned. It is usually set to the largest descriptor we care about plus 1. For example, if the largest of our descriptors is 6, we set n to 7, and the system checks only the first 7 bits (fd0 to fd6). If we do not want the trouble, we can set n directly to FD_SETSIZE as the routine does. This is the maximum number of descriptors an fd_set can hold; it differs between systems and is typically 256 or 1024. All descriptors are then scanned when the state is checked. To multiplex I/O with select(), we first declare a new descriptor set, as the routine does:

fd_set rdfdset;

Then we call FD_ZERO() to clear all bits in the set, so that checking the descriptors does not return wrong results:

FD_ZERO(&rdfdset);

Then we call FD_SET() to set the bits of the descriptors we care about in the set.
In this routine we care about the two socket descriptors connected to the user host and the remote host, so we do this:

FD_SET(usersockfd, &rdfdset);
FD_SET(isosockfd, &rdfdset);

Then we call select(), which returns the state of the descriptors in the descriptor sets themselves, that is, in the fd_set data structures. In Figure 1 all descriptor bits are "0". After select() returns, if for example fd0 is readable, the bit for fd0 in the readfds set is set to "1"; if fd1 is writable, the corresponding bit in the writefds set is set to "1"; exceptional conditions are reported the same way. In this routine we only care whether the two socket descriptors are readable, so we call select() like this:

select(FD_SETSIZE, &rdfdset, NULL, NULL, NULL);

How do we check the state of a bit in an fd_set after select() returns? We call FD_ISSET(): if the corresponding descriptor is "ready" (its bit is "1"), FD_ISSET() returns a non-zero value; otherwise it returns "0".
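Putting FD_ZERO(), FD_SET(), select() and FD_ISSET() together, a single pass of a do_proxy()-style loop might be sketched as follows. relay_once() and copy_chunk() are hypothetical helpers, not the routine's code, and the sketch passes the largest descriptor plus 1 as the first argument instead of FD_SETSIZE.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies one chunk from 'from' to 'to'. Returns the number of bytes
 * copied, 0 on end of file, -1 on error or short write. */
static ssize_t copy_chunk(int from, int to)
{
    char buf[2048];
    ssize_t n = read(from, buf, sizeof(buf));

    if (n <= 0)
        return n;
    return write(to, buf, (size_t)n) == n ? n : -1;
}

/* Hypothetical single pass: block until either descriptor is readable,
 * then forward whatever arrived to the other side. Returns 1 to keep
 * going, 0 when a peer closed its end or an error occurred. */
int relay_once(int userfd, int isofd)
{
    fd_set rdfdset;
    int maxfd = userfd > isofd ? userfd : isofd;

    FD_ZERO(&rdfdset);
    FD_SET(userfd, &rdfdset);
    FD_SET(isofd, &rdfdset);
    if (select(maxfd + 1, &rdfdset, NULL, NULL, NULL) < 0)
        return 0;

    if (FD_ISSET(userfd, &rdfdset) && copy_chunk(userfd, isofd) <= 0)
        return 0;
    if (FD_ISSET(isofd, &rdfdset) && copy_chunk(isofd, userfd) <= 0)
        return 0;
    return 1;
}
```

A caller would loop with while (relay_once(usersockfd, isosockfd)) and close both sockets afterwards. The set has to be rebuilt on every pass because select() overwrites it with the result.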
