First, CGI Overview
Servers: Xiao Zhi Yan
CGI (Common Gateway Interface) is the interface standard by which a web server calls other executable programs (CGI programs). The web server mediates between the CGI program and the web browser: the CGI program receives the information that the browser sent to the web server, processes it, and returns the result to the web server, which passes it on to the browser. CGI programs typically handle form data processing, database queries, and integration with traditional application systems. A CGI program can be written in any programming language, such as a shell script, Perl, Fortran, Pascal, or C. A CGI program written in C, however, runs fast and is relatively secure (because a C program is compiled and cannot be edited in place the way a script can).
The CGI interface standard has three parts: standard input, environment variables, and standard output.
1. Standard input. Like other executable programs, a CGI program can receive input from the web server through standard input (stdin), for example the data in a form. This is the POST method of passing data to a CGI program. It also means that the CGI program can be run and debugged from the operating system command line. POST is the commonly used method, and this article uses it as the example when analyzing the methods, process, and techniques of CGI programming.
2. Environment variables. The operating system provides many environment variables that define a program's execution environment, and applications can read them. The web server and the CGI interface define some environment variables of their own to pass important parameters to the CGI program. The GET method also passes form data to the CGI program through the environment variable QUERY_STRING.
3. Standard output. The CGI program sends its output to the web server through standard output (stdout).
The output sent to the web server can be in various formats, usually plain text or HTML, so we can also run the CGI program from the command line while debugging and inspect its output. Below is a simple CGI program that sends the data from an HTML form straight back to the web browser.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, n;

    printf("Content-Type: text/plain\n\n");
    n = 0;
    if (getenv("CONTENT_LENGTH"))
        n = atoi(getenv("CONTENT_LENGTH"));
    for (i = 0; i < n; i++)
        putchar(getchar());
    putchar('\n');
    fflush(stdout);
    return 0;
}
This program is briefly analyzed below.
printf("Content-Type: text/plain\n\n");
This line writes the string "Content-Type: text/plain\n\n" to the web server through standard output. It is MIME header information telling the web server that the output that follows is plain ASCII text. Note that the header ends with two newline characters, because the web server needs to see a blank line before the actual message body begins.
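The header rule (one header line, then a blank line) can be wrapped in a small helper. This is a minimal sketch, not part of the original program: the name write_content_type is an assumption made for illustration, and the function writes to any FILE * so that it can be tried without a web server.

```c
#include <stdio.h>

/* Hypothetical helper: writes a Content-Type header followed by the
 * blank line that must separate the header from the body.
 * Returns 0 on success, -1 if the write failed. */
int write_content_type(FILE *out, const char *mime_type)
{
    /* The first \n ends the header line; the second is the blank line. */
    if (fprintf(out, "Content-Type: %s\n\n", mime_type) < 0)
        return -1;
    return 0;
}
```

In a CGI program it would be called as write_content_type(stdout, "text/plain") before any body output.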
if (getenv("CONTENT_LENGTH"))
n = atoi(getenv("CONTENT_LENGTH"));
These lines first check whether the environment variable CONTENT_LENGTH exists. The web server sets this variable when it calls the CGI program with the POST method; its text value is the number of characters the web server will send to the CGI program's standard input. We therefore use atoi() to convert the value into an integer and assign it to the variable n. Note that the web server does not terminate its output with an end-of-file marker, so a CGI program that does not check CONTENT_LENGTH cannot know when the input has ended.
for (i = 0; i < n; i++)
putchar(getchar());
This loop runs from 0 to (CONTENT_LENGTH - 1), copying each character read from standard input to standard output. In other words, it sends all of the input back to the web server as ASCII text.
From this example, the general workflow of a CGI program can be summarized as follows.
1. Check the environment variable CONTENT_LENGTH to determine how much input there is;
2. Loop with getchar() or another file-reading function to read all of the input;
3. Process the input as required;
4. Use the "Content-Type:" header to tell the web server the format of the output;
5. Send the output to the web server with printf(), putchar(), or another file-writing function.
In short, the main task of a CGI program is to get input from the web server, process it, and send the output back to the web server.
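The five steps above can be sketched as one function. This is an illustrative rewrite under stated assumptions, not the article's own code: the name echo_body and its stream parameters are hypothetical, chosen so that the routine can be tried on ordinary files as well as on stdin and stdout. It also stops early if the input ends before the announced length, which the simple loop shown earlier does not check.

```c
#include <stdio.h>

/* Hypothetical helper: copies at most `length` bytes from `in` to `out`
 * after writing a plain-text header. `length` is the value the caller
 * obtained from CONTENT_LENGTH (step 1).
 * Returns the number of bytes actually copied. */
long echo_body(FILE *in, FILE *out, long length)
{
    long copied = 0;
    int ch;

    /* Step 4: announce the output format, then the mandatory blank line. */
    fputs("Content-Type: text/plain\n\n", out);

    /* Steps 2, 3 and 5: read each byte, "process" it (here: leave it
     * unchanged), and write it out. */
    while (copied < length && (ch = getc(in)) != EOF) {
        putc(ch, out);
        copied++;
    }
    putc('\n', out);
    fflush(out);
    return copied;
}
```

A real CGI program would call echo_body(stdin, stdout, n), with n taken from CONTENT_LENGTH.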
Second, environment variables
An environment variable is a text string (a name/value pair) that can be set by the OS shell or by another program and read by other programs. Environment variables are a simple way for the web server to pass data to a CGI program. They are called environment variables because they are global, so any program can access them. The environment variables most often used in CGI programming are listed below.
HTTP_REFERER: the URL of the web page that called the CGI program.
REMOTE_HOST: the machine name and domain name of the web browser that called the CGI program.
REQUEST_METHOD: the method the web server uses to pass data to the CGI program, either GET or POST. The GET method passes data to the CGI program only through environment variables such as QUERY_STRING, while the POST method passes data through both environment variables and standard input, so POST can more easily deliver larger amounts of data.
SCRIPT_NAME: the name of the CGI program.
QUERY_STRING: when the GET method is used, the form data ends up in QUERY_STRING and is passed to the CGI program that way.
CONTENT_TYPE: the MIME type of the data passed to the CGI program, usually "application/x-www-form-urlencoded". This is the encoding used for data sent to the CGI program from an HTML form, known as URL encoding.
CONTENT_LENGTH: the number of characters (bytes) of data passed to the CGI program.
In a C program, use the library function getenv() to read an environment variable. For example:
if (getenv("CONTENT_LENGTH"))
n = atoi(getenv("CONTENT_LENGTH"));
Note that it is best to call getenv() twice: the first call checks whether the environment variable exists, and the second uses its value. This is because getenv() returns a NULL pointer when the named environment variable does not exist. If you use the return value without checking it first, the CGI program can crash when the variable is missing.
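As a variation on the two-call pattern, getenv() can be called once and its result checked before use. The sketch below takes that approach and uses strtol() so that non-numeric or negative values are rejected as well. The helper names parse_length and content_length are assumptions for illustration and do not come from any CGI library.

```c
#include <stdlib.h>

/* Hypothetical helper: converts the text of CONTENT_LENGTH to a byte count.
 * Returns 0 when the text is missing, not a number, or negative. */
long parse_length(const char *text)
{
    char *end;
    long value;

    if (text == NULL)          /* variable not set: getenv() returned NULL */
        return 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || value < 0)
        return 0;              /* empty, trailing junk, or negative */
    return value;
}

/* Reads CONTENT_LENGTH from the environment with a single getenv() call. */
long content_length(void)
{
    return parse_length(getenv("CONTENT_LENGTH"));
}
```

Keeping the parsing separate from the environment lookup means the conversion rules can be checked with plain strings, without setting any environment variables.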
Third, analyzing and decoding the input
1. Parsing name/value pairs
When the user submits an HTML form, the web browser first encodes the form data and sends it to the web server, which then passes it to the CGI program. The format is as follows:
name1=value1&name2=value2&name3=value3&name4=value4&...
Here each name is the name of an INPUT, SELECT, or TEXTAREA element defined in the form, and each value is what the user entered or selected. This format is URL encoding, and the program must parse and decode it. To parse the data stream, the CGI program must first split it into name/value pairs. It does so by looking for two characters in the input stream: an = marks the end of a form variable name, and an & marks the end of a form variable value. Note that the value of the last variable in the input is not terminated by &. Once the name/value pairs have been split out, certain special characters in the input must be converted to the corresponding ASCII characters. These special characters are:
+: convert to a space;
%xx: a special character represented by its hexadecimal ASCII code. Convert it to the corresponding ASCII character according to the value xx.
This conversion applies to both form variable names and variable values. Below is a CGI program that parses the form data and sends the result back to the web server.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int htoi(char *);

int main(void)
{
    int i, n;
    char c;

    printf("Content-Type: text/plain\n\n");
    n = 0;
    if (getenv("CONTENT_LENGTH"))
        n = atoi(getenv("CONTENT_LENGTH"));
    for (i = 0; i < n; i++) {
        int is_eq = 0;
        c = getchar();
        switch (c) {
        case '&':
            c = '\n';
            break;
        case '+':
            c = ' ';
            break;
        case '%': {
            char s[3];
            s[0] = getchar();
            s[1] = getchar();
            s[2] = 0;
            c = htoi(s);
            i += 2;   /* two extra input characters were consumed */
        }
            break;
        case '=':
            c = ':';
            is_eq = 1;
            break;
        }
        putchar(c);
        if (is_eq) putchar(' ');
    }
    putchar('\n');
    fflush(stdout);
    return 0;
}

/* Convert a two-digit hex string to int.
   Assumes both characters are valid hexadecimal digits. */
int htoi(char *s)
{
    char *digits = "0123456789ABCDEF";

    if (islower((unsigned char)s[0])) s[0] = toupper((unsigned char)s[0]);
    if (islower((unsigned char)s[1])) s[1] = toupper((unsigned char)s[1]);
    return 16 * (strchr(digits, s[0]) - strchr(digits, '0'))
        + (strchr(digits, s[1]) - strchr(digits, '0'));
}
The program above first outputs a MIME header to the web server, then checks the number of characters in the input and examines each character in a loop. When the character is &, a name/value pair has ended, and the program starts a new line. When the character is +, it is converted to a space. When the character is %, a two-digit hexadecimal value follows, and htoi() is called to convert the next two characters to the corresponding ASCII character. When the character is =, the name part of a name/value pair has ended, and it is converted to the character : followed by a space. Finally, each converted character is output to the web server.
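The same decoding rules can also be applied to a whole string in memory rather than character by character on stdin. The following is a minimal sketch under that assumption. The names url_decode and hex_value are hypothetical, and unlike htoi() above, the sketch leaves a % that is not followed by two hexadecimal digits unchanged instead of assuming the input is well formed.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the value 0..15 of one hexadecimal digit (ASCII assumed).
 * The caller must have checked the character with isxdigit(). */
static int hex_value(char c)
{
    if (isdigit((unsigned char)c))
        return c - '0';
    return toupper((unsigned char)c) - 'A' + 10;
}

/* Hypothetical helper: decodes a URL-encoded string in place.
 * '+' becomes a space and %xx becomes the byte with hex code xx.
 * Returns the length of the decoded string. */
size_t url_decode(char *s)
{
    char *src = s;
    char *dst = s;

    while (*src != '\0') {
        if (*src == '+') {
            *dst++ = ' ';
            src++;
        } else if (*src == '%' &&
                   isxdigit((unsigned char)src[1]) &&
                   isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(16 * hex_value(src[1]) + hex_value(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;   /* ordinary character, or a malformed '%' */
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

When a helper like this is used, the input should be split on & and = before decoding, so that an encoded %26 or %3D inside a value is not mistaken for a separator.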
Fourth, generating HTML output
The output generated by a CGI program consists of two parts: the MIME header and the actual content. The two parts are separated by a blank line. We have already seen how to use the MIME header "Content-Type: text/plain\n\n" together with printf(), putchar(), and similar function calls to output plain ASCII text to the web server. We can likewise use the MIME header "Content-Type: text/html\n\n" to output HTML source to the web server. Note that a blank line must follow any MIME header. Once this MIME header has been sent to the web server, the web browser treats the text that follows as HTML source, and any HTML construct can be used in it, such as hyperlinks, images, forms, and calls to other CGI programs. In other words, a CGI program can generate HTML source dynamically. A simple example follows.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    printf("Content-Type: text/html\n\n");
    printf("<html>\n");
Printf ("
HEAD> / N");Printf ("
/ N ");
Printf ("
Printf ("