Develop a proxy server with Java


Yu Liangsong, source author

Proxy servers have a wide range of uses. In a corporate network, for example, one can be used to control the Internet content that employees browse at work, blocking certain types of content or specific sites. A proxy server acts as an intermediary between the browser and the web server. It can process traffic on the browser's behalf in many ways: it can filter out advertisements and cookies, or prefetch web pages so that the browser loads them faster, and so on.

First, basic knowledge

However a proxy server is used, the way it monitors HTTP traffic always follows the same steps:

Step 1: The internal browser sends a request to the proxy server. The first line of the request contains the target URL.

Step 2: The proxy server reads the URL and forwards the request to the appropriate target server.

Step 3: The proxy server receives the response from the target machine on the Internet and forwards it to the appropriate internal browser.

For example, suppose an employee tries to access the www.cn.ibm.com website. Without a proxy server, the employee's browser opens a socket directly to the web server running that site, and the data returned by the web server is passed straight back to the browser. If the browser is configured to use a proxy server, the request goes to the proxy server first; the proxy server then extracts the target URL from the first line of the request and opens a socket to www.cn.ibm.com. When www.cn.ibm.com returns a response, the proxy server forwards it to the employee's browser.

Of course, proxy servers are not only useful in corporate environments. For a developer, having a proxy server of one's own is very handy. For example, we can use a proxy server to analyze the interaction between the browser and the web server, which is useful when testing web applications and tracking down problems in them. We can even use several proxy servers at the same time (most proxy servers allow multiple servers to be chained together). For example, we can have a corporate proxy server plus a proxy server written in Java for debugging applications. Note, however, that every server in the proxy chain has some impact on performance.

Second, design planning

As its name suggests, a proxy server is just a special kind of server. Like most servers, it should use threads if it is to handle multiple requests. The basic plan for a proxy server is as follows:

Wait for a request from a client (the web browser).

Start a new thread to handle the client's connection request.

Read the first line of the browser request (this line contains the target URL).

Parse the first line of the request to get the name and port of the target server.

Open a socket to the target server (or to the next proxy server, as appropriate).

Send the first line of the request to the outbound socket.

Send the remainder of the request to the outbound socket.

Send the data returned by the target web server to the requesting browser.
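The parsing step in the plan above can be tried out in isolation. The following is only a minimal sketch, assuming the request line carries an absolute URL (which is what a browser sends when it talks to a proxy). The class name RequestLineTarget and its fields are hypothetical and are not part of the article's code.

```java
// Hypothetical helper for illustration; not part of the article's HttpProxy class.
final class RequestLineTarget {
    final String host;
    final int port;

    private RequestLineTarget(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Extracts the target host and port from a request line such as
    // "GET http://www.cn.ibm.com:8080/index.html HTTP/1.0".
    static RequestLineTarget parse(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("No URL in request line: " + requestLine);
        }
        String host = parts[1];
        int port = 80;                                 // default HTTP port
        int n = host.indexOf("//");
        if (n != -1) host = host.substring(n + 2);     // drop the scheme ("http://")
        n = host.indexOf('/');
        if (n != -1) host = host.substring(0, n);      // drop the path
        n = host.indexOf(':');
        if (n != -1) {                                 // an explicit port was given
            port = Integer.parseInt(host.substring(n + 1));
            host = host.substring(0, n);
        }
        return new RequestLineTarget(host, port);
    }

    public static void main(String[] args) {
        RequestLineTarget t = parse("GET http://www.cn.ibm.com/ HTTP/1.0");
        System.out.println(t.host + ":" + t.port);
    }
}
```

The sketch splits a complete line for clarity, whereas a proxy that must forward bytes as they arrive may prefer to consume the request line character by character.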

Of course, the details make things more complicated. There are two main concerns here. First, reading data from the socket line by line is best for further processing, but it creates a performance bottleneck. Second, the connection between the two sockets must be efficient. There are several ways to meet these two goals, but each has its price. For example, if you want to filter data as it comes in, the data is best read line by line; for efficiency, however, it is better to forward data as soon as it reaches the proxy server. Data could also be sent and received on several separate threads, but creating and destroying large numbers of threads brings performance problems of its own. Therefore, for each request we use a single thread to handle both receiving and sending, and we forward data as quickly as possible once it reaches the proxy server.

Third, an example

Reusability is an important consideration when writing this proxy server in Java, because it lets us conveniently reuse the proxy server when another project needs to handle browser requests in a different way. Of course, we must strike a balance between flexibility and efficiency.

Figure 1 shows the output of this proxy server example (HttpProxy.java): when the browser accesses http://www-900.ibm.com/cn/, the proxy server writes the URL requested by the browser to the default log device (that is, standard output, the screen). Figure 2 shows the output of SubHttpProxy, a simple extension of HttpProxy.

To build the proxy server, I derive the HttpProxy class from the Thread class (the code in the body of this article consists of excerpts from this class; the complete code can be downloaded at the end of the article). The HttpProxy class contains some properties that customize the behavior of the proxy; see Listing 1 and Table 1.

[Listing 1]

/*************************************
 * A basic proxy server class
 *************************************
 */
import java.net.*;
import java.io.*;

public class HttpProxy extends Thread {
    static public int CONNECT_RETRIES = 5;
    static public int CONNECT_PAUSE = 5;
    static public int TIMEOUT = 50;
    static public int BUFSIZ = 1024;
    static public boolean logging = false;
    static public OutputStream log = null;

    // Socket for incoming data
    protected Socket socket;

    // Upstream (parent) proxy server, optional
    static private String parent = null;
    static private int parentPort = -1;

    static public void setParentProxy(String name, int pport) {
        parent = name;
        parentPort = pport;
    }

    // Create a proxy thread on a given socket.
    public HttpProxy(Socket s) { socket = s; start(); }

    // Log a single byte; browser is true when the data came from the browser.
    public void writeLog(int c, boolean browser) throws IOException {
        log.write(c);
    }

    // Log a range of bytes by logging each byte in turn.
    public void writeLog(byte[] bytes, int offset,
                         int len, boolean browser) throws IOException {
        for (int i = 0; i < len; i++) writeLog((int) bytes[offset + i], browser);
    }

    // By default, log information is written to
    // standard output;
    // derived classes can override this method.
    public String processHostName(String url, String host, int port, Socket sock) {
        java.text.DateFormat cal = java.text.DateFormat.getDateTimeInstance();
        System.out.println(cal.format(new java.util.Date()) + " - "
            + url + " " + sock.getInetAddress() + "\n");
        return host;
    }

Table 1

Variable / method    Description
CONNECT_RETRIES      Number of attempts to connect to the remote host before giving up.
CONNECT_PAUSE        Pause between two connection attempts.
TIMEOUT              Maximum time to wait for socket input.
BUFSIZ               Size of the socket input buffer.
logging              Whether the proxy server should record all transmitted data in the log (true means yes).
log                  An OutputStream object; the default log routine writes log information to it.
setParentProxy       Chains this proxy server to another one (you need to specify the name and port of the other server).

After the proxy server has connected to the web server, I use a simple loop to pass data between the two sockets. A problem can arise here: if no data is available, a call to the read method may block and stall the program. To prevent this, I use the setSoTimeout method to set a timeout on the socket (see Listing 2). This way, if one socket has no data available, the other still gets a chance to be serviced, and I don't have to create a new thread.
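To see the effect of setSoTimeout on its own, here is a small sketch that opens a loopback connection on which nothing is ever sent and checks that read gives up after the timeout instead of blocking. The class name ReadTimeoutDemo is hypothetical, and the use of an ephemeral loopback port is an assumption made to keep the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; it is not part of the article's code.
final class ReadTimeoutDemo {

    // Returns true if a read on an idle connection gave up after
    // timeoutMillis instead of blocking indefinitely.
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read();   // the client never sends anything
                return false;
            } catch (SocketTimeoutException e) {
                return true;                        // the socket remains usable afterwards
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(50));
    }
}
```

SocketTimeoutException is a subclass of InterruptedIOException, so a relay loop can catch either one and simply move on to the other socket.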

[Listing 2]

    // The thread body
    public void run() {
        String line;
        String host;
        int port = 80;
        Socket outbound = null;
        try {
            socket.setSoTimeout(TIMEOUT);
            InputStream is = socket.getInputStream();
            OutputStream os = null;
            try {
                // Read the request line
                line = "";
                host = "";
                int state = 0;
                boolean space;
                while (true) {
                    int c = is.read();
                    if (c == -1) break;
                    if (logging) writeLog(c, true);
                    space = Character.isWhitespace((char) c);
                    switch (state) {
                    case 0:                      // skip leading whitespace
                        if (space) continue;
                        state = 1;               // falls through to case 1
                    case 1:                      // collect the method, e.g. GET
                        if (space) {
                            state = 2;
                            continue;
                        }
                        line = line + (char) c;
                        break;
                    case 2:
                        if (space) continue;     // skip multiple whitespace characters
                        state = 3;               // falls through to case 3
                    case 3:                      // collect the URL
                        if (space) {
                            state = 4;
                            // Keep only the host name part
                            String host0 = host;
                            int n;
                            n = host.indexOf("//");
                            if (n != -1) host = host.substring(n + 2);
                            n = host.indexOf('/');
                            if (n != -1) host = host.substring(0, n);
                            // Parse the port number, if present
                            n = host.indexOf(":");
                            if (n != -1) {
                                port = Integer.parseInt(host.substring(n + 1));
                                host = host.substring(0, n);
                            }
                            host = processHostName(host0, host, port, socket);
                            if (parent != null) {
                                host = parent;
                                port = parentPort;
                            }
                            int retry = CONNECT_RETRIES;
                            while (retry-- != 0) {
                                try {
                                    outbound = new Socket(host, port);
                                    break;
                                } catch (Exception e) { }
                                // Wait before trying again
                                Thread.sleep(CONNECT_PAUSE);
                            }
                            // These breaks leave only the switch; state 4 has no case,
                            // so the loop then ends at end of stream or on a timeout.
                            if (outbound == null) break;
                            outbound.setSoTimeout(TIMEOUT);
                            os = outbound.getOutputStream();
                            os.write(line.getBytes());
                            os.write(' ');
                            os.write(host0.getBytes());
                            os.write(' ');
                            pipe(is, outbound.getInputStream(), os, socket.getOutputStream());
                            break;
                        }
                        host = host + (char) c;
                        break;
                    }
                }
            }
            catch (IOException e) { }
        } catch (Exception e) { }
        finally {
            try { socket.close(); } catch (Exception e1) { }
            try { outbound.close(); } catch (Exception e2) { }
        }
    }

Like all thread objects, the main work of the HttpProxy class is done in the run method (see Listing 2). The run method implements a simple state machine that reads characters from the web browser one at a time, continuing until it has enough information to locate the target web server. Then run opens a socket to that web server (if several proxy servers are chained together, run opens a socket to the next proxy server in the chain). After opening the socket, run first writes the part of the request it has already read to the socket and then calls the pipe method. The pipe method reads and writes directly between the two sockets.
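The pipe method itself is not listed in the body of the article. As a rough idea of what such a method could look like, here is a minimal sketch that takes turns reading from each side and treats a read timeout as "nothing to forward yet". The class name PipeSketch, the helper copyOnce, and the in-memory demo are all assumptions for illustration; the downloadable code may well differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the article's actual pipe implementation.
final class PipeSketch {
    static final int BUFSIZ = 1024;

    // Forwards browser -> server and server -> browser in turn,
    // stopping as soon as either input reaches end of stream.
    static void pipe(InputStream fromBrowser, InputStream fromServer,
                     OutputStream toServer, OutputStream toBrowser) throws IOException {
        byte[] buf = new byte[BUFSIZ];
        while (true) {
            if (!copyOnce(fromBrowser, toServer, buf)) break;
            if (!copyOnce(fromServer, toBrowser, buf)) break;
        }
    }

    // Returns false at end of stream; a read timeout just means "no data yet".
    private static boolean copyOnce(InputStream in, OutputStream out, byte[] buf)
            throws IOException {
        int n;
        try {
            n = in.read(buf);
        } catch (InterruptedIOException timeout) {
            return true;    // SocketTimeoutException is a subclass of this
        }
        if (n < 0) return false;
        out.write(buf, 0, n);
        return true;
    }

    // Runs pipe over in-memory streams; returns {bytes sent to server, bytes sent to browser}.
    static String[] demo(String request, String response) {
        ByteArrayOutputStream toServer = new ByteArrayOutputStream();
        ByteArrayOutputStream toBrowser = new ByteArrayOutputStream();
        try {
            pipe(new ByteArrayInputStream(request.getBytes(StandardCharsets.US_ASCII)),
                 new ByteArrayInputStream(response.getBytes(StandardCharsets.US_ASCII)),
                 toServer, toBrowser);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] {
            new String(toServer.toByteArray(), StandardCharsets.US_ASCII),
            new String(toBrowser.toByteArray(), StandardCharsets.US_ASCII)
        };
    }

    public static void main(String[] args) {
        String[] out = demo("Host: www.cn.ibm.com\r\n\r\n", "HTTP/1.0 200 OK\r\n\r\n");
        System.out.print(out[0] + out[1]);
    }
}
```

With real sockets, both inputs would need a timeout set through setSoTimeout, otherwise a read on the idle side would block the other direction.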

If the volume of data is large, creating an additional thread may be more efficient; when the volume is small, however, the cost of creating a new thread outweighs the benefit.

Listing 3 shows a very simple main method that can be used to test the HttpProxy class. Most of the work is done by a static startProxy method (see Listing 4). This method uses a special technique that allows a static member to create instances of the HttpProxy class (or of a subclass of HttpProxy). The basic idea is to pass a Class object to the startProxy method; startProxy then uses the Reflection API and the getDeclaredConstructor method to find the constructor of that Class object that accepts a Socket parameter; finally, startProxy calls the newInstance method to create an instance of the class.
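The reflection technique can be shown apart from the networking code. The sketch below looks up a constructor that takes a Socket and invokes it, once with a class literal and once with a Class object obtained from getClass. The names ReflectiveFactory, BaseHandler and LoggingHandler are hypothetical stand-ins, and the sockets are left unconnected because only the construction step is being illustrated.

```java
import java.lang.reflect.Constructor;
import java.net.Socket;

// Hypothetical factory illustrating getDeclaredConstructor + newInstance.
final class ReflectiveFactory {

    // Creates an instance of the given class through its (Socket) constructor.
    static BaseHandler create(Class<? extends BaseHandler> type, Socket s) {
        try {
            Constructor<? extends BaseHandler> cons = type.getDeclaredConstructor(Socket.class);
            return cons.newInstance(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable (Socket) constructor on " + type, e);
        }
    }

    public static void main(String[] args) {
        BaseHandler viaLiteral = create(LoggingHandler.class, new Socket());
        BaseHandler viaInstance = create(viaLiteral.getClass(), new Socket());
        System.out.println(viaLiteral.name() + " " + viaInstance.name());
    }
}

// Stand-in for a class such as HttpProxy: it is constructed from a Socket.
class BaseHandler {
    protected final Socket socket;
    BaseHandler(Socket s) { socket = s; }
    String name() { return "base"; }
}

// Stand-in for a derived class; the factory needs no change to create it.
class LoggingHandler extends BaseHandler {
    LoggingHandler(Socket s) { super(s); }
    @Override String name() { return "logging"; }
}
```

Because the factory only depends on the constructor signature, new subclasses can be added without touching it, which is the property the article relies on.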

[Listing 3]

    // Simple main method for testing
    static public void main(String args[]) {
        System.out.println("Starting proxy server on port 808\n");
        HttpProxy.log = System.out;
        HttpProxy.logging = false;
        HttpProxy.startProxy(808, HttpProxy.class);
    }
}

[Listing 4]

    static public void startProxy(int port, Class clobj) {
        ServerSocket ssock;
        try {
            ssock = new ServerSocket(port);
            while (true) {
                Class[] sarg = new Class[1];
                Object[] arg = new Object[1];
                sarg[0] = Socket.class;
                try {
                    // Find the constructor that takes a single Socket argument
                    java.lang.reflect.Constructor cons = clobj.getDeclaredConstructor(sarg);
                    arg[0] = ssock.accept();
                    cons.newInstance(arg); // Create an instance of HttpProxy or a derived class
                } catch (Exception e) {
                    Socket esock = (Socket) arg[0];
                    try { esock.close(); } catch (Exception ec) { }
                }
            }
        } catch (IOException e) {
        }
    }

With this technique, we can extend the HttpProxy class without writing a custom version of the startProxy method. To get the Class object for a given class, just append .class to the class name (if you have an instance of the class, call its getClass method instead). Because we pass the Class object to the startProxy method, we do not have to change startProxy when we create a class derived from HttpProxy. (The downloadable code includes a simple derived proxy server.)
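The derived class in the download is not reproduced in the article. Purely as an illustration of what a subclass could do with the processHostName hook, the sketch below diverts requests for listed hosts to another host. HostHook is a stand-in for HttpProxy reduced to that one method, and BlockingHostHook, the host names and the "localhost" target are all hypothetical.

```java
import java.net.Socket;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical derived class: requests for listed hosts are diverted.
class BlockingHostHook extends HostHook {
    private final Set<String> blocked = new HashSet<>();
    private final String divertTo;

    BlockingHostHook(String divertTo, String... blockedHosts) {
        this.divertTo = divertTo;
        for (String h : blockedHosts) blocked.add(h.toLowerCase(Locale.ROOT));
    }

    @Override
    public String processHostName(String url, String host, int port, Socket sock) {
        // Whatever is returned here is the host the proxy would connect to.
        return blocked.contains(host.toLowerCase(Locale.ROOT)) ? divertTo : host;
    }

    public static void main(String[] args) {
        HostHook hook = new BlockingHostHook("localhost", "ads.example.com");
        System.out.println(hook.processHostName(
            "http://ads.example.com/banner.gif", "ads.example.com", 80, null));
        System.out.println(hook.processHostName(
            "http://www.cn.ibm.com/", "www.cn.ibm.com", 80, null));
    }
}

// Stand-in for the article's HttpProxy, reduced to the hook discussed here.
class HostHook {
    public String processHostName(String url, String host, int port, Socket sock) {
        return host;    // default: connect to the host the browser asked for
    }
}
```

In the run method of Listing 2 the value returned by processHostName replaces only the host; the port parsed from the URL is left as it was.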

Conclusion

There are two ways to customize or adjust the proxy server with a derived class: modify the host name, or capture all data passing through the proxy server. The processHostName method lets the proxy server analyze and modify the host name. If logging is enabled, the proxy server calls the writeLog method for every character that passes through the server. What to do with this information is entirely up to us: we can write it to a log file, output it to the console, or handle it in any other way that meets our needs. A boolean flag passed to writeLog indicates whether the data came from the browser or from the web host.

Like many tools, a proxy server is neither good nor bad in itself; what matters is how it is used. Proxy servers can be used to invade privacy, but they can also keep out snoopers and protect networks. Even when the proxy server and the browser are not on the same machine, I like to think of the proxy server as a way of extending the browser's functionality. For example, a proxy server can compress data before sending it to the browser; future proxy servers might even translate pages from one language into another... The possibilities are endless.

Download the code for this article: javaproxyserver_code.zip

About the author

Yu Liangsong is a software engineer, independent consultant and freelance writer. He originally worked on PB and Oracle development; his main interest now is Internet development. He can be reached at javaman@163.net.
