Using HttpClient to deal with all kinds of stubborn web servers

xiaoxiao2021-03-06  116

Most of the time we use a browser such as IE or Navigator to visit a web server, whether to browse pages or to submit data. Some of these pages are ordinary pages, some require the user to log in or authenticate first, and some are transmitted over an encrypted channel such as HTTPS. None of this is a problem for today's browsers. Sometimes, however, you need to access pages from a program: for example, to "steal" data from someone else's web page, or to use a page offered by some site to implement a feature of your own. Suppose we want to find out where a mobile phone number is registered but do not have that data ourselves; we have to rely on another company's existing website, submitting the phone number to its page and parsing the data we want out of the page it returns. If the other side is just a simple page, our program is simple too, and this article would have no need to go on at length. But because of service authorization concerns, many companies do not expose their pages through a plain URL; you must register and log in before you can use the pages that provide the service, and that brings in cookie handling. Popular dynamic web technologies such as ASP and JSP all rely on cookies to handle session information. For our program to use someone else's service page, it must first log in and then access the page, which means handling cookies ourselves. Imagine how dreadful it would be to do all this with java.net.HttpURLConnection! And cookies are only one of the more common kinds of "stubbornness" among the stubborn web servers we are talking about. What about uploading files over HTTP? No need for a headache: with "it", these problems become easy to solve.

We cannot list every kind of stubbornness, so we will deal with only the most common problems. As mentioned above, solving them with java.net.HttpURLConnection alone would be dreadful, so before we start we introduce an open source project: HttpClient, from the Apache open source organization. It is part of the Jakarta Commons project, and the current version is 2.0 RC2. Commons already had a net subproject, yet HttpClient was split out as a project of its own, which shows that accessing HTTP servers is no easy matter.

The Commons HttpClient project is designed to simplify the programming of all kinds of communication between an HTTP client and a server. It makes formerly painful tasks easy. For example, you no longer need to care whether the communication uses HTTP or HTTPS: tell HttpClient you want HTTPS and it takes care of the rest. This article shows how to use HttpClient to solve several problems we often run into when writing HTTP client programs. To help readers get familiar with the project quickly, we start with a simple example that reads the content of a web page and then work through the problems one by one.

1. Reading the content of a web page (HTTP/HTTPS)

Here is a simple example that accesses a page.

/*
 * Created on 2003-12-14 by liudong
 */
package http.demo;

import java.io.IOException;

import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.methods.*;

/**
 * The simplest HTTP client, used to demonstrate accessing a page via GET or POST
 * @author liudong
 */
public class SimpleClient {

    public static void main(String[] args) throws IOException {
        HttpClient client = new HttpClient();
        // Set the proxy server address and port
        //client.getHostConfiguration().setProxy("proxy_host_addr", proxy_port);
        // Use the GET method. If the server must be reached over HTTPS,
        // just change http in the URL to https
        HttpMethod method = new GetMethod("http://java.sun.com");
        // Use the POST method
        //HttpMethod method = new PostMethod("http://java.sun.com");
        client.executeMethod(method);
        // Print the status line returned by the server
        System.out.println(method.getStatusLine());
        // Print the returned content
        System.out.println(method.getResponseBodyAsString());
        // Release the connection
        method.releaseConnection();
    }
}

In this example we first create an HTTP client instance (HttpClient), then choose whether to submit with GET or POST, then execute the chosen method on the HttpClient instance, and finally read the server's response from the method object. This is the basic flow of using HttpClient. The whole request is carried out by a single line of code, client.executeMethod(method); it really is that simple.

2. Submitting parameters to a page with GET or POST

Section 1 showed how to request a page with GET or POST. This section differs in that the parameters the page needs are set when the request is submitted. With GET, all parameters are appended directly to the page URL, separated from the page address by a question mark and from each other by &, for example: http://java.sun.com?name=liudong&mobile=123456. With POST it is slightly more trouble. The example in this section shows how to look up the city in which a mobile phone number is registered. The code is as follows:

/*

 * Created on 2003-12-7 by liudong
 */
package http.demo;

import java.io.IOException;

import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.methods.*;

/**
 * Parameter submission demo.
 * This program connects to a page that looks up where a mobile phone number
 * is registered, and queries the province and city of number segment 1330227.
 * @author liudong
 */
public class SimpleHttpClient {

    public static void main(String[] args) throws IOException
    {
        HttpClient client = new HttpClient();
        client.getHostConfiguration().setHost("www.imobile.com.cn", 80, "http");
        HttpMethod method = getPostMethod(); // Submit the data using POST
        client.executeMethod(method);
        // Print the status line returned by the server
        System.out.println(method.getStatusLine());
        // Re-decode the result page from ISO-8859-1 bytes using the platform charset
        String response =
            new String(method.getResponseBodyAsString().getBytes("8859_1"));
        // Print the returned content
        System.out.println(response);
        method.releaseConnection();
    }

    /**
     * Submit the data using GET
     * @return
     */
    private static HttpMethod getGetMethod() {
        return new GetMethod("/simcard.php?simcard=1330227");
    }

    /**
     * Submit the data using POST
     * @return
     */
    private static HttpMethod getPostMethod() {
        PostMethod post = new PostMethod("/simcard.php");
        NameValuePair simcard = new NameValuePair("simcard", "1330227");
        post.setRequestBody(new NameValuePair[] { simcard });
        return post;
    }

}

In the example above, the page http://www.imobile.com.cn/simcard.php takes one parameter, simcard, whose value is the number segment of a mobile phone number, i.e. its first seven digits. The server returns the province, city, and other details for that number. With GET, the parameter information simply goes after the URL; with POST, the parameter names and their values are set through NameValuePair objects.

3. Handling page redirects

In JSP/Servlet programming, the response.sendRedirect method uses the redirect mechanism of the HTTP protocol. It differs from the <jsp:forward> action in JSP: the latter jumps between pages inside the server, that is, the application container loads the content of the target page and returns it to the client, while the former returns a status code (one of the values in the table below), after which the client reads the URL of the target page and loads the new page itself. Because this is how it works, our program must call HttpMethod.getStatusCode() and check whether the return value is one of the values in the table below to decide whether a redirect is needed. If it is, the new address can be read from the Location field in the HTTP response headers.

Status code   HttpServletResponse constant   Description

301           SC_MOVED_PERMANENTLY           The page has moved permanently to a new address

302           SC_MOVED_TEMPORARILY           The page has moved temporarily to a new address

303           SC_SEE_OTHER                   The requested address must be accessed through another URL

307           SC_TEMPORARY_REDIRECT          Same as SC_MOVED_TEMPORARILY
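
The check against the table above needs nothing from HttpClient itself. As a minimal sketch using only the JDK (the class name RedirectSketch and its two helper methods are illustrative, not part of HttpClient), the following tests a status code against the four redirect values and resolves a Location value against the URL that was originally requested. Resolving relative values with java.net.URI is an addition made here as an assumption about what a client usually wants; it is not something the article's own code does.

```java
import java.net.URI;

public class RedirectSketch {

    // The four redirect status codes listed in the table: 301, 302, 303, 307.
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302
            || statusCode == 303 || statusCode == 307;
    }

    // Resolve a Location header value against the URL that was requested.
    // A missing or empty value falls back to "/".
    static String resolveLocation(String requestUrl, String location) {
        String target = (location == null || location.isEmpty()) ? "/" : location;
        return URI.create(requestUrl).resolve(target).toString();
    }

    public static void main(String[] args) {
        // Hypothetical request URL and Location value, for illustration only.
        String requested = "http://localhost:8080/app/login.jsp";
        System.out.println(isRedirect(302));
        System.out.println(resolveLocation(requested, "main.jsp"));
    }
}
```

A relative Location value such as main.jsp is resolved against the directory of the requested page, while an absolute URL is returned unchanged.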

The following code snippet demonstrates how to handle page redirects:

client.executeMethod(post);
System.out.println(post.getStatusLine().toString());
post.releaseConnection();

// Check whether the response is a redirect
int statuscode = post.getStatusCode();
if ((statuscode == HttpStatus.SC_MOVED_TEMPORARILY) ||
    (statuscode == HttpStatus.SC_MOVED_PERMANENTLY) ||
    (statuscode == HttpStatus.SC_SEE_OTHER) ||
    (statuscode == HttpStatus.SC_TEMPORARY_REDIRECT)) {
    // Read the new URL
    Header header = post.getResponseHeader("location");
    if (header != null) {
        String newuri = header.getValue();
        if ((newuri == null) || (newuri.equals("")))
            newuri = "/";
        GetMethod redirect = new GetMethod(newuri);
        client.executeMethod(redirect);
        System.out.println("Redirect:" + redirect.getStatusLine().toString());
        redirect.releaseConnection();
    } else
        System.out.println("Invalid redirect");

}

We can write two JSP pages ourselves, one of which redirects to the other with response.sendRedirect, to test the example above.

4. Simulating a login with a username and password

This is probably the problem most often encountered in HTTP client programming. Much of the content on many websites is visible only to registered users, so you must log in with a correct username and password before you can browse the pages you want. Because HTTP is stateless, i.e. a connection is valid only for the current request and is closed once the content has been delivered, the cookie mechanism must be used to keep the user's login information. Take JSP/Servlet as an example: when the browser requests a JSP or servlet page, the application server returns a cookie named jsessionid (the name varies between application servers) whose value is a long unique string; this string is the session ID of the current visit to the site. On each subsequent request to the site, the browser sends cookies such as jsessionid along, and the application server uses the session ID to look up the corresponding session. For a site that requires login, user data is usually stored in the server-side session after a successful login. When another page is requested, the application server reads the session ID from the cookies the browser sent, obtains the corresponding session, and checks whether the user data is present in it. If it is, access is allowed; otherwise the request is redirected to the login page, where the user must enter an account and password. This is how sites built with JSP generally handle user login.
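
To make the mechanism concrete, here is a small sketch, using only the JDK's java.net.HttpCookie, of what a client would otherwise have to do by hand: pull the session ID out of a Set-Cookie response header and build the Cookie request header to send back. The class and method names, the header value, and the ID ABC123 are made up for illustration; real servers may send additional attributes.

```java
import java.net.HttpCookie;
import java.util.List;

public class SessionCookieSketch {

    // Find the value of the named cookie in a Set-Cookie header value.
    // Returns null when the header does not carry that cookie.
    static String sessionId(String setCookieHeader, String cookieName) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieHeader);
        for (HttpCookie cookie : cookies) {
            if (cookie.getName().equalsIgnoreCase(cookieName)) {
                return cookie.getValue();
            }
        }
        return null;
    }

    // Build the value of the Cookie request header sent on later requests.
    static String cookieHeader(String name, String value) {
        return name + "=" + value;
    }

    public static void main(String[] args) {
        // Hypothetical header value as an application server might send it.
        String header = "JSESSIONID=ABC123; Path=/; HttpOnly";
        String id = sessionId(header, "JSESSIONID");
        System.out.println("Session ID: " + id);
        System.out.println("Cookie: " + cookieHeader("JSESSIONID", id));
    }
}
```

Every request after the login has to carry that Cookie header, which is the bookkeeping a client must otherwise do for itself.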
So for an HTTP client, accessing a protected page means imitating what a browser does: first request the login page and read the cookie value; then request the login page again, adding the parameters the login page requires; and finally request the page you actually want. Every request after the first must carry the cookie information so that the server can tell whether the current request has been authenticated. That is a lot of explanation, but with HttpClient you do not need to add a single line of code for it: just submit the login information to carry out the login, then access the page you want directly, exactly as you would an ordinary page, because the HttpClient class already does all the necessary work for you. Great! The following example implements this access flow.

/*
 * Created on 2003-12-7 by liudong

 */
package http.demo;

import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.cookie.*;
import org.apache.commons.httpclient.methods.*;

/**
 * Example demonstrating a form login
 * @author liudong
 */
public class FormLoginDemo {

    static final String LOGON_SITE = "localhost";
    static final int LOGON_PORT = 8080;

    public static void main(String[] args) throws Exception {
        HttpClient client = new HttpClient();
        client.getHostConfiguration().setHost(LOGON_SITE, LOGON_PORT);

        // Simulate the login page: login.jsp -> main.jsp
        PostMethod post = new PostMethod("/main.jsp");
        NameValuePair name = new NameValuePair("name", "ld");
        NameValuePair pass = new NameValuePair("password", "ld");
        post.setRequestBody(new NameValuePair[] { name, pass });
        int status = client.executeMethod(post);
        System.out.println(post.getResponseBodyAsString());
        post.releaseConnection();

        // Inspect the cookies held by the client after logging in
        CookieSpec cookiespec = CookiePolicy.getDefaultSpec();
        Cookie[] cookies = cookiespec.match(LOGON_SITE, LOGON_PORT, "/", false, client.getState().getCookies());
        if (cookies.length == 0) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.length; i++) {
                System.out.println(cookies[i].toString());
            }
        }

        // Access the desired page, main2.jsp
        GetMethod get = new GetMethod("/main2.jsp");
        client.executeMethod(get);
        System.out.println(get.getResponseBodyAsString());
        get.releaseConnection();
    }

}

5. Submitting parameters in XML format

Submitting parameters in XML format is simple: it is only a matter of the Content-Type used for the submission. The following example reads XML from a file and submits it to the server; the same approach can be used to test web services.

import java.io.File;

import java.io.FileInputStream;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.EntityEnclosingMethod;
import org.apache.commons.httpclient.methods.PostMethod;

/**
 * Example demonstrating the submission of XML data
 */
public class PostXMLClient {

    public static void main(String[] args) throws Exception {
        File input = new File("test.xml");
        PostMethod post = new PostMethod("http://localhost:8080/httpclient/xml.jsp");
        // Set the request body to be read directly from the file
        post.setRequestBody(new FileInputStream(input));
        // Send the exact length when it fits in an int, otherwise use chunked encoding
        if (input.length() < Integer.MAX_VALUE)
            post.setRequestContentLength((int) input.length());
        else
            post.setRequestContentLength(EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED);
        // Specify the content type of the request
        post.setRequestHeader("Content-type", "text/xml; charset=GBK");
        HttpClient httpclient = new HttpClient();
        int result = httpclient.executeMethod(post);
        System.out.println("Response status code: " + result);
        System.out.println("Response body: ");
        System.out.println(post.getResponseBodyAsString());
        post.releaseConnection();
    }

}

6. Uploading files over HTTP

HttpClient uses a dedicated HttpMethod subclass to handle file uploads: MultipartPostMethod. It encapsulates the details of a file upload; all we have to do is tell it the full path of the file to upload. The following code snippet shows how to use this class.

MultipartPostMethod filePost = new MultipartPostMethod(targetURL);
// Pass a File so that the file content is uploaded, not the path string
filePost.addParameter("fileName", new java.io.File(targetFilePath));
HttpClient client = new HttpClient();
// The file to upload may be fairly large, so set the connection timeout here
client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
int status = client.executeMethod(filePost);

In the code above, targetFilePath is the path of the file to upload.

7. Accessing pages protected by authentication

We often come across pages that, when accessed, pop up a browser dialog asking for a username and password. This kind of user authentication differs from the form-based authentication introduced earlier. It is HTTP's own authentication scheme, and HttpClient supports three authentication methods: Basic, Digest, and NTLM. Basic authentication is the simplest and most widely supported but also the least secure; Digest authentication was added in HTTP 1.1; NTLM is defined by Microsoft rather than being a general standard, and its latest version is more secure than Digest authentication. The following example, downloaded from HttpClient's CVS repository, simply demonstrates how to access a page protected by authentication:

import org.apache.commons.httpclient.HttpClient;

import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.methods.GetMethod;

public class BasicAuthenticationExample {

    public BasicAuthenticationExample() {
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = new HttpClient();
        // Register the credentials to use for this site
        client.getState().setCredentials(
            "www.verisign.com",
            "realm",
            new UsernamePasswordCredentials("username", "password")
        );
        GetMethod get = new GetMethod("https://www.verisign.com/products/index.html");
        // Tell the method to handle authentication automatically
        get.setDoAuthentication(true);
        int status = client.executeMethod(get);
        System.out.println(status + "\n" + get.getResponseBodyAsString());
        get.releaseConnection();
    }

}

8. Using HttpClient from multiple threads

When several threads use one HttpClient at the same time, for example to download multiple files from a site simultaneously, only one thread may use a given HttpConnection at a time. To avoid conflicts in a multithreaded environment, HttpClient provides a multithreaded connection manager class: MultiThreadedHttpConnectionManager. It is easy to use; just pass it in when constructing the HttpClient instance. The code is as follows:

MultiThreadedHttpConnectionManager connectionManager =
    new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);

From then on, simply use this client instance for all access.

References:

HttpClient home page: http://jakarta.apache.org/commons/httpclient/
How NTLM works: http://davenport.sourceforge.net/ntlm.html
