Overview
In today's networked era, downloading is one of the most frequently used software functions. Over the past few years, download technology has developed steadily. The original download feature was just a plain "download" process that read continuously from the web server. Its biggest problem was that, because networks are unstable, once the connection dropped the download was interrupted and had to start again from the beginning.
Subsequently, the concept of "breakpoint resume" appeared. As the name suggests, if a download is interrupted, then after the connection is re-established the part already downloaded is skipped and only the part not yet downloaded is fetched.
Regardless of whether "multi-thread download" technology was invented by Mr. Hong Yizhang, it is an undisputed fact that Hong Hao brought this technology unprecedented attention. After the "Internet Ant" software became popular, many download programs followed suit, and whether a program offers "multi-thread download", and even how many download threads it supports, became criteria for evaluating download software. The foundation of "multi-thread download" is that the web server supports remote random reads, that is, it supports "breakpoint resume". With that support, a file can be divided into several parts at download time, and a download thread is created for each part.
Even if you are not writing dedicated download software, it is sometimes necessary to add a download function to the software you write. If your software supports automatic online upgrades, or automatically downloads new data, this is a very useful feature. The topic of this article is how to write a download module that supports "breakpoint resume" and "multi-threading". Of course, the download process is complicated and it is hard to cover everything in one article, so aspects that are not closely related to the download process itself, such as exception handling and network error handling, are largely ignored. The development environment I use is C++ Builder 5.0; readers who use other development environments or programming languages should adapt the code accordingly.
Introduction to HTTP Protocol
Downloading a file is a process of interacting with the web server, and the "language" of this interaction is properly called a protocol. Several protocols can transfer files; the most commonly used are HTTP (Hypertext Transfer Protocol) and FTP (File Transfer Protocol). I use HTTP.
The most basic commands of the HTTP protocol are GET, POST and HEAD. GET requests a specific object from the web server, such as an HTML page or a file, and the server sends the object back over a socket connection as the response. HEAD asks the server to return only a basic description of the object, such as its type, size, and last update time. POST sends data to the web server, which typically passes the information to a separate application for processing and returns the result to the browser. Downloading is done with the GET command.
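To make the commands above concrete, the following is a minimal sketch of what a GET or HEAD request looks like as text on the wire. The helper name BuildRequest and any host or path passed to it are hypothetical and exist only for illustration; the rest of this article lets WinInet build the request instead.

```cpp
#include <string>

// Hypothetical helper (not part of the article's module): formats the
// text of a minimal HTTP/1.1 request. `method` is "GET" or "HEAD".
std::string BuildRequest(const std::string& method,
                         const std::string& host,
                         const std::string& path)
{
    std::string req;
    req += method + " " + path + " HTTP/1.1\r\n"; // request line
    req += "Host: " + host + "\r\n";              // required by HTTP/1.1
    req += "Connection: close\r\n";               // one request per connection
    req += "\r\n";                                // blank line ends the headers
    return req;
}
```

Calling BuildRequest("HEAD", host, path) yields the same text with a different verb; the server then answers with headers only, which is how the object's size and type can be learned without transferring its body.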
Basic download process
You can write a download program directly against the Socket functions, but that requires the developer to understand and be familiar with the TCP/IP protocols. To simplify the development of Internet client software, Windows provides the WinInet API, which encapsulates the commonly used network protocols and greatly lowers the barrier to developing Internet software. The WinInet API functions we need are shown in Figure 1; they are called roughly in order from top to bottom. See MSDN for the exact function prototypes.
Figure 1. WinInet API functions used, in calling order
When using these functions, you must carefully distinguish the handles they use. The handles all have the same type, HINTERNET, but they play different roles, which is easy to confuse. Based on the order in which they are created and how they depend on one another, the handles fall into three levels, and the handle of each level is obtained from the handle of the level above.
InternetOpen is the first function to call. It returns the top-level HINTERNET handle, which I habitually name hSession, i.e. the session handle.
InternetConnect takes the hSession handle and returns an HTTP connection handle, which I name hConnect. HttpOpenRequest takes the hConnect handle and returns an HTTP request handle, which I name hRequest.
HttpSendRequest, HttpQueryInfo, InternetSetFilePointer, and InternetReadFile all use the handle returned by HttpOpenRequest, i.e. hRequest.
When these handles are no longer needed, close them with InternetCloseHandle to release the resources they occupy.
First create a thread class named THttpGetThread that is created suspended. I want the thread to destroy itself automatically when it finishes, so the constructor sets:
    FreeOnTerminate = true; // destroy the thread automatically on completion
Then add the following member variables:
    char Buffer[HTTPGET_BUFFER_MAX + 4]; // data buffer
    AnsiString FURL;          // URL of the object to download
    AnsiString FOutFileName;  // save path and file name
    HINTERNET FhSession;      // session handle
    HINTERNET FhConnect;      // HTTP connection handle
    HINTERNET FhRequest;      // HTTP request handle
    bool FSuccess;            // whether the download succeeded
    int iFileHandle;          // output file handle
1. Establish a connection
By function, the download process can be divided into four parts: establishing the connection, reading and analyzing information about the file to download, downloading the file, and releasing the resources. The function that establishes the connection is shown below. ParseURL extracts the host name and the web path of the file from the download URL, and DoOnStatusText outputs the current status:
// Initialize the download environment
void THttpGetThread::StartHttpGet(void)
{
    AnsiString HostName, FileName;
    ParseURL(HostName, FileName);
    try
    {
        // 1. Open a session
        FhSession = InternetOpen("http-get-demo",
                                 INTERNET_OPEN_TYPE_PRECONFIG,
                                 NULL, NULL,
                                 0); // synchronous mode
        if (FhSession == NULL) throw Exception("Error: InternetOpen");
        DoOnStatusText("ok: InternetOpen");
        // 2. Establish a connection
        FhConnect = InternetConnect(FhSession,
                                    HostName.c_str(),
                                    INTERNET_DEFAULT_HTTP_PORT,
                                    NULL, NULL,
                                    INTERNET_SERVICE_HTTP, 0, 0);
        if (FhConnect == NULL) throw Exception("Error: InternetConnect");
        DoOnStatusText("ok: InternetConnect");
        // 3. Initialize the download request
        // HttpOpenRequest expects a NULL-terminated array of accepted types
        const char *FAcceptTypes[] = { "*/*", NULL };
        FhRequest = HttpOpenRequest(FhConnect,
                                    "GET",             // get data from the server
                                    FileName.c_str(),  // name of the file to read
                                    "HTTP/1.1",        // protocol version
                                    NULL,              // no referrer
                                    FAcceptTypes,
                                    INTERNET_FLAG_RELOAD,
                                    0);
        if (FhRequest == NULL) throw Exception("Error: HttpOpenRequest");
        DoOnStatusText("ok: HttpOpenRequest");
        // 4. Send the download request
        if (!HttpSendRequest(FhRequest, NULL, 0, NULL, 0))
            throw Exception("Error: HttpSendRequest");
        DoOnStatusText("ok: HttpSendRequest");
    } catch (Exception &exception)
    {
        EndHttpGet(); // close the connection and release resources
        DoOnStatusText(exception.Message);
    }
}
// Extract the host name and the file path from the URL
void THttpGetThread::ParseURL(AnsiString &HostName, AnsiString &FileName)
{
    AnsiString URL = FURL;
    int i = URL.Pos("http://");
    if (i == 1)
    {
        URL.Delete(1, 7); // strip the leading "http://"
    }
    i = URL.Pos("/");
    if (i == 0) // no path in the URL: request the root
    {
        HostName = URL;
        FileName = "/";
        return;
    }
    HostName = URL.SubString(1, i - 1);
    FileName = URL.SubString(i, URL.Length());
}
As you can see, the program calls InternetOpen, InternetConnect, and HttpOpenRequest in the order shown in Figure 1 to obtain the three related handles, and then sends the download request to the web server with HttpSendRequest.
The first parameter of InternetOpen, the name of the calling application, is arbitrary. If the last parameter is set to INTERNET_FLAG_ASYNC, an asynchronous connection is established. This is very useful in practice, but to limit the complexity of this article I did not adopt it. Readers with more demanding download requirements are strongly encouraged to use the asynchronous mode.
HttpOpenRequest opens a request handle. The verb is "GET", which means downloading a file, and the protocol version is "HTTP/1.1".
Another point to note is the FAcceptTypes parameter of HttpOpenRequest, which lists the content types the client accepts. I set it to "*/*", meaning that all types are accepted; change it according to your actual needs.
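For readers who are not using the VCL AnsiString class, the URL splitting done by ParseURL can be sketched with std::string. The names UrlParts and SplitUrl are hypothetical, and the sketch assumes plain "http://host/path" URLs without a port number, user name, or query handling.

```cpp
#include <string>

// Hypothetical result type for this sketch.
struct UrlParts {
    std::string host; // e.g. "www.example.com"
    std::string path; // e.g. "/files/a.zip"; "/" when the URL has no path
};

// Splits a URL into host name and path. Only a leading "http://" is
// recognized; ports and other schemes are outside the scope of the sketch.
UrlParts SplitUrl(const std::string& url)
{
    const std::string scheme = "http://";
    std::string rest = url;
    if (rest.compare(0, scheme.size(), scheme) == 0)
        rest.erase(0, scheme.size());          // drop the scheme prefix

    UrlParts parts;
    const std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos) {
        parts.host = rest;                     // no path given
        parts.path = "/";
    } else {
        parts.host = rest.substr(0, slash);    // text before the first '/'
        parts.path = rest.substr(slash);       // first '/' to the end
    }
    return parts;
}
```

The host part is what would be passed to InternetConnect and the path part to HttpOpenRequest.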
2. Read and analyze information about the file to download
After sending the request, you can use the HttpQueryInfo function to obtain information about the file, or about the server and the operations it supports. For a download program, the item needed most often is the size of the file, i.e. the number of bytes it contains. The module is as follows:
// Get the size of the file to download
int __fastcall THttpGetThread::GetWebFileSize(void)
{
    int FileSize = 0; // declared outside the try block so it can be returned
    try
    {
        DWORD BufLen = HTTPGET_BUFFER_MAX;
        DWORD dwIndex = 0;
        bool RetQueryInfo = HttpQueryInfo(FhRequest,
                                          HTTP_QUERY_CONTENT_LENGTH,
                                          Buffer, &BufLen,
                                          &dwIndex);
        if (RetQueryInfo == false) throw Exception("Error: HttpQueryInfo");
        DoOnStatusText("ok: HttpQueryInfo");
        FileSize = StrToInt(Buffer); // file size
        DoOnGetFileSize(FileSize);
    } catch (Exception &exception)
    {
        DoOnStatusText(exception.Message);
    }
    return FileSize;
}
DoOnGetFileSize in this module is an event that reports the file size. Once the size is known, a multi-threaded download program can divide the file into suitable blocks and determine the start offset and size of each block.
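One simple way to do that division is to give every thread an equal share and spread the remainder over the first few blocks. The sketch below is only an illustration of that arithmetic; the names Block and SplitIntoBlocks are hypothetical and do not appear in the module described here.

```cpp
#include <vector>

// Hypothetical description of one download block.
struct Block {
    unsigned long start; // offset of the first byte of the block
    unsigned long size;  // number of bytes in the block
};

// Divides fileSize bytes into at most `threads` contiguous blocks.
std::vector<Block> SplitIntoBlocks(unsigned long fileSize, unsigned threads)
{
    std::vector<Block> blocks;
    if (fileSize == 0 || threads == 0) return blocks;
    if (threads > fileSize)                       // never create empty blocks
        threads = static_cast<unsigned>(fileSize);

    const unsigned long base  = fileSize / threads;
    const unsigned long extra = fileSize % threads; // leftover bytes
    unsigned long offset = 0;
    for (unsigned i = 0; i < threads; ++i) {
        Block b;
        b.start = offset;
        b.size  = base + (i < extra ? 1 : 0);     // first `extra` blocks get one more byte
        blocks.push_back(b);
        offset += b.size;
    }
    return blocks;
}
```

For a 10-byte file and 3 threads this gives blocks starting at offsets 0, 4, and 7 with sizes 4, 3, and 3; each download thread would then be responsible for one block.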
3. Download the file
Before starting the download, you should also decide how to save the downloaded data. There are many ways to do this; I use the file functions provided by C++ Builder to open a file handle. Of course, you can also use the Windows API itself, or consider buffering everything in memory.
// Open the output file that receives the downloaded data
DWORD THttpGetThread::OpenOutFile(void)
{
    try
    {
        if (FileExists(FOutFileName))
            DeleteFile(FOutFileName);
        iFileHandle = FileCreate(FOutFileName);
        if (iFileHandle == -1) throw Exception("Error: FileCreate");
        DoOnStatusText("ok: CreateFile");
    } catch (Exception &exception)
    {
        DoOnStatusText(exception.Message);
    }
    return 0;
}
// Run the download
void THttpGetThread::DoHttpGet(void)
{
    DWORD dwCount = OpenOutFile();
    try
    {
        // Raise the download-started event
        DoOnStatusText("StartGet: InternetReadFile");
        // Read the data
        DWORD dwRequest; // number of bytes requested
        DWORD dwRead;    // number of bytes actually read
        dwRequest = HTTPGET_BUFFER_MAX;
        while (true)
        {
            Application->ProcessMessages();
            bool ReadReturn = InternetReadFile(FhRequest,
                                               (LPVOID)Buffer,
                                               dwRequest,
                                               &dwRead);
            // A failed read must not be reported as a successful download
            if (!ReadReturn) throw Exception("Error: InternetReadFile");
            if (dwRead == 0) break; // no more data
            // Save the data
            Buffer[dwRead] = '\0';
            FileWrite(iFileHandle, Buffer, dwRead);
            dwCount = dwCount + dwRead;
            // Raise the download-progress event
            DoOnProgress(dwCount);
        }
        FSuccess = true;
    } catch (Exception &exception)
    {
        FSuccess = false;
        DoOnStatusText(exception.Message);
    }
    FileClose(iFileHandle);
    DoOnStatusText("End: InternetReadFile");
}
The download process is not complicated: as with reading a local file, it is a simple loop. Of course, this convenience is owed to Microsoft's encapsulation of the network protocols.
4. Release the resources
This step is simple: call InternetCloseHandle on each handle, in the reverse of the order in which the handles were created.
void THttpGetThread::EndHttpGet(void)
{
    if (FConnected) // member flag; its declaration is not shown in the listing above
    {
        DoOnStatusText("Closing: InternetConnect");
        try
        {
            // Close in reverse order of creation
            InternetCloseHandle(FhRequest);
            InternetCloseHandle(FhConnect);
            InternetCloseHandle(FhSession);
        } catch (...) {}
        FhSession = NULL;
        FhConnect = NULL;
        FhRequest = NULL;
        FConnected = false;
        DoOnStatusText("Closed: InternetConnect");
    }
}
I think that setting a handle variable to NULL after the handle is released is a good programming habit. In this example it is also necessary, because these handle variables are used again when a failed download is retried.
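The reverse-order rule can also be left to the compiler. The following sketch is a different technique from the explicit EndHttpGet function above: it uses scope guards whose destructors run in reverse order of construction. ScopedHandle and CloseLog are hypothetical stand-ins; a real guard would hold an HINTERNET and call InternetCloseHandle in its destructor, whereas here the destructor only records a letter so that the order can be observed.

```cpp
#include <string>

// Records the order in which handles are "closed" (stand-in for the
// calls to InternetCloseHandle).
struct CloseLog {
    std::string order;
};

// Minimal scope guard: "closes" its handle when it goes out of scope.
class ScopedHandle {
public:
    ScopedHandle(char name, CloseLog& log) : name_(name), log_(log) {}
    ~ScopedHandle() { log_.order += name_; }
private:
    ScopedHandle(const ScopedHandle&);            // non-copyable
    ScopedHandle& operator=(const ScopedHandle&);
    char name_;
    CloseLog& log_;
};

// Creates the three levels of handles and lets them go out of scope.
std::string OpenAndCloseAll()
{
    CloseLog log;
    {
        ScopedHandle session('S', log); // level 1: session
        ScopedHandle connect('C', log); // level 2: connection
        ScopedHandle request('R', log); // level 3: request
    } // destructors run here: request, then connect, then session
    return log.order;
}
```

Because C++ destroys local objects in the reverse of their declaration order, the request is closed first and the session last, without any explicit cleanup code.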
5. Calling the modules
The calls to these modules can be arranged in the Execute method of the thread object, as shown below:
void __fastcall THttpGetThread::Execute()
{
    FRepeatCount = 5;
    for (int i = 0; i < FRepeatCount; i++)
    {
        StartHttpGet();
        GetWebFileSize();
        DoHttpGet();
        EndHttpGet();
        if (FSuccess) break;
    }
    // Raise the download-finished event
    if (FSuccess) DoOnComplete();
    else DoOnError();
}
A loop is used here so that the download is retried automatically if an error occurs. In practice, the number of retries can be made a configurable parameter.
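The retry pattern can be isolated from the download code. The sketch below assumes a C++11 compiler (newer than C++ Builder 5.0) for std::function; the name RetryUntilSuccess is hypothetical, and the callback stands in for one full download attempt.

```cpp
#include <functional>

// Runs `attempt` up to `maxTries` times and stops at the first success.
// Returns the 1-based number of the successful attempt, or 0 if all failed.
int RetryUntilSuccess(int maxTries, const std::function<bool()>& attempt)
{
    for (int i = 1; i <= maxTries; ++i) {
        if (attempt()) return i; // success: no further retries
    }
    return 0;                    // every attempt failed
}
```

Passing the retry count as an argument, rather than fixing it at 5, is what makes it a setting the caller can choose.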
Implementing breakpoint resume
Adding breakpoint resume on top of the basic download code is not very complicated. There are two main issues:
1. Check the local download information to determine how many bytes have already been downloaded. The function that opens the output file must therefore be modified accordingly. One could create an auxiliary file to store the download information, such as the number of bytes already downloaded. I handle it more simply: first check whether the output file exists; if it does, get its size and treat that as the part already downloaded. Since the Win32 GetFileSize function works on an open file handle rather than a file name, I wrote my own GetFileSize function that takes a file name. Note that code identical to the earlier listing is omitted.
DWORD THttpGetThread::OpenOutFile(void)
{
    ......
    if (FileExists(FOutFileName))
    {
        DWORD dwCount = GetFileSize(FOutFileName);
        if (dwCount > 0)
        {
            iFileHandle = FileOpen(FOutFileName, fmOpenWrite);
            if (iFileHandle == -1) throw Exception("Error: FileOpen");
            FileSeek(iFileHandle, 0, 2); // move the file pointer to the end
            DoOnStatusText("ok: OpenFile");
            return dwCount;
        }
        DeleteFile(FOutFileName);
    }
    ......
}
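The GetFileSize helper itself is not listed in this article. As one possible stand-in, the size of a file can be read by name with standard C++ streams; the name FileSizeByName is hypothetical, and this is a sketch rather than the helper used above.

```cpp
#include <fstream>

// Returns the size in bytes of the named file, or 0 if it cannot be opened.
unsigned long FileSizeByName(const char* fileName)
{
    // Open in binary mode and position the read pointer at the end.
    std::ifstream in(fileName, std::ios::binary | std::ios::ate);
    if (!in) return 0;
    const std::streamoff size = in.tellg(); // position at end == file size
    return size > 0 ? static_cast<unsigned long>(size) : 0;
}
```

Returning 0 for a missing file fits the check in OpenOutFile, where a size of 0 means there is nothing to resume.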
2. Before starting to download the file (i.e. before calling InternetReadFile), first adjust the file pointer on the web server. This requires the web server to support random reads of the file; some servers restrict this, so that possibility must be checked. The modifications to the DoHttpGet module are as follows, again with identical code omitted:
void THttpGetThread::DoHttpGet(void)
{
    DWORD dwCount = OpenOutFile();
    if (dwCount > 0) // adjust the file pointer
    {
        dwStart = dwStart + dwCount;
        if (!SetFilePointer()) // the server does not support the operation
        {
            // Clear the output file
            FileSeek(iFileHandle, 0, 0); // move the file pointer to the start
        }
    }
    ......
}
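At the HTTP level, a random read is commonly requested with a Range header, and a server that honors it answers with status 206 (Partial Content) instead of 200. This is a different mechanism from the SetFilePointer call used above, shown here only to illustrate the same "resume or start over" decision. The function names are hypothetical, and std::to_string assumes a C++11 compiler.

```cpp
#include <string>

// Formats the request header that asks the server to start sending at
// byte `offset` and continue to the end of the file.
std::string BuildRangeHeader(unsigned long offset)
{
    return "Range: bytes=" + std::to_string(offset) + "-\r\n";
}

// Decides where the response body must be written in the output file.
// 206 means the server sent only the requested tail, so writing continues
// after the bytes already on disk; any other status means the whole file
// is being sent and the output must restart at offset 0.
unsigned long WriteOffsetFor(int statusCode, unsigned long alreadyDownloaded)
{
    return statusCode == 206 ? alreadyDownloaded : 0;
}
```

The fallback to offset 0 corresponds to the branch above that moves the file pointer back to the start when the server does not support the operation.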