Overview
In today's networked era, downloading is one of the most frequently used software functions. Over the past few years download technology has developed steadily. The original download feature was just a plain "download" process in which data is read continuously from the web server. Its biggest problem is that, because networks are unstable, once the connection drops the download is interrupted and has to start again from the beginning.
Later the concept of "breakpoint resume" appeared. As the name suggests, if a download is interrupted, then after the connection is re-established the part already downloaded is skipped and only the part not yet downloaded is fetched. Whether or not "multi-thread download" was invented by Mr. Hong Yizhang, it was he who made the technology popular as never before. After his "Network Ant" (NetAnts) software became popular, many download programs followed suit; whether a program supports "multi-thread download", and even how many download threads it supports, became criteria for judging download software. The foundation of "multi-thread download" is that the web server supports remote random reads, that is, it supports "breakpoint resume". Given that, a file can be divided into several parts, with one download thread created for each part.
Nowadays, even if you are not writing dedicated download software, it is sometimes necessary to add a download function to the software you write. For example, it is very useful if your software supports automatic online upgrades or automatically downloads new data. The topic of this article is how to write a download module that supports "breakpoint resume" and "multi-threading". Of course, the download process is very complicated and hard to cover fully in one article, so aspects not directly related to downloading, such as exception handling and network error handling, are basically ignored. The development environment I use is C++ Builder 5.0; readers using other development environments or programming languages should adapt the code accordingly.
Introduction to HTTP Protocol
Downloading a file is a process of interaction between the computer and the web server; the technical name for the "language" they use is a protocol. There are many protocols for transferring files. The most commonly used are HTTP (Hypertext Transfer Protocol) and FTP (File Transfer Protocol); I use HTTP.
The most basic HTTP commands are GET, POST and HEAD. GET requests a specific object from the web server, such as an HTML page or a file, and the web server sends the object back over a socket connection as the response. HEAD makes the server return a basic description of the object, such as its type, size and last update time. POST sends data to the web server, which typically passes the information to a separate application for processing and returns the result to the browser. Downloading is done with the GET command.
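To make the GET command concrete, here is a minimal sketch in standard C++ that builds the text of a GET request. WinInet normally composes this text itself, so the function name BuildGetRequest and its parameters are purely illustrative. The optional Range header is standard HTTP/1.1 rather than something the article's code sends explicitly; a server that supports partial reads answers such a request with status 206 (Partial Content).

```cpp
#include <string>

// Build the text of an HTTP/1.1 GET request (illustrative helper, not part
// of WinInet). If rangeStart >= 0, a Range header asks the server for the
// bytes from rangeStart onward, up to rangeEnd when rangeEnd >= rangeStart.
std::string BuildGetRequest(const std::string& host, const std::string& path,
                            long long rangeStart = -1, long long rangeEnd = -1)
{
    std::string req = "GET " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "Accept: */*\r\n";
    if (rangeStart >= 0) {
        req += "Range: bytes=" + std::to_string(rangeStart) + "-";
        if (rangeEnd >= rangeStart)
            req += std::to_string(rangeEnd);
        req += "\r\n";
    }
    req += "Connection: close\r\n\r\n"; // blank line ends the header section
    return req;
}
```

For example, BuildGetRequest("example.com", "/a.zip", 100) asks for everything from byte 100 onward, which is the request a resumed download would need.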
Basic download process
To write a download program you can use the socket functions directly, but that requires the developer to understand and be familiar with TCP/IP. To simplify the development of Internet client software, Windows provides the WinInet API, which encapsulates the commonly used network protocols and greatly lowers the barrier to developing Internet software. The WinInet functions we need are shown in Figure 1; they are called basically in order from top to bottom. For the exact function prototypes, see MSDN.
Figure 1. The WinInet API functions used, in calling order (top to bottom)
When using these functions you must strictly distinguish between the handles they use. The handles all have the same type, HINTERNET, but play different roles, which is easy to get confused about. By the order in which they are created and the way they depend on one another, the handles fall into three levels, and the handle at each level is obtained from the handle at the level above.
InternetOpen is the first function to call. It returns the top-level HINTERNET handle, which I habitually name hSession, the session handle. InternetConnect takes the hSession handle and returns the HTTP connection handle, which I name hConnect.
HttpOpenRequest takes the hConnect handle; the handle it returns is the HTTP request handle, named hRequest.
HttpSendRequest, HttpQueryInfo, InternetSetFilePointer and InternetReadFile all use the handle returned by HttpOpenRequest, that is, hRequest.
When these handles are no longer needed, close them with InternetCloseHandle to release the resources they occupy.
First create a thread class named THttpGetThread that starts suspended. I want the thread to destroy itself automatically when it finishes, so I set the following in the constructor:
    FreeOnTerminate = true; // destroy the thread object automatically
And add the following member variables:
    char Buffer[HTTPGET_BUFFER_MAX + 4]; // data buffer
    AnsiString FURL;                     // URL of the object to download
    AnsiString FOutFileName;             // path and name of the saved file
    HINTERNET FhSession;                 // session handle
    HINTERNET FhConnect;                 // HTTP connection handle
    HINTERNET FhRequest;                 // HTTP request handle
    bool FSuccess;                       // whether the download succeeded
    int iFileHandle;                     // output file handle
1. Establishing the connection
By function, the download process divides into four parts: establishing the connection, reading and analyzing information about the file to download, downloading the file, and releasing the resources used. The function that establishes the connection is shown below. ParseURL extracts the host name and the web path of the file from the download URL, and DoOnStatusText outputs the current status:
    // Initialize the download environment
    void THttpGetThread::StartHttpGet(void)
    {
        AnsiString HostName, FileName;
        ParseURL(HostName, FileName);
        try
        {
            // 1. Establish a session
            FhSession = InternetOpen("http-get-demo",
                INTERNET_OPEN_TYPE_PRECONFIG,
                NULL, NULL, 0); // synchronous mode
            if (FhSession == NULL) throw(Exception("Error: InternetOpen"));
            DoOnStatusText("ok: InternetOpen");
            // 2. Establish the connection
            FhConnect = InternetConnect(FhSession,
                HostName.c_str(),
                INTERNET_DEFAULT_HTTP_PORT,
                NULL, NULL,
                INTERNET_SERVICE_HTTP, 0, 0);
            if (FhConnect == NULL) throw(Exception("Error: InternetConnect"));
            DoOnStatusText("ok: InternetConnect");
            // 3. Initialize the download request
            // The accept-types argument is a NULL-terminated array of strings
            const char *FAcceptTypes[] = {"*/*", NULL};
            FhRequest = HttpOpenRequest(FhConnect,
                "GET",            // fetch data
                FileName.c_str(), // object to read
                "HTTP/1.1",       // protocol version
                NULL, FAcceptTypes,
                INTERNET_FLAG_RELOAD, 0);
            if (FhRequest == NULL) throw(Exception("Error: HttpOpenRequest"));
            DoOnStatusText("ok: HttpOpenRequest");
            // 4. Send the download request
            if (!HttpSendRequest(FhRequest, NULL, 0, NULL, 0))
                throw(Exception("Error: HttpSendRequest"));
            DoOnStatusText("ok: HttpSendRequest");
        }
        catch (Exception &exception)
        {
            EndHttpGet(); // close the connection and release resources
            DoOnStatusText(exception.Message);
        }
    }

    // Extract the host name and the file path from the URL
    void THttpGetThread::ParseURL(AnsiString &HostName, AnsiString &FileName)
    {
        AnsiString URL = FURL;
        int i = URL.Pos("http://");
        if (i > 0)
        {
            URL.Delete(1, 7); // strip the "http://" prefix
        }
        i = URL.Pos("/");
        HostName = URL.SubString(1, i - 1);
        FileName = URL.SubString(i, URL.Length());
    }

As you can see, the program calls InternetOpen, InternetConnect and HttpOpenRequest in the order given in Figure 1 to obtain the three related handles, and then sends the download request to the web server with HttpSendRequest.
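For readers not using C++ Builder, the same URL splitting can be sketched with std::string. This is an illustrative stand-in for ParseURL, not the author's code: the name ParseUrl is hypothetical, and unlike the original it assumes that a URL without any "/" after the host should map to the path "/".

```cpp
#include <string>

// Split "http://host/path" into host and path (hypothetical helper that
// mirrors the article's ParseURL using the standard library).
void ParseUrl(const std::string& url, std::string& host, std::string& path)
{
    const std::string scheme = "http://";
    std::string rest = url;
    // Strip the scheme only when it is really a prefix of the URL.
    if (rest.compare(0, scheme.size(), scheme) == 0)
        rest.erase(0, scheme.size());
    std::string::size_type slash = rest.find('/');
    if (slash == std::string::npos) {
        host = rest;   // no path given: assume the root document
        path = "/";
    } else {
        host = rest.substr(0, slash);
        path = rest.substr(slash); // keeps the leading '/'
    }
}
```

The host part is what InternetConnect expects, and the path part is the object name passed to HttpOpenRequest.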
The first parameter of InternetOpen (the agent name) is not important here. If the last parameter is set to INTERNET_FLAG_ASYNC, an asynchronous connection is established. This is of great practical value, but given the complexity it would add to this article I did not adopt it. Readers with more demanding download requirements are strongly advised to use asynchronous mode.
HttpOpenRequest opens a request handle. The command is "GET", meaning a file is to be downloaded, and the protocol used is "HTTP/1.1".
Another thing to note is the FAcceptTypes parameter of HttpOpenRequest, which specifies the file types that can be accepted. I set it to "*/*", meaning all file types are accepted; change it according to your actual needs.
2. Reading and analyzing information about the file to download
After sending the request, you can use HttpQueryInfo to obtain information about the file, or about the server and the operations it supports. For a download program, the item needed most often is the size of the file, that is, the number of bytes it contains. The module is as follows:
    // Get the size of the file to download
    int __fastcall THttpGetThread::GetWEBFileSize(void)
    {
        int FileSize = 0; // declared outside the try block so it can be returned
        try
        {
            DWORD BufLen = HTTPGET_BUFFER_MAX;
            DWORD dwIndex = 0;
            bool RetQueryInfo = HttpQueryInfo(FhRequest,
                HTTP_QUERY_CONTENT_LENGTH,
                Buffer, &BufLen,
                &dwIndex);
            if (RetQueryInfo == false) throw(Exception("Error: HttpQueryInfo"));
            DoOnStatusText("ok: HttpQueryInfo");
            FileSize = StrToInt(Buffer); // file size
            DoOnGetFileSize(FileSize);
        }
        catch (Exception &exception)
        {
            DoOnStatusText(exception.Message);
        }
        return FileSize;
    }

DoOnGetFileSize in this module is an event that reports the file size. Once the size is known, a multi-threaded download program can divide the file into suitable blocks and determine the starting position and size of each block.
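One simple way to divide a file into blocks, given its size and a thread count, is sketched below. The Block structure and SplitIntoBlocks function are assumptions made for illustration; the article does not show how it computes the blocks. Here every block gets the same size and the last block also takes the remainder.

```cpp
#include <vector>

// Start offset and length of one download block (illustrative type).
struct Block {
    unsigned long start;
    unsigned long size;
};

// Divide fileSize bytes into threadCount contiguous blocks.
// The last block absorbs the remainder of the integer division.
std::vector<Block> SplitIntoBlocks(unsigned long fileSize, unsigned long threadCount)
{
    std::vector<Block> blocks;
    if (fileSize == 0 || threadCount == 0)
        return blocks;
    if (threadCount > fileSize)
        threadCount = fileSize; // never create empty blocks
    const unsigned long base = fileSize / threadCount;
    unsigned long start = 0;
    for (unsigned long i = 0; i < threadCount; ++i) {
        const bool last = (i == threadCount - 1);
        const unsigned long size = last ? fileSize - start : base;
        blocks.push_back(Block{start, size});
        start += size;
    }
    return blocks;
}
```

For a 1000-byte file and 3 threads this yields blocks starting at 0, 333 and 666 with sizes 333, 333 and 334. Each pair would then be handed to one download thread as its start position and byte count.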
3. The file download module
Before starting the download, you also need to decide how to save the downloaded data. There are many ways; I use the file functions provided by C++ Builder to open a file handle. You could also use the Windows API itself, or consider buffering everything in memory.
    // Open the output file in which the downloaded data is saved
    DWORD THttpGetThread::OpenOutFile(void)
    {
        try
        {
            if (FileExists(FOutFileName))
                DeleteFile(FOutFileName);
            iFileHandle = FileCreate(FOutFileName);
            if (iFileHandle == -1) throw(Exception("Error: FileCreate"));
            DoOnStatusText("ok: CreateFile");
        }
        catch (Exception &exception)
        {
            DoOnStatusText(exception.Message);
        }
        return 0;
    }

    // Perform the download
    void THttpGetThread::DoHttpGet(void)
    {
        DWORD dwCount = OpenOutFile();
        try
        {
            // Send the "download started" event
            DoOnStatusText("StartGet: InternetReadFile");
            // Read the data
            DWORD dwRequest; // number of bytes requested
            DWORD dwRead;    // number of bytes actually read
            dwRequest = HTTPGET_BUFFER_MAX;
            while (true)
            {
                Application->ProcessMessages();
                bool ReadReturn = InternetReadFile(FhRequest,
                    (LPVOID)Buffer,
                    dwRequest,
                    &dwRead);
                if (!ReadReturn) break;
                if (dwRead == 0) break; // no more data
                // Save the data
                Buffer[dwRead] = '\0';
                FileWrite(iFileHandle, Buffer, dwRead);
                dwCount = dwCount + dwRead;
                // Send the download progress event
                DoOnProgress(dwCount);
            }
            FSuccess = true;
        }
        catch (Exception &exception)
        {
            FSuccess = false;
            DoOnStatusText(exception.Message);
        }
        FileClose(iFileHandle);
        DoOnStatusText("End: InternetReadFile");
    }

The download process is not complicated: just like reading a local file, it is a simple loop. Of course, this convenience is owed to Microsoft's encapsulation of the network protocols.
4. Releasing the resources used
This step is simple: call InternetCloseHandle on each handle, in the reverse of the order in which the handles were created.
    void THttpGetThread::EndHttpGet(void)
    {
        if (FConnected)
        {
            DoOnStatusText("Closing: InternetConnect");
            try
            {
                InternetCloseHandle(FhRequest);
                InternetCloseHandle(FhConnect);
                InternetCloseHandle(FhSession);
            }
            catch (...) {}
            FhSession = NULL;
            FhConnect = NULL;
            FhRequest = NULL;
            FConnected = false;
            DoOnStatusText("Closed: InternetConnect");
        }
    }

I think setting a handle variable to NULL after releasing the handle is a good programming habit. In this example there is a further reason: the handle variables are used again when a download fails and is retried.
5. Calling the function modules
Calls to these modules can be arranged in the Execute method of the thread object, as shown below:
    void __fastcall THttpGetThread::Execute()
    {
        FRepeatCount = 5;
        for (int i = 0; i ...

A loop is used here: if an error occurs, the download is attempted again, and the number of repetitions can be made a parameter.

Implementing breakpoint resume

Adding breakpoint resume to the basic download code is not very complicated. There are two main problems.

1. Check the local download information to determine the number of bytes already downloaded. For this, the function that opens the output file needs appropriate changes. We could create an auxiliary file to hold the download information, such as the number of bytes already downloaded. I handle it more simply: first check whether the output file exists, and if it does, take its size as the amount already downloaded. Since I found no API that directly returns the size of a file from its path, I wrote a GetFileSize function to obtain it. Note that code identical to the earlier version is omitted.

    DWORD THttpGetThread::OpenOutFile(void)
    {
        ......
        if (FileExists(FOutFileName))
        {
            DWORD dwCount = GetFileSize(FOutFileName);
            if (dwCount > 0)
            {
                iFileHandle = FileOpen(FOutFileName, fmOpenWrite);
                if (iFileHandle == -1) throw(Exception("Error: FileOpen"));
                FileSeek(iFileHandle, 0, 2); // move the file pointer to the end
                DoOnStatusText("ok: OpenFile");
                return dwCount;
            }
            DeleteFile(FOutFileName);
        }
        ......
    }

2. Adjust the file pointer on the web server before starting to download the file (that is, before calling InternetReadFile). This requires the web server to support random reads of the file; some servers restrict this, so the possibility of failure must be checked. The changes to the DoHttpGet module are as follows, again with identical code omitted:

    void THttpGetThread::DoHttpGet(void)
    {
        DWORD dwCount = OpenOutFile();
        if (dwCount > 0) // adjust the file pointer
        {
            dwStart = dwStart + dwCount;
            if (!SetFilePointer()) // the server does not support the operation
            {
                // Clear the output file
                FileSeek(iFileHandle, 0, 0); // move the file pointer to the start
            }
        }
        ...
    }

Multi-thread download

To implement multi-thread download, the main problems are creating and managing the download threads and merging the downloaded parts of the file accurately. The download thread itself also needs some changes.

1. Modifying the download thread

To adapt the download thread to a multi-threaded program, I added the following member variables:

    int FIndex;      // thread index
    DWORD dwStart;   // position at which the download starts
    DWORD dwTotal;   // number of bytes this thread has to download
    DWORD FGetBytes; // total number of bytes downloaded

And the following properties:

    __property AnsiString URL = {read = FURL, write = FURL};
    __property AnsiString OutFileName = {read = FOutFileName, write = FOutFileName};
    __property bool Successed = {read = FSuccess};
    __property int Index = {read = FIndex, write = FIndex};
    __property DWORD StartPostion = {read = dwStart, write = dwStart};
    __property DWORD GetBytes = {read = dwTotal, write = dwTotal};
    __property TOnHttpCompelete OnComplete = {read = FOnComplete, write = FOnComplete};

At the same time, add the following processing to the download procedure DoHttpGet:

    void THttpGetThread::DoHttpGet(void)
    {
        ...
        try
        {
            ...
            while (true)
            {
                Application->ProcessMessages();
                // Recalculate the number of bytes still to be downloaded, so that dwRequest dwCount ...

I first created a component with TComponent as its base class, named THttpGetEx, and added the following member variables:

    // Internal variables
    THttpGetThread **HttpThreads; // the threads that have been created
    AnsiString *OutTmpFiles;      // the temporary result files
    bool *FSuccesss;              // the download result of each thread
    // Property variables
    int FHttpThreadCount;         // number of threads to use
    AnsiString FURL;
    AnsiString FOutFileName;

The purpose of each variable is given in the comments. FSuccesss plays a special role that is explained in detail below. Because a thread that has run cannot be run again, and the component may download different files one after another, the download threads can only be created dynamically and destroyed immediately after use. The module that creates the threads is shown below. The GetSystemTemp function returns the system's temporary folder, and OnThreadComplete is the event raised when a thread finishes downloading; its code is introduced later:

    // Allocate resources
    void THttpGetEx::AssignResource(void)
    {
        FSuccesss = new bool[FHttpThreadCount];
        for (int i = 0; i ...
        ...
            HttpThread->GetBytes = FileSize;
            HttpThread->Index = 0;
            HttpThread->OutFileName = OutTmpFiles[0];
        }
        else
        {
            HttpThread->OutFileName = OutTmpFiles[FHttpThreadCount - 1];
            HttpThread->Index = FHttpThreadCount - 1;
            // Breakpoint resume is supported: create multiple threads
            for (int i = 0; i ...
        ...

    void __fastcall THttpGetEx::DownLoadFile(void)
    {
        CreateHttpThreads();
        THttpGetThread *HttpThread;
        for (int i = 0; i ...

After a thread finishes downloading, it raises the OnThreadComplete event, and the handler for this event checks whether all download threads have finished. If they have, the parts of the file are merged.
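The merge step can be sketched in standard C++ as follows. This is not the author's component code, which is not shown in full; MergeParts is a hypothetical helper that assumes each thread wrote its block to its own temporary file and that the files are listed in block order.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Concatenate the part files, in the order given, into outFile.
// Returns false if any file cannot be opened or a write fails.
bool MergeParts(const std::vector<std::string>& partFiles, const std::string& outFile)
{
    std::ofstream out(outFile.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    for (std::size_t i = 0; i < partFiles.size(); ++i) {
        std::ifstream in(partFiles[i].c_str(), std::ios::binary);
        if (!in)
            return false;
        char buffer[4096];
        // A short final read fails but still reports its bytes via gcount().
        while (in.read(buffer, sizeof(buffer)) || in.gcount() > 0)
            out.write(buffer, in.gcount());
        if (!out)
            return false;
    }
    return true;
}
```

Merging must only start once every thread has reported success, which is why the completion handler tracks a per-thread result flag.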
Note that there is a thread synchronization problem here: if several threads raise this event at the same time, they will conflict with one another and the result will be a mess. There are many ways to synchronize; mine is to create a thread mutex object.

    const char *MutexToThread = "http-get-thread-mutex";

    void __fastcall THttpGetEx::OnThreadComplete(TObject *Sender, int Index)
    {
        // Create a mutex
        HANDLE hMutex = CreateMutex(NULL, FALSE, MutexToThread);
        DWORD Err = GetLastError();
        if (Err == ERROR_ALREADY_EXISTS) // it already exists, so wait
        {
            WaitForSingleObject(hMutex, INFINITE); // 8000L);
            hMutex = CreateMutex(NULL, FALSE, MutexToThread);
        }
        // When one thread ends, check whether all threads have finished
        FSuccesss[Index] = true;
        bool S = true;
        for (int i = 0; i ...

This completes the key parts of multi-thread download. In real applications, though, many other factors have to be considered, such as network speed and dropped connections. There are also further details that are hard to cover in the space available. I will be very pleased if readers find this article a useful reference, and I hope we can learn from each other and make progress together. For the complete example for this article (including the download component and a program that uses it), please go to the "Programmer" website: Http://www.xingzhou.com/myArticle/showArticle.asp?classid=1&page=1&sort=&s=1100