Getting Started with ASP "Thief" (Remote Data Acquisition) Programs

xiaoxiao  2021-03-06

The "thief" discussed here is a program that uses the XMLHTTP component of Microsoft's XML library (MSXML) from ASP to fetch data from a remote website (pictures, web pages, and other files), process it locally, and then either display it on a page or store it in a database. With such a program you can do things that look impossible at first: fetch another site's page and present it as your own, or save some of that site's data (articles, pictures) into a local database.

The advantages of a "thief" are: there is no content to maintain, because the data comes from another website and updates whenever that site updates; and it uses very few server resources, because a typical thief program is only a few files and all of the page content comes from the other site.

The disadvantages are: it is unstable, because if the target site has an error the thief program fails too, and if the target site is redesigned the thief program must be modified to match; and it is slow, because every request is a remote call, which is inevitably slower than reading data from the local server.

Sounds amazing, doesn't it? Let's start learning how to write one. We begin with a simple example, a thief for the weather forecast on the QQ website. The code is as follows:

<%
On Error Resume Next
Server.ScriptTimeout = 999999

Function GetHTTPPage(path)
    Dim t
    t = GetBody(path)
    GetHTTPPage = BytesToBstr(t, "GB2312")
End Function
' First come some initialization settings for the thief program. The code above,
' in order: ignores all non-fatal errors; sets a very long script timeout (so that
' no timeout error occurs); and converts the fetched data from the default UTF-8
' interpretation to GB2312, because otherwise a page fetched with the XMLHTTP
' component comes out garbled.
Function GetBody(url)
    On Error Resume Next
    Dim Retrieval
    Set Retrieval = CreateObject("Microsoft.XMLHTTP")
    With Retrieval
        .Open "GET", url, False, "", ""   ' synchronous request, no user name or password
        .Send
        GetBody = .ResponseBody           ' raw response bytes, not yet decoded
    End With
    Set Retrieval = Nothing
End Function
' Next, create an XMLHTTP object, configure the request, and fetch the response.

Function BytesToBstr(body, Cset)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1          ' adTypeBinary
    objstream.Mode = 3          ' adModeReadWrite
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2          ' adTypeText
    objstream.Charset = Cset
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function

Function Newstring(wstr, strng)
    Newstring = InStr(LCase(wstr), LCase(strng))
    If Newstring <= 0 Then Newstring = Len(wstr)
End Function
' Processing the fetched data requires the ADODB.Stream component, which is
' created and configured here.
%>

The following is the page display section:

<%
Dim wstr, str, url, start, over, city, body   ' Define the variables the program needs
city = Request.QueryString("id")              ' Assign the id parameter (the city the user selected) to city
url = "http://appnews.qq.com/cgi-bin/news_qq_search?city=" & city
' Set the address of the page to capture. You can also hard-code an address
' instead of using a variable.
wstr = GetHTTPPage(url)                       ' Fetch all the data
start = Newstring(wstr, "Head")
' Set the start marker of the data to process. Change this value to suit each
' case; work out what to use by viewing the source of the page being captured.
' Because this program needs the entire page, the markers are set to capture all
' of it. Note that the marker text must be unique in the page and must not repeat.
over = Newstring(wstr, "Tail")
' Like start, but marks the end of the data to process. Again, the marker text
' must be unique in the page.
body = Mid(wstr, start, over - start)         ' Set the range of the page to display
' Now for the sleight of hand: use Replace to swap out specific strings in the data.
body = Replace(body, "Skin1", "Weather Forecast - Sk Network")
body = Replace(body, "http://appnews.qq.com/cgi-bin/news_qq_search?city", "tianqi.asp?id")
' This completes the replacement work; add further Replace calls in the same way
' if you have other needs.
Response.Write body
' With the necessary content replaced, the modified content is written to the page.
%>

That is the end of the program. Usage and result: remove the explanatory text from the code above, save it as tianqi.asp, upload it to a host that supports ASP and XML, and open it in a browser. From there you can polish the interface or optimize the program further.
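The start/over step is the core of the program, so it may help to see it in isolation. The following is only a minimal sketch: ExtractBetween is a hypothetical helper that is not part of the original program, and the sample string and its comment markers are made up for illustration. Unlike Newstring, which falls back to the end of the page when a marker is missing, this version returns an empty string in that case.

```vbscript
<%
' Hypothetical helper (not in the original program).
' Returns the text between startMarker and endMarker, markers excluded,
' or an empty string when either marker cannot be found.
Function ExtractBetween(source, startMarker, endMarker)
    Dim startPos, endPos
    ExtractBetween = ""

    ' vbTextCompare makes the search case-insensitive,
    ' which is what the LCase calls in Newstring achieve.
    startPos = InStr(1, source, startMarker, vbTextCompare)
    If startPos = 0 Then Exit Function

    ' Skip past the start marker itself.
    startPos = startPos + Len(startMarker)

    ' Look for the end marker only after the start marker.
    endPos = InStr(startPos, source, endMarker, vbTextCompare)
    If endPos = 0 Then Exit Function

    ExtractBetween = Mid(source, startPos, endPos - startPos)
End Function

' Made-up sample data standing in for a fetched page.
Dim sample
sample = "<html><!-- begin -->Sunny, 25 C<!-- end --></html>"
Response.Write Server.HTMLEncode(ExtractBetween(sample, "<!-- begin -->", "<!-- end -->"))
%>
```

Run as an ASP page, this should write Sunny, 25 C. Searching for the end marker only after the start marker avoids a negative length in Mid when the end text also appears earlier in the page, which is one reason the markers need to be unique.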

When reposting, please credit the original address: https://www.9cbs.com/read-49689.html
