Thief programs: principle and a simple example

xiaoxiao  2021-03-06

First of all, thank you for supporting the small "Beautiful Home" plug-ins. They are not my own work (I am a beginner myself), so my thanks go to the original author for sharing them with everyone. In return, here is a reposted tip that I hope will be useful.

[Repost] "Thief" programs are popular these days: news thieves, music thieves, download thieves. How do they work? Here is a brief introduction that I hope will help fellow webmasters.

(1) Principle

A thief program uses the XMLHTTP component from Microsoft's XML library to request pages from other websites. Many news thieves, for example, fetch Sina's news pages, replace parts of the HTML, and filter out the advertisements.

Advantages: the site needs no maintenance, because its data comes from other websites and is updated whenever those sites are updated; and it saves server resources, because all of the page content comes from other websites.

Disadvantages: it is unstable, because if the target website fails the thief program fails too, and if the target website is redesigned the thief program has to be modified to match; and it is slow, because a remote call is always slower than reading data on the local server.

(2) Using XMLHTTP in ASP

<%
' Common functions

' 1. Takes the URL of the target page and returns its HTML source.
Function getHTTPPage(url)
    Dim http
    ' Needed so that the Err check at the end of the function can be reached.
    On Error Resume Next
    Set http = Server.CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False
    http.Send
    If http.readyState <> 4 Then
        Exit Function
    End If
    getHTTPPage = BytesToBstr(http.responseBody, "GB2312")
    Set http = Nothing
    If Err.Number <> 0 Then Err.Clear
End Function

' 2. Character-set conversion. A page containing Chinese characters comes
'    back garbled if it is used directly, so the raw bytes are converted
'    with the ADODB.Stream component.
Function BytesToBstr(body, Cset)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1          ' binary
    objstream.Mode = 3          ' read/write
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2          ' text
    objstream.Charset = Cset
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function

' Try fetching the HTML content of http://www.3doing.com/earticle/
Dim url, html
url = "http://www.3doing.com/earticle/"
html = getHTTPPage(url)
Response.Write html
%>

Introductory ASP thief tutorial, with QQ weather thief source code

Writing a good thief is a little tricky :P, but with flexible use of the XMLHTTP component you can write one too. Here, "thief" refers to a class of ASP programs that use the XMLHTTP component to grab data from remote websites (images, web pages, and other files), process it in various ways, and then display it on a page or store it in a database. With such a program you can do things that seem impossible at first: take a page from another site and present it, transformed, as your own, or save data from other sites (articles, images) into your local database.

The advantages of a "thief" are that the site needs no maintenance, because its data comes from other websites and is updated along with them, and that it saves a lot of server resources, because a typical thief program is only a few files and all of the page content comes from other websites.

The disadvantages are that it is unstable, because if the target site fails the program fails too, and if the target site is redesigned the thief program has to be modified to match; and that it is slow, because a remote call is always slower than reading data on the local server.

Sounds amazing, doesn't it? Let's start learning how to write one. We begin with something simple, a weather forecast program based on the QQ website. Demo: http://www.colasky.com/weather.asp
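The instability described above (a failing target site breaks the thief program) is easier to live with if every fetch is treated as something that may fail. The following is a minimal sketch of that idea, written in Python rather than ASP so it can be run anywhere. The name `fetch_page`, the default charset, and the timeout value are assumptions made for illustration and do not come from the original listings.

```python
import urllib.request
from typing import Optional


def fetch_page(url: str, charset: str = "gb2312", timeout: float = 10.0) -> Optional[str]:
    """Fetch `url` and decode the body with `charset`.

    Returns None instead of raising when the target cannot be reached,
    so the calling page can show a fallback message.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return None
    # Undecodable bytes become U+FFFD rather than aborting the request.
    return body.decode(charset, errors="replace")
```

Under these assumptions, `fetch_page("http://www.3doing.com/earticle/")` would return either the decoded page or `None`, and the caller decides what to display in the second case.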

Source code download:

http://www.colasky.com/weather.rar
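Both ASP listings in this article pass the raw response through BytesToBstr with "GB2312". The reason can be shown without any network access. The sketch below is in Python, not ASP, and the sample text and the helper name `bytes_to_text` are made up for the example.

```python
def bytes_to_text(body: bytes, charset: str) -> str:
    """Decode raw response bytes using an explicit character set."""
    return body.decode(charset)


# Simulate the body of a GB2312-encoded page containing Chinese text.
raw = "天气预报".encode("gb2312")

# Decoding with the page's real charset restores the text.
text = bytes_to_text(raw, "gb2312")
print(text)

# Decoding the same bytes as UTF-8 does not restore it: the invalid byte
# sequences are replaced with U+FFFD, which is the "garbled" output.
garbled = raw.decode("utf-8", errors="replace")
print(garbled != text)
```

The same rule applies whichever component does the decoding: the charset passed in has to match the encoding the target site actually uses.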

The code is as follows:

<%
' Initialization: ignore all non-fatal errors and set a very long script
' timeout so that the thief program does not stop with a timeout error.
On Error Resume Next
Server.ScriptTimeout = 999999

' Fetch a page and decode the response bytes as GB2312 instead of the
' default UTF-8. Without this step, pages containing Chinese characters
' fetched through XMLHTTP come out garbled.
Function getHTTPPage(path)
    Dim t
    t = GetBody(path)
    getHTTPPage = BytesToBstr(t, "gb2312")
End Function

' Create the XMLHTTP object, send the request, and return the raw bytes.
Function GetBody(url)
    On Error Resume Next
    Dim Retrieval
    Set Retrieval = CreateObject("Microsoft.XMLHTTP")
    With Retrieval
        .Open "Get", url, False, "", ""
        .Send
        GetBody = .ResponseBody
    End With
    Set Retrieval = Nothing
End Function

' Process the fetched data: convert the bytes to text with the
' ADODB.Stream component.
Function BytesToBstr(body, Cset)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1
    objstream.Mode = 3
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2
    objstream.Charset = Cset
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function

' Case-insensitive search: returns the position of strng in wstr,
' or Len(wstr) when strng is not found.
Function Newstring(wstr, strng)
    Newstring = InStr(LCase(wstr), LCase(strng))
    If Newstring <= 0 Then Newstring = Len(wstr)
End Function
%>

The following is the page display section:

<%
Dim wstr, str, url, start, over, city    ' variables used below
city = Request.QueryString("id")         ' the id passed in (the city chosen by the user) is assigned to city
' The address of the page to capture. You can also hard-code an address
' instead of using the variable.
url = "http://appnews.qq.com/cgi-bin/news_qq_search?city=" & city
wstr = getHTTPPage(url)                  ' fetch all of the data
' Set the marker for the head of the data to be processed. Its value
' depends on the situation and can be determined by viewing the source of
' the page being captured. Because this program grabs the entire page, it
' is set to capture everything. Note that the marker text must be unique
' in the page and must not be repeated.
start = Newstring(wstr, "")
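The head-marker idea can be tried in isolation. The ASP listing declares an `over` variable, but the part that uses it is not shown. The Python sketch below assumes `over` marks the end of the wanted region. `find_marker` mirrors Newstring using 0-based positions, while `extract_between` and the sample page are hypothetical and are not taken from the weather program.

```python
def find_marker(text: str, marker: str) -> int:
    """Case-insensitive search, like Newstring in the ASP listing.

    Returns the 0-based index of the first match, or len(text) when the
    marker is absent.
    """
    pos = text.lower().find(marker.lower())
    return pos if pos >= 0 else len(text)


def extract_between(page: str, start_marker: str, end_marker: str) -> str:
    """Return the slice from the start marker up to (not including) the end marker."""
    start = find_marker(page, start_marker)
    # Look for the end marker only after the start position (an assumption).
    over = start + find_marker(page[start:], end_marker)
    return page[start:over]


# Hypothetical page source; a real one would come from the fetch step.
page = "<HTML><body><div id='Weather'>Beijing: sunny, 25C</div><p>ads</p></body></html>"
fragment = extract_between(page, "<div id='weather'>", "</div>")
print(fragment)
```

Because only the first match is used, a marker that occurs more than once would cut the page at the wrong place, which is why the marker text has to be unique.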

When reposting, please credit the original address: https://www.9cbs.com/read-121580.html
