A class that parses URLs and URL titles from web page tags (reposted)

xiaoxiao2021-03-06  106

A class that parses URLs and URL titles from web page tags

Author: Kiki De brother cat

Download the source code

First, some things I have to say

I know that you can call get_links on mshtml's IHTMLDocument2 to obtain an IHTMLElementCollection interface, get each IHTMLAnchorElement from that IHTMLElementCollection, and read every link on the page through get_href of the IHTMLAnchorElement interface. But that approach depends on MSHTML, and personally I don't like using things whose insides I can't see (even though Microsoft is better at this than I am). So I wrapped up a class that parses the tag characters of the web page directly. I know it has shortcomings, which is why I am publishing it: I hope someone can build a better new version on top of it.

Second, about this class

Some people may say: to parse the URLs of a page, isn't it enough to look for href=...? What is hard about that? Many things are easy to say, but you only find out how hard they are when you actually try. For example, some links take the form URL=...; some links sit inside JavaScript (my handling of the JavaScript part is still a problem); and so on. I have handled as many of these cases as I could in this class, but my ability is limited, so it is far from perfect.
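To make the href=... and URL=... cases concrete, here is a minimal sketch of the scanning idea. It is not the author's code: it uses std::string instead of CString so it compiles without MFC, and ExtractValues and ToLower are hypothetical helper names. It deliberately ignores harder cases such as spaces around the '=' sign or links built inside JavaScript.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Lower-case copy of the page so "HREF", "Href" and "href" all match.
// Only ASCII letters change, so byte offsets stay aligned with the original.
static std::string ToLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper (not part of the original class): collects the value
// that follows every occurrence of `key`, e.g. "href=" or "url=".
std::vector<std::string> ExtractValues(const std::string& html, const std::string& key) {
    std::vector<std::string> out;
    const std::string lower = ToLower(html);
    std::size_t pos = 0;
    while ((pos = lower.find(key, pos)) != std::string::npos) {
        std::size_t start = pos + key.size();
        pos = start;  // continue the next search after this key
        if (start >= html.size()) break;
        std::size_t end;
        if (html[start] == '"' || html[start] == '\'') {
            // Quoted value: read up to the matching quote.
            const char quote = html[start];
            ++start;
            end = html.find(quote, start);
        } else {
            // Unquoted value: read up to whitespace, a quote or '>'.
            end = html.find_first_of(" \t\r\n\"'>", start);
        }
        if (end == std::string::npos) end = html.size();
        if (end > start) out.push_back(html.substr(start, end - start));
    }
    return out;
}
```

Calling ExtractValues(page, "href=") collects ordinary anchors, while ExtractValues(page, "url=") picks up the URL=... form that appears in refresh-style jump tags. Anything that does not follow one of these two patterns is missed, which is exactly why the real problem is harder than it first sounds.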

Third, the interface of this class

The only interface function of this class is its constructor. Here is its declaration:

CWebHost(const CString &m_str_webcode,   /* web page code */
         vector<HyperLink> &m_vec_url,   /* vector of parsed URL and URL-title structures */
         CString &str_url);              /* URL */

HyperLink is a structure I define in the URLSTURCT_.H file:

// URLSTURCT_.H
// Hyperlink data structure
#ifndef _____hyperlinktag_h____
#define _____hyperlinktag_h____

// One hyperlink record
typedef struct taghyperlinktag {
    // link address
    CString str_hyperlink;
    // link text
    CString str_hyperlinktext;
} HyperLink;

#endif
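As a rough illustration of how a vector of such records might be filled, the sketch below pairs each link address with its link text. It is an assumption-laden stand-in, not the class itself: HyperLinkSketch mirrors the structure above with std::string in place of CString, and CollectAnchors and StripTags are hypothetical names. It only recognizes lowercase, double-quoted `<a href="...">` tags.

```cpp
#include <string>
#include <vector>

// Stand-in for the article's HyperLink, with std::string replacing MFC's
// CString so the sketch compiles without MFC (an assumption of this example).
struct HyperLinkSketch {
    std::string str_hyperlink;      // link address
    std::string str_hyperlinktext;  // link text
};

// Drops everything between '<' and '>' so nested tags such as <b> do not
// end up inside the link text.
static std::string StripTags(const std::string& s) {
    std::string out;
    bool in_tag = false;
    for (char c : s) {
        if (c == '<') in_tag = true;
        else if (c == '>') in_tag = false;
        else if (!in_tag) out += c;
    }
    return out;
}

// Hypothetical helper: walks <a href="..."> ... </a> pairs and records
// the address together with the visible text.
std::vector<HyperLinkSketch> CollectAnchors(const std::string& html) {
    std::vector<HyperLinkSketch> links;
    const std::string key = "href=\"";
    std::size_t pos = 0;
    while ((pos = html.find("<a ", pos)) != std::string::npos) {
        const std::size_t tag_end = html.find('>', pos);
        if (tag_end == std::string::npos) break;
        const std::size_t close = html.find("</a>", tag_end);
        if (close == std::string::npos) break;
        const std::string tag = html.substr(pos, tag_end - pos);
        const std::size_t h = tag.find(key);
        if (h != std::string::npos) {
            const std::size_t v = h + key.size();
            const std::size_t q = tag.find('"', v);
            if (q != std::string::npos) {
                links.push_back({tag.substr(v, q - v),
                                 StripTags(html.substr(tag_end + 1, close - tag_end - 1))});
            }
        }
        pos = close + 4;  // skip past "</a>"
    }
    return links;
}
```

Keeping the address and the text in one record means a caller can show or filter links by their title without parsing the page a second time.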

Fourth, the list of functions of this class

Function name                         Function
CWebHost(...);                        constructor
void OnRetrunWebContent(...);         return all links
void OnGetHtmlURL(...);               get HTML URLs
void OnRetJumpURL(...);               get jump URLs
void OnReturnFrameURL(...);           get URLs from nested (frame) code
CString OnConversionURL(...);         convert a URL to an absolute address
void OnLineseJavaScript(...);         return URLs found in JavaScript code
CString OnGetLinkText(...);           get the link text of a URL
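One of the steps listed above converts a URL to an absolute address. The source does not show how the class does this, so the following is only a sketch of the general idea under simplifying assumptions: ToAbsoluteUrl is a hypothetical name, std::string replaces CString, and only a few common cases are handled (already absolute, protocol-relative, root-relative, and paths with "./" or "../"). It is not the full resolution algorithm from the URI standard.

```cpp
#include <string>

// Hypothetical helper: resolves `link` against the page address `base`.
// Handles a handful of common cases only; query strings, fragments and
// schemes such as "mailto:" are not treated specially.
std::string ToAbsoluteUrl(const std::string& base, const std::string& link) {
    // Already absolute, e.g. "http://host/path".
    if (link.find("://") != std::string::npos) return link;

    const std::size_t scheme_end = base.find("://");
    if (scheme_end == std::string::npos) return link;  // unusable base, give up

    // Protocol-relative: "//host/path" reuses the scheme of the base.
    if (link.compare(0, 2, "//") == 0) return base.substr(0, scheme_end + 1) + link;

    // "scheme://host" part of the base.
    const std::size_t host_end = base.find('/', scheme_end + 3);
    const std::string root =
        (host_end == std::string::npos) ? base : base.substr(0, host_end);

    // Root-relative: "/path".
    if (!link.empty() && link[0] == '/') return root + link;

    // Directory of the base page, always ending in '/'.
    std::string dir = root + "/";
    if (host_end != std::string::npos) dir = base.substr(0, base.rfind('/') + 1);

    // Consume leading "./" and "../" segments.
    std::string rest = link;
    while (true) {
        if (rest.compare(0, 2, "./") == 0) {
            rest.erase(0, 2);
        } else if (rest.compare(0, 3, "../") == 0) {
            rest.erase(0, 3);
            // Go up one directory, but never above the host root.
            if (dir.size() > root.size() + 1) {
                dir.erase(dir.rfind('/', dir.size() - 2) + 1);
            }
        } else {
            break;
        }
    }
    return dir + rest;
}
```

For a page at http://example.com/a/b/page.html, the link "../c.html" resolves to http://example.com/a/c.html, and extra "../" segments stop at the host root instead of escaping it.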

Fifth, the processing flow of this class

Sixth, detailed code

There is too much code to fit in the space here, so please download the source code to read it.

Seventh, a few more things I want to say

The parsing in this class has defects. I hope everyone will point out problems, or simply write a new class from scratch. My writing has never been good, so please forgive anything that is hard to understand. Compiled with VC7.1 on Windows Server 2003.

Please credit the original address when reposting: https://www.9cbs.com/read-126514.html
