Create Your Own Browser with Socket and the MSHTML Object Model
Xue Wei, Central University of Education, 01-4-20 11:15:22
The HTTP protocol and the web browser have made our networks far more exciting. In practice, however, simply using a browser is not always enough; we may, for example, want web browsing built into our own application. Microsoft's CHtmlView class makes web browsing very easy to implement, but it has a frustrating limitation: on its own it gives no way to dynamically modify the elements of a web page. This article explores using a socket to transfer HTML documents and Microsoft's dynamic MSHTML object model to implement some of a browser's internal mechanisms.
As everyone knows, an HTML document is written in a markup language, that is, with what are commonly called tags. Microsoft's browser, IE, implements a corresponding object model for these tags, encapsulated in mshtml.dll, and IE itself is built on mshtml.dll. With mshtml.dll we can work directly with the properties and methods of the object model. The MSHTML object model is based on COM. Its objects' interfaces derive from IDispatch, so the MSHTML object model is operated through the IDispatch interface. MSHTML encapsulates many interfaces: for example, the IHTMLAnchorElement interface corresponds to the <a> tag, the IHTMLElement interface corresponds to an element of the HTML document, and the IHTMLTable interface corresponds to the <table> tag. The most important is the IHTMLDocument2 interface, which corresponds to the document object. The document object represents the HTML document itself and will be familiar to anyone who has used JavaScript.
Below we illustrate the use of MSHTML with an example. Before the example, a few words about sockets and the HTTP protocol. HTTP connects the client and the server over TCP, by default on port 80, and the two communicate through a request/response mechanism. HTTP messages are accordingly divided into requests and responses. Each message consists of a start line, message headers, and a message body, in the following form:
generic-message = start-line
                  *message-header
                  CRLF
                  [ message-body ]
The start-line is defined as follows:
start-line = Request-Line | Status-Line
The Request-Line is the line the client sends to the server to make a request, as follows:
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Method includes GET, POST, and so on. In this example we only use GET to send a request to the server.
For details of the HTTP protocol, see RFC 2068.
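The two kinds of start line described above can be sketched in portable C++ as follows. This is only an illustration, not the article's code: the names BuildGetRequest, ParseStatusLine, StatusLine, host, and path are hypothetical. The sketch also assumes a Host header and a "Connection: close" header are sent, since RFC 2068 requires Host in every HTTP/1.1 request.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds a minimal GET request made of a
// Request-Line, two headers, and the blank line that ends the headers.
std::string BuildGetRequest(const std::string& host, const std::string& path)
{
    std::string request;
    request += "GET " + path + " HTTP/1.1\r\n"; // Request-Line: Method SP Request-URI SP HTTP-Version CRLF
    request += "Host: " + host + "\r\n";        // assumption: Host header added, as HTTP/1.1 requires
    request += "Connection: close\r\n";         // assumption: ask the server to close after responding
    request += "\r\n";                          // empty line terminates the header block
    return request;
}

// Hypothetical holder for the fields of a Status-Line
// such as "HTTP/1.1 200 OK".
struct StatusLine {
    std::string version;
    int code = 0;
    std::string reason;
};

// Parses one Status-Line. Returns false if the line does not start
// with an HTTP version followed by a numeric status code.
bool ParseStatusLine(const std::string& line, StatusLine& out)
{
    std::istringstream in(line);
    if (!(in >> out.version >> out.code))
        return false;
    if (out.version.compare(0, 5, "HTTP/") != 0)
        return false;

    // The rest of the line is the reason phrase; it may be empty.
    std::getline(in, out.reason);
    out.reason.erase(0, out.reason.find_first_not_of(' '));
    while (!out.reason.empty() &&
           (out.reason.back() == '\r' || out.reason.back() == '\n'))
        out.reason.pop_back();
    return true;
}
```

For example, BuildGetRequest("www.163.net", "/") yields a request whose first line is "GET / HTTP/1.1", and ParseStatusLine applied to the first line of the reply gives the numeric status code that tells the client whether the page was returned.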
Create a new single-document (SDI) EXE project in VC++. To work with MSHTML, derive the view class from CHtmlView, which encapsulates the MSHTML interface. Then add a socket class to the project:
class CHttpSocket : public CSocket
{
    ..................
protected:
    CWnd* m_pParentWnd;   // the view that receives our notifications
};
Here m_pParentWnd points to our view class and is used for passing messages to it. Define the socket in the view class:
class CSkhttpView : public CHtmlView
{
protected: // create from serialization only
    CSkhttpView();
    DECLARE_DYNCREATE(CSkhttpView)
    ..............
protected:
    CHttpSocket m_socket;      // socket used to talk to the web server
    IHTMLDocument2* phmdoc2;   // document interface obtained from the view
    ........
};
phmdoc2 is a pointer to the IHTMLDocument2 interface. Initialize the socket and connect to the site we want to visit, here assumed to be www.163.net, in the view's constructor CSkhttpView::CSkhttpView().
Next we obtain the IHTMLDocument2 interface. There are generally two ways to do so. The first is to call CoCreateInstance and then QueryInterface. The second is to call get_Document on the browser control object, a call that the CHtmlView class already wraps. We use the latter. Note that the interface can only be obtained after CHtmlView has created its document object.
    wsprintf(buf, "GET http://www.163.net HTTP/1.1\r\n\r\n");
    int iRet = m_socket.Send(buf, lstrlen(buf), 0);
    if (iRet == SOCKET_ERROR)
        MessageBox("Socket send error", NULL, MB_OK);
}
We send the server the HTTP request "GET http://www.163.net HTTP/1.1\r\n\r\n", which retrieves the web page at the specified URI, and we receive the server's response through the socket. IHTMLDocument2 exposes many object interfaces; through its many put_ and get_ methods we can reach these objects and their events and methods. In this example we obtain the body object, which corresponds to the <body> element in the HTML text. IHTMLDocument2's get_body() returns the IHTMLElement interface of the body object; we then call put_innerHTML() on that interface to place the received HTML in the document, and the web page is displayed in our view.
This article is only a brief introduction to the MSHTML objects. MSHTML actually contains many more interfaces and functions, and through them we can design a browser in our own style. Interested readers can consult the MSDN documentation.
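The data received from the socket is a complete HTTP message, so the status line and headers come before the HTML itself. The article does not show how the HTML is separated out before it is handed to put_innerHTML(); the following portable C++ sketch is one possible way, under the assumption that the whole response has been collected into one string and is not sent with chunked transfer encoding. The names HttpResponse and SplitHttpResponse are hypothetical.

```cpp
#include <string>

// Hypothetical container for the three parts of a response message.
struct HttpResponse {
    std::string statusLine; // e.g. "HTTP/1.1 200 OK"
    std::string headers;    // header lines, without the status line
    std::string body;       // the HTML text that follows the blank line
};

// Splits a raw response at the first empty line (CRLF CRLF).
// Returns false if the end of the headers has not been received yet,
// in which case the caller should keep reading from the socket.
bool SplitHttpResponse(const std::string& raw, HttpResponse& out)
{
    const std::string separator = "\r\n\r\n";
    const std::string::size_type headerEnd = raw.find(separator);
    if (headerEnd == std::string::npos)
        return false;

    // Everything before the blank line: status line plus header lines.
    const std::string head = raw.substr(0, headerEnd);
    const std::string::size_type lineEnd = head.find("\r\n");
    if (lineEnd == std::string::npos) {
        out.statusLine = head;      // response carries no header lines
        out.headers.clear();
    } else {
        out.statusLine = head.substr(0, lineEnd);
        out.headers = head.substr(lineEnd + 2);
    }

    // Everything after the blank line is the message body.
    out.body = raw.substr(headerEnd + separator.size());
    return true;
}
```

In the view class, the body string produced this way would be the text converted to a BSTR and passed to put_innerHTML(); returning false for an incomplete message lets the receive handler simply append the next chunk and try again.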