1, DOM tree
All XML parsers require the documents they process to be well-formed, and some can also validate them against a DTD or an XML Schema. A DOM (Document Object Model) parser reads an XML document and builds, in one pass, a tree of objects in memory that describes the document.
The DOM is a platform- and language-independent interface that allows programs and scripts to dynamically access and modify the content, structure, and style of documents. It defines a set of objects and methods for performing random-access operations on the DOM tree:
● Document object: As the topmost node of the tree, the Document object is the entry point to the entire document.
● Element and Attr objects: Each of these node objects maps to a part of the document, and the nesting of the nodes mirrors the structure of the document.
● Text object: As a child node of an Element or Attr object, the Text object holds the text content of the element or attribute. A Text node has no child nodes.
● Collection indexes: DOM provides several indexed collections that can be traversed in a specified manner. Index values start from 0.
All nodes in the DOM tree inherit from the Node object. The Node object defines the most basic properties and methods. The methods make it possible to traverse the tree, and the properties give the name, type, and value of each node.
With DOM, developers can dynamically create XML, traverse documents, and add, delete, or modify document content. The API provided by the DOM is independent of any programming language. However, for interfaces that the DOM standards do not define clearly, the implementation may differ from one parser to another. For convenience, the examples in this article use the MSXML DOM implementation and are written in VBScript.
2, Building the DOM tree
After the Document object is created, it can be associated with an XML document or a data island. To load a data island, assign the XMLDocument of the data island, referenced through the island's ID, to the Document variable:
Set doc = dsoDails.XMLDocument
Loading a document generally takes three steps:
1. Create a parser instance using the CreateObject method;
2. Set the async property to False to disable asynchronous loading, so that control returns to the caller only after the document has finished loading. If you need the loading status, read the readyState property;
3. Use the load method to load the specified document.
Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.load "Books.xml"
XML DOM also provides a loadXML method that loads an XML string into the DOM tree. The XML string is passed directly as the parameter of the method.
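A minimal sketch of loadXML, assuming the script runs under Windows Script Host (cscript.exe) with MSXML installed. The XML string and the variable names are illustrative only and are not taken from Books.xml:

```vbscript
Option Explicit
Dim doc, xmlText

' Hypothetical XML string used only for this sketch.
xmlText = "<booklist><book><title>Sample</title></book></booklist>"

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False

' loadXML returns True when the string was parsed successfully.
If doc.loadXML(xmlText) Then
    WScript.Echo "Root element: " & doc.documentElement.nodeName
Else
    WScript.Echo "Parse failed: " & doc.parseError.reason
End If
```

Because loadXML takes the markup itself rather than a file name, it is convenient for XML that a script has just built or received as text.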
3, DOM Tree Access
After the document is loaded, use the documentElement property to access the root element:
Set rootnode = doc.documentElement
Once a reference to a node (e.g., the root node) in the DOM tree is established, the tree can be traversed with the appropriate properties according to the hierarchical relationships between the nodes.
The following uses Books.xml as an example:
<?xml version="1.0"?>
book>
</book> </booklist> </xml>
Build a reference to the second <book> element:
Set thenode = dsoBooks.XMLDocument.documentElement.childNodes(1)
● Root node: thenode.ownerDocument returns the Document node, which represents the XML document itself;
● Sibling node: thenode.previousSibling returns the first <book> element;
● Parent node: thenode.parentNode returns the root element;
● Child node: thenode.firstChild returns the first child node of this <book> element.
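The navigation properties above can be tried out with a sketch like the following. It assumes Windows Script Host with MSXML, and it loads a small hypothetical book list from a string instead of the Books.xml data island, so the element names below are assumptions:

```vbscript
Option Explicit
Dim doc, root, thenode

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False

' Hypothetical stand-in for Books.xml: two <book> elements under one root.
doc.loadXML "<booklist>" & _
            "<book><title>First</title></book>" & _
            "<book><title>Second</title></book>" & _
            "</booklist>"

Set root = doc.documentElement
Set thenode = root.childNodes(1)   ' second <book>; indexes start at 0

WScript.Echo "Owner document: " & thenode.ownerDocument.nodeName
WScript.Echo "Previous sibling: " & thenode.previousSibling.xml
WScript.Echo "Parent: " & thenode.parentNode.nodeName
WScript.Echo "First child: " & thenode.firstChild.nodeName
```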
Once you have a reference to a node, you can read its information:
● Node type: thenode.nodeType, which is 1 in this example. The Document type is 9, the Element type is 1, and the Attr type is 2;
● Node name: thenode.nodeName, which is book in this example;
● Node value: thenode.nodeValue, which is Null in this example. For an Attr node it returns the attribute value, and for an Element node it returns Null.
MSXML adds some extra properties to Node objects:
● nodeTypeString: Returns the node type as a string. For example, thenode.nodeTypeString returns "element";
● text: Returns the text content of the current node and all of its descendants;
● xml: Returns the XML markup of the node and its descendants. On the Document node this is usually all content starting from the root element.
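As a sketch of reading these properties, assuming Windows Script Host with MSXML and a hypothetical one-book document (the sample markup and variable names are not from the article):

```vbscript
Option Explicit
Dim doc, thenode

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False

' Hypothetical sample document.
doc.loadXML "<booklist><book><title>Sample</title></book></booklist>"

Set thenode = doc.documentElement.firstChild   ' the <book> element

' Standard Node properties.
WScript.Echo "nodeType: " & thenode.nodeType
WScript.Echo "nodeName: " & thenode.nodeName
WScript.Echo "nodeValue is Null: " & IsNull(thenode.nodeValue)

' MSXML extensions.
WScript.Echo "nodeTypeString: " & thenode.nodeTypeString
WScript.Echo "text: " & thenode.text
WScript.Echo "xml: " & thenode.xml
```

IsNull is used for nodeValue because concatenating a Null directly would hide the fact that the element has no value of its own.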
4, XML format dynamic conversion
Having learned XSL, we can already use a style sheet to convert an XML document. But that process is static: the XSL file applied to the XML document is fixed when the code is written and cannot be changed while the program is running. With DOM, we can convert XML dynamically, that is, load an XSL style sheet at run time and use it to transform the XML document. Loading XSL into a DOM object is essentially the same as loading an XML document (XSL is itself an XML document):
Set stylesheet = CreateObject("Microsoft.XMLDOM")
stylesheet.async = False
stylesheet.load "TransformDetails.xsl"
DOM provides two methods for this conversion, and the object they are called on can be any node in the tree. This makes it possible to convert any portion of the DOM tree.
● transformNodeToObject method: This method takes two parameters. The first refers to the loaded XSL style sheet, and the second is the object that receives the converted XML data. E.g.:
Set targetnode = CreateObject("Microsoft.XMLDOM")
srcnode.transformNodeToObject stylesheet, targetnode
● transformNode method: This method takes only one parameter, the XSL style sheet, and returns the result as a string. The following example converts the source node into the string variable str:
str = srcnode.transformNode(stylesheet)
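The two methods can be compared side by side in a sketch like this one. It assumes Windows Script Host and MSXML 3.0 or later (for the XSLT 1.0 namespace), and it loads a hypothetical document and style sheet from strings rather than from Books.xml and TransformDetails.xsl:

```vbscript
Option Explicit
Dim srcDoc, stylesheet, targetDoc, resultText

Set srcDoc = CreateObject("Microsoft.XMLDOM")
srcDoc.async = False
' Hypothetical source document.
srcDoc.loadXML "<booklist><book><title>First</title></book>" & _
               "<book><title>Second</title></book></booklist>"

Set stylesheet = CreateObject("Microsoft.XMLDOM")
stylesheet.async = False
' Hypothetical style sheet that lists the titles.
stylesheet.loadXML _
    "<xsl:stylesheet version=""1.0""" & _
    " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
    "<xsl:template match=""/"">" & _
    "<ul><xsl:for-each select=""booklist/book"">" & _
    "<li><xsl:value-of select=""title""/></li>" & _
    "</xsl:for-each></ul>" & _
    "</xsl:template></xsl:stylesheet>"

' transformNode returns the result as a string.
resultText = srcDoc.transformNode(stylesheet)
WScript.Echo resultText

' transformNodeToObject writes the result into another DOM document.
Set targetDoc = CreateObject("Microsoft.XMLDOM")
srcDoc.transformNodeToObject stylesheet, targetDoc
WScript.Echo "Result root: " & targetDoc.documentElement.nodeName
```

To choose a style sheet at run time, a script would load a different file or string into the stylesheet object before calling either method.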
1. Errors during DOM parsing
DOM may report many kinds of errors when parsing an XML document. The properties of the parseError object give the likely cause and related information.
Commonly used properties and their meanings are shown in the following table:
Property     Description
errorCode    Error code
filepos      Absolute character position of the error in the document
line         Number of the line that contains the error
linepos      Character position of the error within that line
reason       Cause of the error
srcText      Source text of the line that contains the error
url          URL of the most recently loaded XML document that contains the error
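A short sketch of reading these properties, assuming Windows Script Host with MSXML. The malformed string is a deliberate, hypothetical example:

```vbscript
Option Explicit
Dim doc

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False

' Deliberately malformed: <book> is never closed.
doc.loadXML "<booklist><book></booklist>"

' errorCode is 0 when parsing succeeded.
If doc.parseError.errorCode <> 0 Then
    With doc.parseError
        WScript.Echo "Error code: " & .errorCode
        WScript.Echo "Line " & .line & ", position " & .linepos
        WScript.Echo "Reason: " & .reason
        WScript.Echo "Source: " & .srcText
    End With
Else
    WScript.Echo "Document parsed without errors."
End If
```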
2. Accessing elements and attributes in the DOM tree
DOM also provides several ways to find nodes. The search mechanisms are:
● Search for elements by tag name;
● Search for nodes using an XSL pattern;
● Search for nodes using a collection index.
Taking Books.xml as an example, the getElementsByTagName method of the Document object finds elements by the tag name given as its parameter and returns a NodeList object:
Set doc = dsoDails.XMLDocument
Set authors = doc.getElementsByTagName("author")
The query result contains all 4 author elements in the document. Calling getElementsByTagName on an Element object works the same way, except that the search is limited to the descendants of that element.
All types of nodes have a selectNodes method. Its only parameter is an XSL pattern, and it returns the collection of nodes that match the pattern. Calling this method lets you use XSL pattern matching to find nodes. E.g.:
Set rootnode = doc.documentElement
Set cheapbooks = rootnode.selectNodes("//book[price < 10]")
This example returns all <book> elements whose price is less than 10.
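The three search mechanisms can be sketched together as follows, assuming Windows Script Host with MSXML. The book data is hypothetical, and the pattern is the one used in the article:

```vbscript
Option Explicit
Dim doc, authors, cheapbooks, i

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False

' Hypothetical sample data.
doc.loadXML "<booklist>" & _
    "<book><author>A. Author</author><price>9.95</price></book>" & _
    "<book><author>B. Author</author><price>24.50</price></book>" & _
    "</booklist>"

' Search by tag name, then walk the result with a collection index.
Set authors = doc.getElementsByTagName("author")
For i = 0 To authors.length - 1
    WScript.Echo "Author " & i & ": " & authors.item(i).text
Next

' Search with a pattern.
Set cheapbooks = doc.documentElement.selectNodes("//book[price < 10]")
WScript.Echo "Books under 10: " & cheapbooks.length
```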
If you want to get the text content of an element, such as <price>9.95</price>, reading the nodeValue property of the Element object is incorrect: the result is Null instead of the expected 9.95. An element with text content contains a child node of the Text type, so the text can only be read through the nodeValue property of that Text object.
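The difference is easy to see in a small sketch, assuming Windows Script Host with MSXML and a hypothetical one-element document:

```vbscript
Option Explicit
Dim doc, price

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.loadXML "<book><price>9.95</price></book>"

Set price = doc.documentElement.firstChild   ' the <price> element

' The element itself has no value.
WScript.Echo "Element nodeValue is Null: " & IsNull(price.nodeValue)

' The text lives in the Text child node.
WScript.Echo "Text nodeValue: " & price.firstChild.nodeValue
```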
The steps to add an element are as follows:
● Create a Text node and assign it a value;
● Create an Element node;
● Attach the Text node to the Element node as its child node;
● Insert the Element node at the appropriate location in the XML document.
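The four steps above can be sketched as follows, assuming Windows Script Host with MSXML. The document content and the new <price> element are hypothetical:

```vbscript
Option Explicit
Dim doc, book, textNode, priceNode

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.loadXML "<booklist><book><title>Sample</title></book></booklist>"

Set book = doc.documentElement.firstChild

' 1. Create a Text node with its value.
Set textNode = doc.createTextNode("9.95")
' 2. Create an Element node.
Set priceNode = doc.createElement("price")
' 3. Attach the Text node to the Element node.
priceNode.appendChild textNode
' 4. Insert the Element node into the document, here as the last child of <book>.
book.appendChild priceNode

WScript.Echo doc.xml
```

appendChild places the new node last. The insertBefore method of the parent node can be used instead when the new element has to go in front of an existing child.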
To delete or replace an element node, first locate the target node, and then call the removeChild or replaceChild method of its parent node accordingly.
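A sketch of both operations, assuming Windows Script Host with MSXML. The sample document and the choice of nodes to replace and delete are hypothetical:

```vbscript
Option Explicit
Dim doc, book, oldTitle, newTitle, priceNode

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.loadXML "<booklist><book><title>Old</title><price>9.95</price></book></booklist>"

Set book = doc.documentElement.firstChild

' Replace: locate the old node, build the new one, call replaceChild on the parent.
Set oldTitle = book.selectSingleNode("title")
Set newTitle = doc.createElement("title")
newTitle.appendChild doc.createTextNode("New")
book.replaceChild newTitle, oldTitle

' Delete: locate the node, then call removeChild on its parent.
Set priceNode = book.selectSingleNode("price")
priceNode.parentNode.removeChild priceNode

WScript.Echo doc.xml
```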
Operations on Attr nodes follow, in principle, the same pattern as those on Element nodes. The Attr object also inherits the methods and properties of the Node object, and MSXML additionally provides the name and value properties, which give more direct access to the attribute information. You can also work with attributes through the attribute-related methods of the element, for example reading or modifying an attribute value with getAttribute and setAttribute, or obtaining the Attr object directly with the getAttributeNode method.
The most direct way to create a new attribute is to use the setAttribute method of the Element object. Alternatively, create the attribute with the createAttribute method of the Document object, set its value, and then add the new node to the DOM tree with the setAttributeNode method of the Element object. Similarly, the most direct way to delete an attribute is to call the removeAttribute method of the Element object. Another option is to locate the attribute first with the getAttributeNode method and then call removeAttributeNode.
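Both routes for creating, reading, and deleting attributes can be sketched as follows, assuming Windows Script Host with MSXML. The attribute names isbn and lang and their values are hypothetical:

```vbscript
Option Explicit
Dim doc, book, attr, found

Set doc = CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.loadXML "<book/>"
Set book = doc.documentElement

' Direct route: setAttribute and getAttribute on the element.
book.setAttribute "isbn", "0000000000"
WScript.Echo "isbn: " & book.getAttribute("isbn")

' Node route: createAttribute on the document, then setAttributeNode.
Set attr = doc.createAttribute("lang")
attr.value = "en"
book.setAttributeNode attr

' getAttributeNode returns the Attr object with its name and value.
Set found = book.getAttributeNode("lang")
WScript.Echo found.name & " = " & found.value

' Delete directly by name, or through the Attr node.
book.removeAttribute "isbn"
book.removeAttributeNode book.getAttributeNode("lang")

WScript.Echo book.xml
```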
As the above shows, developers can easily find a set of object operations that suits their needs.
3, DOM display function
DOM technology can also be used to display XML data. An XSL style sheet is fundamentally a means of transforming XML documents, and formatting for display is only one of its applications, so it has some shortcomings as a display mechanism:
● It is not easy to perform complex processing of XML data, such as converting all English letters to uppercase, truncating strings to a specified length, or ignoring specific punctuation;
● It is not easy to perform calculations on the values in XML data;
● One XSL style sheet usually acts on a single XML document and cannot combine the data of multiple XML documents into one output.
Using DOM solves these problems well, and the scripts can run on either the server side or the client side.
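As a rough sketch of how a script might address the three points above, assuming Windows Script Host with MSXML: the two XML strings stand in for two separate documents, and all names and values are hypothetical. Whole-number prices are used because VBScript numeric conversion follows the system locale.

```vbscript
Option Explicit
Dim sources, xmlText, doc, book, title, price, total

' Two hypothetical documents combined into one output.
sources = Array( _
    "<booklist><book><title>first title</title><price>10</price></book></booklist>", _
    "<booklist><book><title>second title</title><price>25</price></book></booklist>")

total = 0
For Each xmlText In sources
    Set doc = CreateObject("Microsoft.XMLDOM")
    doc.async = False
    doc.loadXML xmlText

    For Each book In doc.getElementsByTagName("book")
        title = book.selectSingleNode("title").text
        price = book.selectSingleNode("price").text

        ' String processing: truncate to 12 characters and convert to uppercase.
        WScript.Echo UCase(Left(title, 12)) & "  " & price

        ' Calculation on the values in the XML data.
        total = total + CLng(price)
    Next
Next

WScript.Echo "Total: " & total
```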