Java and XML joint programming DOM
Rick1126
Java (2001-10-16 12:58:32)
(Sail October 08, 2001 17:16)
A first look at DOM
DOM stands for Document Object Model. As mentioned earlier, XML organizes data as a tree, and DOM is an object-based description of that tree. Put simply, parsing an XML document builds a logical tree model of it, in which every node of the tree is an object. We access the contents of the XML document by accessing these objects.
Let's look at a simple example of how to work with an XML document through the DOM.
This is the XML document we will operate on:
<?xml version="1.0" encoding="UTF-8"?>
<messages>
  <message>...</message>
</messages>
Next, we need to parse the content of this document into Java objects that the program can use. With JAXP this takes only a few lines of code. First, we create a parser factory and use it to obtain a concrete parser object:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
We use DocumentBuilderFactory here so that the program does not depend on any specific parser. When the static newInstance() method of the DocumentBuilderFactory class is called, it decides which parser to use according to a system property. Because every parser follows the interfaces defined by JAXP, the code is the same whichever parser is used. Switching between parsers therefore only requires changing the value of the system property, without changing any code. This is the benefit of the factory pattern. The specific implementation of this factory pattern can be seen in the class diagram below.
DocumentBuilder db = dbf.newDocumentBuilder();
Once we have a factory object, calling its newDocumentBuilder() method returns a DocumentBuilder object, which represents a specific DOM parser. Which parser it actually is, Microsoft's or IBM's for example, does not matter to the program.
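The lookup described above can be observed directly. The following is a minimal sketch, not part of the original program: the class name ParserLookupDemo is made up for illustration, and the class name it prints depends on which parser is installed in your Java runtime. The standard JAXP property consulted by newInstance() is javax.xml.parsers.DocumentBuilderFactory.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical demo class: reports which concrete DOM parser JAXP selected.
class ParserLookupDemo {
    /** Returns the class name of the DocumentBuilder implementation in use. */
    static String builderClassName() throws ParserConfigurationException {
        // newInstance() checks the system property
        // "javax.xml.parsers.DocumentBuilderFactory" as part of its lookup.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // The factory instance hands out the parser object.
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Parser in use: " + builderClassName());
    }
}
```

Running it with a different value for that system property (for example via -D on the java command line) would select another parser without touching the code, provided that parser is on the classpath.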
Then we can use this parser to parse the XML document:
Document doc = db.parse("c:/xml/message.xml");
DocumentBuilder's parse() method accepts the location of an XML document as its input parameter and returns a Document object, which represents the tree model of the XML document. From here on, all operations on the XML document are independent of the parser and are carried out directly on this Document object. The specific methods for operating on the Document are defined by the DOM.
JAXP supports DOM Level 2 as recommended by the W3C. If you are familiar with the DOM, what follows is simple: just make calls according to the DOM specification. If you don't know the DOM yet, don't worry; a detailed introduction follows. What you need to know and keep in mind here is that the DOM is the model used to describe the data in an XML document, and the whole reason for introducing the DOM is to use this model to operate on that data. The DOM specification defines nodes (that is, objects), properties, and methods, and we access XML data by accessing these nodes. Starting from the Document object obtained above, we can begin our DOM journey. With the getElementsByTagName() method of the Document object we get a NodeList object. A Node object represents a tag element in the XML document, and a NodeList object, as its name suggests, represents a list of Node objects:
NodeList nl = doc.getElementsByTagName("message");
We have now obtained a list of all <message> elements in the document. The item() method of the NodeList then returns an individual Node from the list:
Node my_node = nl.item(0);
When a Node object is created, the data stored in the XML document is extracted and encapsulated in that Node. In this example, to extract the content inside the message tag, we use the getNodeValue() method of the Node object:
String message = my_node.getFirstChild().getNodeValue();
Note that getFirstChild() is also used here, to get the first child Node under message. Although the message tag contains nothing but text, with no child tags or attributes, we still have to call getFirstChild(). This follows from the W3C's definition of the DOM: the text inside a tag is itself defined as a Node, so we must first get the Node that represents the text, and only then can we call getNodeValue() to get the text content.
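Put together, the steps above might look like the following sketch. It is an illustration under stated assumptions: the class name FirstMessageDemo and the sample message text are made up, and the XML is parsed from an in-memory string rather than from c:/xml/message.xml so that the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class combining the steps walked through above.
class FirstMessageDemo {
    /** Parses the XML text and returns the text of the first message element. */
    static String firstMessage(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Parse from memory instead of a file so the sketch is self-contained.
        Document doc = db.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nl = doc.getElementsByTagName("message");
        Node myNode = nl.item(0);
        // The text inside the tag is a child Text node, hence getFirstChild().
        return myNode.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // Sample content invented for this sketch, not the article's file.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<messages><message>Hello DOM</message></messages>";
        System.out.println(firstMessage(xml));
    }
}
```

The sketch assumes the document contains at least one non-empty message element; otherwise item(0) or getFirstChild() would return null.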
Now that we can extract data from an XML file, we can use that data wherever it is needed to build an application.
The rest of this article looks at the DOM more closely and analyzes it in more detail, so that we can use it with greater ease.
DOM in detail
1. Basic DOM objects
There are five basic DOM objects: Document, Node, NodeList, Element, and Attr. Their functions and methods are introduced below.
The Document object represents the entire XML document. All other nodes are contained in the Document object in a certain order and arranged as a tree, and the programmer can reach all the content of the XML document by traversing this tree. It is also the starting point for operating on an XML document: we always obtain a Document object first by parsing the XML source file, and then carry out further operations. In addition, Document contains methods for creating other nodes, such as createAttribute() to create an Attr object. Its main methods are:
createAttribute(String): Creates an Attr object with the given attribute name, which can then be attached to an Element object with the setAttributeNode() method.
createElement(String): Creates an Element object with the given tag name, representing a tag in the XML document; attributes can then be added to this Element, or other operations performed on it.
createTextNode(String): Creates a Text object with the given string. A Text object represents the plain text contained in a tag or attribute. If a tag contains no other tags, the Text object representing the text inside the tag is the only child of that Element object.
getElementsByTagName(String): Returns a NodeList object containing all elements with the given tag name.
getDocumentElement(): Returns an Element object representing the root node of the DOM tree, that is, the object representing the root element of the XML document.
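The Document methods listed above can be tried out on an empty document. The sketch below is illustrative only: the class name, the lang attribute, and the text "Example" are assumptions, and it starts from DocumentBuilder.newDocument() (a standard JAXP call not discussed in this article) instead of parsing a file.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class exercising the Document creation methods.
class DocumentFactoryMethodsDemo {
    /** Builds <links><link lang="en">Example</link></links> in memory. */
    static Document build() throws Exception {
        // newDocument() yields an empty Document; the article obtains one by parsing.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("links");
        doc.appendChild(root);                    // becomes the document element

        Element link = doc.createElement("link");
        Attr lang = doc.createAttribute("lang");  // attribute name invented here
        lang.setValue("en");
        link.setAttributeNode(lang);              // attach the Attr to the Element
        link.appendChild(doc.createTextNode("Example"));
        root.appendChild(link);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("link").getLength());
    }
}
```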
The Node object is the most basic object in the DOM structure and represents an abstract node in the document tree. In practice, Node itself is rarely used directly; instead its sub-objects such as Element, Attr, and Text are used to operate on the document. Node provides an abstract, common root for these objects. Although the methods for accessing child nodes are defined in Node, some Node sub-objects, such as Text, have no child nodes, which is something to watch out for. The main methods of the Node object are:
appendChild(org.w3c.dom.Node): Adds a child node to this node, placing it after all existing child nodes. If the child is already in the tree, it is first removed and then added.
getFirstChild(): Returns the first child node, if the node has children. Correspondingly, getLastChild() returns the last child node.
getNextSibling(): Returns the next sibling node. Correspondingly, getPreviousSibling() returns the previous sibling node.
getNodeName(): Returns the name of the node, depending on the type of node.
getNodeType(): Returns the type of the node.
getNodeValue(): Returns the value of the node.
hasChildNodes(): Tells whether the node has any child nodes.
hasAttributes(): Tells whether the node has any attributes.
getOwnerDocument(): Returns the Document object that the node belongs to.
insertBefore(org.w3c.dom.Node newChild, org.w3c.dom.Node refChild): Inserts a child node before the given child node.
removeChild(org.w3c.dom.Node): Removes the given child node.
replaceChild(org.w3c.dom.Node newChild, org.w3c.dom.Node oldChild): Replaces the given child node with a new Node object.
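To make the editing methods concrete, here is a small sketch that applies them in sequence and reports the resulting order of children. The class name and the element names a, b, c, d are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class for the Node editing and traversal methods.
class NodeEditingDemo {
    /** Returns the names of parent's children, in order, joined by commas. */
    static String childNames(Node parent) {
        StringBuilder sb = new StringBuilder();
        // Walk the children with getFirstChild() / getNextSibling().
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.getNodeName());
        }
        return sb.toString();
    }

    static String demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        Element a = doc.createElement("a");
        Element b = doc.createElement("b");
        Element c = doc.createElement("c");
        root.appendChild(a);
        root.appendChild(b);                           // children: a,b
        root.insertBefore(c, a);                       // children: c,a,b
        root.appendChild(c);                           // c is moved to the end: a,b,c
        root.removeChild(b);                           // children: a,c
        root.replaceChild(doc.createElement("d"), a);  // children: d,c
        return childNames(root);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "d,c"
    }
}
```

The second appendChild(c) call shows the rule stated above: a node that is already in the tree is removed from its old position before being added again.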
A NodeList object, as its name suggests, represents a list of Node objects. You can think of it simply as an array of Nodes, whose elements are accessed with the following methods:
getLength(): Returns the length of the list.
item(int): Returns the Node object at the specified position.
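A typical way to use these two methods is an index loop. The sketch below collects the text of every element with a given tag name; the class name, method name, and sample input are assumptions made for the example, and it assumes each matching element contains text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: iterating a NodeList with getLength() and item().
class NodeListDemo {
    /** Returns the text content of each element named tag, in document order. */
    static List<String> texts(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList list = doc.getElementsByTagName(tag);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < list.getLength(); i++) {
            // Assumes the element is non-empty, so its first child is a Text node.
            out.add(list.item(i).getFirstChild().getNodeValue());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(texts("<m><x>1</x><x>2</x></m>", "x"));
    }
}
```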
The Element object represents a tag element in the XML document. It inherits from Node and is the most important sub-object of Node. Because a tag can carry attributes, Element has methods for accessing them, and every method defined in Node can also be used on an Element object.
getElementsByTagName(String): Returns a NodeList object containing the elements with the given tag name among the descendant nodes of this element.
getTagName(): Returns a string representing the name of this tag.
getAttribute(String): Returns the value of the attribute with the given name. Note that because XML documents allow entity references to appear in attribute values, this method is not suitable for such attributes. In that case, use getAttributeNode() to obtain an Attr object for further processing.
getAttributeNode(String): Returns an Attr object representing the attribute with the given name.
The Attr object represents an attribute of a tag. Attr inherits from Node, but because an Attr is actually contained in an Element, it is not regarded as a child of that Element, and so it is not part of the DOM tree. As a result, getParentNode(), getPreviousSibling(), and getNextSibling() all return null on an Attr. In other words, an Attr is treated as part of the Element object that contains it and does not appear as a separate node in the DOM tree. This distinguishes it from the other Node sub-objects in use.
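The special position of Attr can be checked with a few calls. This is a sketch under assumptions: the class name and the lang attribute are invented, and getOwnerElement() is the DOM Level 2 method on Attr that leads back to the containing Element.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: an Attr belongs to its Element but is not its child.
class AttrDemo {
    /** Builds a single link element carrying one attribute. */
    static Element sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element link = doc.createElement("link");
        doc.appendChild(link);
        link.setAttribute("lang", "en"); // attribute name invented for this sketch
        return link;
    }

    public static void main(String[] args) throws Exception {
        Element link = sample();
        Attr lang = link.getAttributeNode("lang");
        System.out.println(link.getAttribute("lang"));      // the value as a string
        System.out.println(lang.getParentNode());           // null: not in the tree
        System.out.println(lang.getOwnerElement() == link); // the containing Element
        System.out.println(link.hasChildNodes());           // attributes are not children
    }
}
```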
Note that the DOM objects described above are defined in the DOM specification as interfaces, written in IDL, which is independent of any specific programming language. The DOM can therefore be implemented in any object-oriented language, as long as the implementation provides the interfaces and functionality that the DOM defines. Also, some of the methods above are not defined as methods in the DOM itself but are expressed as IDL attributes; when mapped to a specific language, these attributes are mapped to corresponding methods.
2. DOM example
With the introduction above, you should now have a better understanding of the DOM. The following example will make you more familiar with it.
First, what this example does: we want to save some URLs in a links.xml file. A simple program reads these URLs through the DOM and displays them; conversely, a second program writes an added URL back into the XML file. The example is simple but practical, and it is enough to illustrate most uses of the DOM.
The XML file itself is not complicated, so its DTD is not given here. links.xml:
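<?xml version="1.0" standalone="yes"?>
<links>
  <link>
    <text>...</text>
    <url>...</url>
    <author>...</author>
    <date>
      <day>...</day>
      <month>...</month>
      <year>...</year>
    </date>
    <description>...</description>
  </link>
  <!-- two more link elements with the same structure -->
</links>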
The first program is called XMLDisplay.java; the full listing can be found in the attachment. Its main function is to read the content of each node in the XML file and print it, formatted, to System.out. Let's look at the program:
import javax.xml.parsers.*;
import org.w3c.dom.*;
These lines import the necessary classes. Because the program uses an XML parser through JAXP, it needs the javax.xml.parsers package, which contains the classes for obtaining DOM and SAX parsers. The org.w3c.dom package defines the DOM interfaces specified by the W3C.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("links.xml");
doc.normalize();
Besides the calls already discussed, there is one small trick here: calling normalize() on the Document object. It puts the text in the DOM tree into a normal form by merging adjacent Text nodes and removing empty ones, so the tree you get is closer to what you expect. This is especially useful when producing output. Note that whitespace used to format the XML document is still mapped to Text nodes in the DOM tree.
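The effect of normalize() on adjacent Text nodes can be seen in a short sketch. The class name and the sample strings are made up; the two Text nodes are created by hand here because a parser would not normally produce adjacent ones for such simple content.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: normalize() merges adjacent Text nodes.
class NormalizeDemo {
    /** Returns the child count of the element before and after normalize(). */
    static int[] countsBeforeAndAfter() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element msg = doc.createElement("message");
        doc.appendChild(msg);
        msg.appendChild(doc.createTextNode("Hello, "));
        msg.appendChild(doc.createTextNode("DOM"));
        int before = msg.getChildNodes().getLength(); // two adjacent Text nodes
        doc.normalize();                              // applies to the whole subtree
        int after = msg.getChildNodes().getLength();  // merged into a single Text node
        return new int[] { before, after };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countsBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // prints "2 -> 1"
    }
}
```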
NodeList links = doc.getElementsByTagName("link");
As just mentioned, whitespace characters in the XML document are also mapped to objects in the DOM tree. So calling Node's getChildNodes() method directly can cause problems: sometimes it does not return the NodeList you expect. The solution is to use Element's getElementsByTagName(String), which returns exactly the NodeList you are looking for. You can then extract the desired elements with the item() method.
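The difference between the two calls shows up as soon as the XML is indented. The sketch below is an illustration with invented names and sample data; it assumes the default, non-validating parser configuration, in which whitespace between tags is kept as Text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: getChildNodes() versus getElementsByTagName().
class WhitespaceNodesDemo {
    // Indented sample document invented for this sketch.
    static final String XML =
            "<links>\n  <link>a</link>\n  <link>b</link>\n</links>";

    /** Returns {count via getChildNodes(), count via getElementsByTagName()}. */
    static int[] counts() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        Element root = doc.getDocumentElement();
        // Includes the whitespace Text nodes between the link elements.
        int viaChildNodes = root.getChildNodes().getLength();
        // Contains only the link elements themselves.
        int viaTagName = root.getElementsByTagName("link").getLength();
        return new int[] { viaChildNodes, viaTagName };
    }

    public static void main(String[] args) throws Exception {
        int[] c = counts();
        System.out.println(c[0] + " child nodes, " + c[1] + " link elements");
    }
}
```

With this input, the child list holds three whitespace Text nodes in addition to the two link elements, which is why indexing into getChildNodes() can return a Text node where an Element was expected.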
for (int i = 0; i < links.getLength(); i++) {
    Element link = (Element) links.item(i);
    System.out.print("Content: ");
    System.out.println(link.getElementsByTagName("text").item(0).getFirstChild().getNodeValue());
    System.out.print("URL: ");
    System.out.println(link.getElementsByTagName("url").item(0).getFirstChild().getNodeValue());
    System.out.print("Author: ");
    System.out.println(link.getElementsByTagName("author").item(0).getFirstChild().getNodeValue());
    System.out.print("Date: ");
    Element linkdate = (Element) link.getElementsByTagName("date").item(0);
    String day = linkdate.getElementsByTagName("day").item(0).getFirstChild().getNodeValue();
    String month = linkdate.getElementsByTagName("month").item(0).getFirstChild().getNodeValue();
    String year = linkdate.getElementsByTagName("year").item(0).getFirstChild().getNodeValue();
    System.out.println(day + "-" + month + "-" + year);
    System.out.print("Description: ");
    System.out.println(link.getElementsByTagName("description").item(0).getFirstChild().getNodeValue());
    System.out.println();
}
The code above completes the formatted output of the XML document's content. As long as you pay attention to a few details, such as the use of getFirstChild() and getElementsByTagName(), this is fairly easy.
The next topic is how to write the DOM tree back to the XML document after modifying it. This program is called XMLWrite.java. JAXP 1.0 has no classes or methods that directly handle writing XML documents, so auxiliary classes from other packages are needed. JAXP 1.1 introduces support for XSLT, which transforms an XML document into a new document structure. With this new feature we can easily write a newly generated or modified DOM tree back to an XML file. Let's look at the code, whose main function is to add a new link node to the links.xml file:
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
Several classes from the newly introduced javax.xml.transform package are used to handle XSLT transformations.
We want to add a new link node to the XML file above, so we still have to read links.xml first, build a DOM tree, modify the tree (add the node), and finally write the modified DOM back to links.xml:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("links.xml");
doc.normalize();
// --- variables holding the content to add ---
String text = "hanzhong's homepage";
String url = "www.hzliu.com";
String author = "hzliu liu";
String description = "a site from hanzhong liu, give u ods of suprise !!!";
To keep the focus on the essentials and simplify the program, the content to be added is hard-coded in String objects. In practice a program would usually collect user input through an interface, or fetch the content from a database via JDBC.
Text textseg;
Element link = doc.createElement("link");
It should be made clear that whatever the type of Node, whether Text, Attr, or Element, it is created through a createXXX() method of the Document object (XXX stands for the specific type to create). So to add a link item to the XML document, we first create a link object and then fill in its children:
Element linktext = doc.createElement("text");
textseg = doc.createTextNode(text);
linktext.appendChild(textseg);
link.appendChild(linktext);
Element linkurl = doc.createElement("url");
textseg = doc.createTextNode(url);
linkurl.appendChild(textseg);
link.appendChild(linkurl);
Element linkauthor = doc.createElement("author");
textseg = doc.createTextNode(author);
linkauthor.appendChild(textseg);
link.appendChild(linkauthor);
java.util.Calendar rightNow = java.util.Calendar.getInstance();
String day = Integer.toString(rightNow.get(java.util.Calendar.DAY_OF_MONTH));
// Calendar.MONTH is zero-based, so add 1 to get the calendar month
String month = Integer.toString(rightNow.get(java.util.Calendar.MONTH) + 1);
String year = Integer.toString(rightNow.get(java.util.Calendar.YEAR));
Element linkdate = doc.createElement("date");
Element linkdateday = doc.createElement("day");
textseg = doc.createTextNode(day);
linkdateday.appendChild(textseg);
Element linkdatemonth = doc.createElement("month");
textseg = doc.createTextNode(month);
linkdatemonth.appendChild(textseg);
Element linkdateyear = doc.createElement("year");
textseg = doc.createTextNode(year);
linkdateyear.appendChild(textseg);
linkdate.appendChild(linkdateday);
linkdate.appendChild(linkdatemonth);
linkdate.appendChild(linkdateyear);
link.appendChild(linkdate);
Element linkdescription = doc.createElement("description");
textseg = doc.createTextNode(description);
linkdescription.appendChild(textseg);
link.appendChild(linkdescription);
The process of creating nodes is fairly repetitive, but one thing needs attention: the text contained in an Element is itself a Node in the DOM, so you must create a corresponding Text node for it. You cannot set this text directly with the Element object's setNodeValue() method. Instead, the text is set on the Text object (here through createTextNode(); calling setNodeValue() on the Text object also works), and the Text node is then appended to the Element, so that the Element and its text content both end up in the DOM tree. Looking back at the code above will make this clearer.
doc.getDocumentElement().appendChild(link);
Finally, don't forget to add the created node to the DOM tree. The getDocumentElement() method of the Document class returns the Element object that represents the root node of the document. In an XML document, the root node must be unique.
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new java.io.File("links.xml"));
transformer.transform(source, result);
Then XSLT is used to output the DOM tree. TransformerFactory also applies the factory pattern, making the code independent of the specific transformer implementation. It works the same way as DocumentBuilderFactory and is not described again here. The transform() method of the Transformer class accepts two parameters: a data source (Source) and an output target (Result). DOMSource and StreamResult are used here, so the content of the DOM can be written to an output stream; when that output stream is a file, the content of the DOM is written to the file.
That is all on the DOM for now; the next article will introduce SAX. Please see: Java and XML joint programming SAX
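For readers who want to try the write-back step without touching a file, the following is a condensed sketch of the same idea. It is not the XMLWrite.java listing: the class name DomWriteDemo, the child() helper, and the sample values are made up, the result goes to a StringWriter instead of links.xml, and the XML declaration is suppressed through OutputKeys.OMIT_XML_DECLARATION only to keep the output short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: modify a DOM tree and serialize it again.
class DomWriteDemo {
    /** Hypothetical helper: creates <tag>text</tag> as an Element with a Text child. */
    static Element child(Document doc, String tag, String text) {
        Element e = doc.createElement(tag);
        e.appendChild(doc.createTextNode(text));
        return e;
    }

    /** Parses xml, appends one link element to the root, returns the new XML text. */
    static String addLink(String xml, String text, String url) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element link = doc.createElement("link");
        link.appendChild(child(doc, "text", text));
        link.appendChild(child(doc, "url", url));
        doc.getDocumentElement().appendChild(link);

        // A Transformer created without a stylesheet copies the source to the result.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addLink("<links/>", "Example", "http://example.com"));
    }
}
```

Replacing the StringWriter with new StreamResult(new java.io.File("links.xml")) gives the file-based variant described in the article.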