1. Java XML programming overview
This article illustrates the relationship between SAX, DOM, JDOM, and JAXP through a series of examples. SAX and DOM are the two basic parser interfaces. JDOM unifies them at the model level: whether the input is processed through SAX or through DOM, the result is a JDOM Document. JAXP unifies them at the API level: much like JDBC, it hides which SAX or DOM parser is used underneath and exposes a single programming interface.
2. XML file
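1.xml (shown below)
<?xml version="1.0" encoding="GB2312"?>
<statistic>
    <header>
        <column>...</column>
        <column>...</column>
        <column>...</column>
    </header>
    <data>
        <x>
            <y>...</y>
            <y>...</y>
        </x>
        <x>
            <y>...</y>
            <y>...</y>
        </x>
    </data>
</statistic>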
3. SAX 2.0
a) Model:
[Figure: an XMLReader sends events to the ContentHandler, DTDHandler, and ErrorHandler registered on it.]
b) Model Description:
The SAX implementation model uses the Builder design pattern. The XMLReader reads and recognizes the XML file, generates an event notification for each construct it recognizes, and sends these events to the handlers registered on it. The advantage is that the system needs only a single XMLReader to parse the XML file, while what is done with the parsed content is decided by the handlers that the user implements.
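As a rough sketch of this event flow, the following self-contained example parses a small inline document and records the events that reach the handler. The class name SaxEventsSketch, the events() helper, and the sample XML string are assumptions made for illustration; the reader is obtained through the JDK's SAXParserFactory so that no external parser is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original article.
final class SaxEventsSketch {

    /** Parses the given XML text and returns the events seen by the handler, in order. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        // The handler only records what the XMLReader reports to it.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler); // register the handler on the reader
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<statistic><header/><data/></statistic>"));
    }
}
```

The reader knows nothing about what the handler does with the events, which is the separation described above.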
c) code:
MyHandler's implementation code:
package sax;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class MyHandler extends DefaultHandler {
    private int indent = 0;

    public void startElement(String uri, String localName, String qName,
            Attributes attrs) {
        emit();
        System.out.print("<" + localName);
        int count = attrs.getLength();
        for (int i = 0; i < count; i++) {
            System.out.print(" " + attrs.getLocalName(i) + "=\""
                    + attrs.getValue(i) + "\"");
        }
        System.out.println(">");
        indent++;
    }

    public void endElement(String uri, String localName, String qName) {
        indent--;
        emit();
        System.out.println("</" + localName + ">");
    }

    public void characters(char[] ch, int start, int length) {
        String text = String.valueOf(ch, start, length);
        // Ignore whitespace-only text between elements.
        if (!"".equals(text.trim())) {
            emit();
            System.out.println(text);
        }
    }

    // Print one tab per nesting level.
    private void emit() {
        for (int i = 0; i < indent; i++) {
            System.out.print("\t");
        }
    }
}

Test code:
package sax;

import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Test {
    public static void main(String[] args) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setContentHandler(new MyHandler());
            reader.parse(new InputSource(new FileReader("1.xml")));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output: the element tree of 1.xml, one tag or text item per line and indented by one tab per nesting level: statistic, containing header (three column elements) and data (two x elements, each with two y elements).
Note: MyHandler simply displays the XML file. It prints every element tag in the file, together with the attributes and text of each element.

d) XMLFilter
As the name suggests, XMLFilter is mainly used to filter the contents of an XML file.
i. Model:
[Figure: parse() calls pass from XMLFilter2 through XMLFilter1 to the XMLReader; events flow back from the XMLReader through XMLFilter1 and XMLFilter2 to the ContentHandler.]
ii. Model Description:
The XMLFilter model is essentially the Chain of Responsibility pattern. XMLFilter extends the XMLReader interface, and its implementation class XMLFilterImpl also implements the handler interfaces such as ContentHandler and DTDHandler, so an XMLFilterImpl can be registered on an XMLReader like any ordinary handler. XMLFilterImpl also has a constructor that accepts an XMLReader instance; because an XMLFilter is itself an XMLReader, several filters can be strung together into a chain for multi-level filtering.
iii. Code:
MyFilter's implementation code:
package sax;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class MyFilter extends XMLFilterImpl {
    public MyFilter(XMLReader reader) {
        super(reader);
    }

    public void startElement(String uri, String localName, String qName,
            Attributes atts) {
        try {
            // Swallow the start tag of "data"; forward everything else.
            if ("data".equals(qName)) {
                return;
            } else {
                super.startElement(uri, localName, qName, atts);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void endElement(String uri, String localName, String qName) {
        try {
            // Swallow the end tag of "data"; forward everything else.
            if ("data".equals(qName)) {
                return;
            } else {
                super.endElement(uri, localName, qName);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Test code:
package sax;

import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Test {
    public static void main(String[] args) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            MyFilter filter = new MyFilter(reader);
            filter.setContentHandler(new MyHandler());
            filter.parse(new InputSource(new FileReader("1.xml")));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output: the same listing as before, but without the opening and closing data tags; the x and y elements inside data are still printed.
Note: MyFilter suppresses the tag named data, so it does not appear in the result.
iv. Note:
Because an XMLFilter cannot parse an XML document by itself, the object at the head of a filter chain must be a real XMLReader instance.

e) Handler introduction:
Besides the common ContentHandler, DTDHandler, and so on, SAX 2.0 adds the following handlers.
i. LexicalHandler
Handles events triggered by DTD declarations, comments, and CDATA sections.
ii. DeclHandler
Handles the element and attribute declarations found in a DTD.

f) Feature and Properties
Properties of the XMLReader are accessed with setProperty() and getProperty(). Features of the XMLReader are accessed with setFeature() and getFeature(). They are mainly used to query or set options such as validation and namespace processing.
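A minimal sketch of setting a feature follows. It assumes the JDK's built-in parser and uses the standard SAX feature name for namespace processing; the class name FeatureSketch, its helper methods, and the deliberately unknown feature URI are made up for illustration. A parser signals an unknown or unsupported feature with SAXNotRecognizedException or SAXNotSupportedException, both subclasses of SAXException.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of the original article.
final class FeatureSketch {
    // Standard SAX 2.0 feature name for namespace processing.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Creates an XMLReader from whatever parser the JDK provides. */
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Tries to set a feature; returns false if the parser rejects it. */
    static boolean trySetFeature(XMLReader reader, String name, boolean value) {
        try {
            reader.setFeature(name, value);
            return true;
        } catch (SAXException e) {
            // SAXNotRecognizedException or SAXNotSupportedException
            return false;
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println("namespaces accepted: "
                + trySetFeature(reader, NAMESPACES, true));
        System.out.println("unknown feature accepted: "
                + trySetFeature(reader, "http://example.invalid/no-such-feature", true));
    }
}
```

Wrapping setFeature() this way lets a program probe what a given parser supports instead of failing on the first unsupported option.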
Features and properties are parser-dependent, so to find out which ones can be set, consult the documentation of the parser implementation.

4. DOM 1.0
a) Model:
[Figure: DOMParser produces a Document.]
b) Model Description:
The most important difference between the DOM model and SAX is that the user does not work on the XML file directly. Instead, the DOM parser first parses the file and builds a DOM tree that mirrors the structure of the XML file; the user program then interacts with this DOM tree.
c) code:
DOM program:
/*
 * Created on 2004-12-24
 */
package dom;

import org.w3c.dom.*;
import java.io.*;
import org.apache.xerces.parsers.*;
import org.xml.sax.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Test {
    private static int indent = 0;

    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(new InputSource(new FileReader("1.xml")));
            Document doc = parser.getDocument();
            output(doc);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Recursively print the subtree rooted at node.
    public static void output(Node node) {
        int type = node.getNodeType();
        if (type == Node.DOCUMENT_NODE) {
            NodeList childs = node.getChildNodes();
            int length = childs.getLength();
            for (int i = 0; i < length; i++) {
                output(childs.item(i));
            }
        } else if (type == Node.TEXT_NODE) {
            String value = node.getNodeValue();
            if (!"".equals(value.trim())) {
                emit();
                System.out.println(value);
            }
        } else if (type == Node.ELEMENT_NODE) {
            emit();
            System.out.print("<" + node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            int length = attrs.getLength();
            for (int i = 0; i < length; i++) {
                System.out.print(" " + attrs.item(i).getNodeName() + "="
                        + attrs.item(i).getNodeValue());
            }
            System.out.println(">");
            indent++;
            NodeList childs = node.getChildNodes();
            length = childs.getLength();
            for (int i = 0; i < length; i++) {
                output(childs.item(i));
            }
            indent--;
            emit();
            System.out.println("</" + node.getNodeName() + ">");
        }
    }

    public static void emit() {
        for (int i = 0; i < indent; i++) {
            System.out.print("\t");
        }
    }
}

Output: the element tree of 1.xml, one tag or text item per line and indented by nesting depth.
Note: This DOM program prints the main nodes of the DOM tree that corresponds to the XML file. The core of writing DOM programs is traversing the tree, and the most important technique for that is recursion. Because we use Apache's Xerces parser, the program above is written against that particular implementation.

d) Creating a tree
This section shows how to create a DOM tree programmatically. The following program uses Apache's Xerces parser.
Create a tree program:
/*
 * Created on 2004-12-26
 */
package dom;

import org.w3c.dom.*;
import org.apache.xerces.dom.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Create {
    public static void main(String[] args) {
        DOMImplementation impl = new DOMImplementationImpl();
        Document doc = impl.createDocument(null, "item", null);
        Element e = doc.createElement("name");
        Text t = doc.createTextNode("Hello");
        Element root = doc.getDocumentElement();
        root.setAttribute("id", "nick");
        e.appendChild(t);
        root.appendChild(e);
        Test.output(doc);
    }
}

Output:
<item id=nick>
	<name>
		Hello
	</name>
</item>
Note: To create a DOM tree programmatically, we first need a DOMImplementationImpl instance. From it we obtain the Document object of the DOM tree, and from the Document we create all the elements and text nodes we need. This is the key point of the whole program.

e) Modify the tree:
This section shows how to modify the content and structure of a DOM tree programmatically.
Program that modifies the tree:
/*
 * Created on 2004-12-24
 */
package dom;

import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Modify {
    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new File("1.xml"));
            Test.output(doc);
            Element root = doc.getDocumentElement();
            NodeList list = root.getElementsByTagName("data");
            Element e = (Element) list.item(0);
            /* Modify node content */
            Text t = (Text) e.getFirstChild();
            t.setData("Hello World");
            /* Modify the structure of the tree */
            NodeList headList = root.getElementsByTagName("header");
            // The NodeList is live, so remove from the end to avoid skipping items.
            for (int i = headList.getLength() - 1; i >= 0; i--) {
                root.removeChild(headList.item(i));
            }
            Test.output(doc);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:
Before: the full element tree of 1.xml, as printed by the DOM program above.
After: the same tree without the header element, and with Hello World printed as the first text inside data.
Note: The difference from the tree-creating program is that here we mainly use the Document instance to obtain the document element, that is, the root node of the DOM tree. From the root we can then reach any node by traversal or lookup and operate on it. It is important to note that removeChild() can only be called when the two nodes are in a parent-child relationship.
From the programs above, we can see that DOM 1.0 requires us to know the overall structure of the tree in order to modify it.

5. DOM 2.0
DOM 2.0 provides a newer and more flexible set of APIs.
a) Traversal
Traversal is a new feature of DOM 2.0, and its main purpose is to traverse nodes. However, since J2SE does not provide support for Traversal, we can only use parser-specific classes here.
code:
/*
 * Created on 2004-12-27
 */
package dom;

import java.io.*;
import org.xml.sax.*;
import org.w3c.dom.*;
import org.apache.xerces.parsers.*;
import org.w3c.dom.traversal.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Traversal {
    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(new InputSource(new FileReader("1.xml")));
            Document doc = parser.getDocument();
            Element root = doc.getDocumentElement();
            NodeIterator i = ((DocumentTraversal) doc).createNodeIterator(
                    root, NodeFilter.SHOW_ALL, null, true);
            Node node = null;
            while ((node = i.nextNode()) != null) {
                System.out.println(node.getNodeType() + " " + node.getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:
1 statistic
3 #text
1 header
3 #text
1 column
3 #text
1 column
3 #text
1 column
3 #text
3 #text
1 data
3 #text
1 x
3 #text
1 y
3 #text
1 y
3 #text
3 #text
1 x
3 #text
1 y
3 #text
1 y
3 #text
3 #text
3 #text
Note: A NodeIterator plays the same role as an Iterator whose elements are Node objects, and the program can operate directly on these Node references. The first parameter of createNodeIterator() is the root node at which the search starts, the second is the predefined filter (which node types to show), and the third is a user-defined filter. The user-defined filter is applied after the predefined one. The fourth parameter states whether entity references are expanded.
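The four parameters can be sketched in isolation as follows. This example assumes that the Document returned by the JDK's own DocumentBuilder implements DocumentTraversal, which is checked at run time rather than taken for granted; the class name NodeIteratorSketch, the elementNames() helper, and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original article.
final class NodeIteratorSketch {

    /** Returns the names of all elements in document order. */
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            if (!(doc instanceof DocumentTraversal)) {
                throw new UnsupportedOperationException(
                        "this DOM implementation does not support Traversal");
            }
            NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                    doc.getDocumentElement(), // 1: root at which iteration starts
                    NodeFilter.SHOW_ELEMENT,  // 2: predefined filter, elements only
                    null,                     // 3: no user-defined filter
                    true);                    // 4: expand entity references
            List<String> names = new ArrayList<>();
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                names.add(n.getNodeName());
            }
            it.detach(); // release the iterator once finished
            return names;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames(
                "<statistic><header><column/></header><data><x><y/></x></data></statistic>"));
    }
}
```

Using SHOW_ELEMENT instead of SHOW_ALL removes the #text entries without any user-defined filter.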
In DOM 1.0 we can get the list of nodes with a given name through getElementsByTagName(); in DOM 2.0 we can achieve the same effect through the Traversal interface. The advantage of Traversal is that its filtering is more powerful than getElementsByTagName() in 1.0: in a custom filter we can implement arbitrarily complex filtering logic, rather than filtering on a single, fixed attribute.

b) Traversal with a filter
filter:
/*
 * Created on 2004-12-27
 */
package dom;

import org.w3c.dom.traversal.*;
import org.w3c.dom.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class TraversalFilter implements NodeFilter {
    public short acceptNode(Node node) {
        if ("header".equals(node.getNodeName().trim())) {
            return NodeFilter.FILTER_REJECT;
        } else if (node.getNodeType() == Node.ELEMENT_NODE) {
            return NodeFilter.FILTER_ACCEPT;
        } else {
            return NodeFilter.FILTER_SKIP;
        }
    }
}

code:
/*
 * Created on 2004-12-27
 */
package dom;

import java.io.*;
import org.xml.sax.*;
import org.w3c.dom.*;
import org.apache.xerces.parsers.*;
import org.w3c.dom.traversal.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Traversal {
    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(new InputSource(new FileReader("1.xml")));
            Document doc = parser.getDocument();
            Element root = doc.getDocumentElement();
            NodeIterator i = ((DocumentTraversal) doc).createNodeIterator(
                    root, NodeFilter.SHOW_ALL, new TraversalFilter(), true);
            Node node = null;
            while ((node = i.nextNode()) != null) {
                System.out.println(node.getNodeType() + " " + node.getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:
1 statistic
1 column
1 column
1 column
1 data
1 x
1 y
1 y
1 x
1 y
1 y
Note: NodeFilter has a single method, acceptNode(), which returns a short with one of three values: NodeFilter.FILTER_SKIP skips only the current node, NodeFilter.FILTER_REJECT rejects the current node and all of its descendants, and NodeFilter.FILTER_ACCEPT accepts the current node. The user-defined filter above uses all three: nodes named header are rejected, and nodes whose type is not Node.ELEMENT_NODE are skipped. Note that a NodeIterator treats FILTER_REJECT the same as FILTER_SKIP, which is why the column elements still appear in the output; only a TreeWalker drops the whole subtree. From this program we can see that by implementing acceptNode() and returning the appropriate short value, we can provide more complex filtering logic than DOM 1.0 allows.

c) TreeWalker
TreeWalker is also a new feature of DOM 2.0, and its purpose is likewise to traverse nodes. However, since J2SE does not provide support for TreeWalker, we can only use parser-specific classes here.
code:
/*
 * Created on 2004-12-27
 */
package dom;

import java.io.FileReader;
import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import org.w3c.dom.traversal.*;
import org.xml.sax.InputSource;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Walker {
    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(new InputSource(new FileReader("1.xml")));
            Document doc = parser.getDocument();
            Element root = doc.getDocumentElement();
            TreeWalker tw = ((DocumentTraversal) doc).createTreeWalker(
                    root, NodeFilter.SHOW_ELEMENT, null, true);
            Node node = null;
            while ((node = tw.nextNode()) != null) {
                System.out.println(node.getNodeType() + " " + node.getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:
1 header
1 column
1 column
1 column
1 data
1 x
1 y
1 y
1 x
1 y
1 y
Note: The four parameters of createTreeWalker() have the same meaning as those of createNodeIterator(). The difference between the two is that a NodeIterator presents the nodes as a linear list, while a TreeWalker keeps them organized as a tree.

d) Range
Range is also a new feature of DOM 2.0. It lets a program cut out a segment of the DOM tree and then perform specific operations on it.
code:
/*
 * Created on 2004-12-27
 */
package dom;

import java.io.FileReader;
import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import org.w3c.dom.ranges.*;
import org.xml.sax.InputSource;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class MyRange {
    public static void main(String[] args) {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(new InputSource(new FileReader("1.xml")));
            Document doc = parser.getDocument();
            Element data = (Element) doc.getElementsByTagName("data").item(0);
            Range range = ((DocumentRange) doc).createRange();
            range.setStartBefore(data.getFirstChild());
            range.setEndAfter(data.getLastChild());
            range.deleteContents();
            range.detach();
            Test.output(doc);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output: the element tree of 1.xml with header and its three column elements intact and with data left empty.
Note: The program above places all descendants of data into a Range and then deletes the whole Range.

6. JDOM
JDOM's design goal is mainly unification at the model level.
a) Model:
[Figure: an XML document is read by SAXBuilder and a DOM tree is read by DOMBuilder; both produce a JDOM Document, which is written out by XMLOutputter, SAXOutputter, or DOMOutputter.]
DOMBuilder b = new DOMBuilder();
Document d = b.build(...);
SAXBuilder b = new SAXBuilder();
Document d = b.build(...);
b) Model Description:
As the figure shows, JDOM uses its SAX and DOM builders to convert an XML file or an existing DOM tree into JDOM's internal Document, and then uses one of the outputters to write this internal Document out.
c) code:
/*
 * Created on 2004-12-27
 */
package jdom;

import java.io.*;
import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Test {
    public static void main(String[] args) {
        try {
            SAXBuilder b = new SAXBuilder();
            Document d = b.build(new File("1.xml"));
            XMLOutputter o = new XMLOutputter();
            Format f = Format.getPrettyFormat();
            f.setEncoding("GB2312");
            o.setFormat(f);
            o.output(d, System.out);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output: the contents of 1.xml, pretty-printed and starting with the declaration <?xml version="1.0" encoding="GB2312"?>.
Note: With XMLOutputter we need to supply an output stream. With SAXOutputter we need to set the corresponding ContentHandler. With DOMOutputter we get back a DOM Document node. So in general, I think of JDOM mainly as a means of converting between SAX, DOM, and streams.

7. JAXP
As mentioned at the start of this article, the point of JAXP is that it provides a unified API regardless of which vendor's parser is used underneath, so its role is similar to that of JDBC.
a) SAX parser:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.parse(input, handler); // handler is a DefaultHandler
b) DOM parser:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(input);

8. XSLT
As the last part of this article, this section shows how to combine XML and XSL files in a Java program.
a) code:
/*
 * Created on 2004-12-27
 */
package xsl;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import javax.xml.transform.dom.*;
import java.io.*;
import org.w3c.dom.*;

/**
 * @author cenyongh@mails.gscas.ac.cn
 */
public class Test {
    public static void main(String[] args) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder();
            Document doc = builder.parse(new File("1.xml"));
            DOMSource sor = new DOMSource(doc);
            StreamResult res = new StreamResult(System.out);
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer t = factory.newTransformer();
            t.transform(sor, res);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Note: Comparing this with JDOM, we can see that even without an XSL file, the methods provided by Transformer make it easy to convert between SAX, DOM, and streams. To apply an XSL file, pass it as a parameter to the newTransformer() method when obtaining the Transformer instance.
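As a sketch of that last step, the example below passes a stylesheet to newTransformer() as a StreamSource. The stylesheet, the sample XML, and the names XslSketch, SAMPLE_XML, SAMPLE_XSL, and transform() are assumptions made for illustration; a real program would more likely read the stylesheet from a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original article.
final class XslSketch {
    // Made-up input that mimics the shape of 1.xml.
    static final String SAMPLE_XML =
            "<statistic><header><column/><column/><column/></header></statistic>";

    // Made-up stylesheet that reports the number of column elements as text.
    static final String SAMPLE_XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "columns=<xsl:value-of select=\"count(//column)\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the result. */
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // The stylesheet is the parameter of newTransformer().
            Transformer t = factory.newTransformer(
                    new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL)); // prints "columns=3"
    }
}
```

Calling newTransformer() without arguments, as in the program above, gives an identity transformation; with a stylesheet source the same transform() call applies the XSL rules instead.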