Handling XML files using JAXP
Author: Bighead
About the Author
Jia Bo is a programmer. You can contact him at Mosaic@hotmail.com.
Introduction
JAXP is an abbreviation for Java API for XML Processing. The main part of the JAXP API is in the javax.xml.parsers package. This package contains the two most important factories, SAXParserFactory and DocumentBuilderFactory, which provide SAXParser and DocumentBuilder instances respectively.
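As a minimal sketch of how the two factories are used, the following class asks each factory for its parser. The class name FactoryDemo and its helper methods are hypothetical names chosen for this illustration; only the javax.xml.parsers calls come from JAXP itself.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// FactoryDemo is a hypothetical class name used only for this illustration.
class FactoryDemo {

    // Asks the SAX factory for a parser instance.
    static SAXParser newSaxParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("No usable SAX parser", e);
        }
    }

    // Asks the DOM factory for a document builder instance.
    static DocumentBuilder newDomBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM builder", e);
        }
    }

    public static void main(String[] args) {
        // The concrete implementation classes depend on the installed JAXP provider.
        System.out.println("SAX parser:  " + newSaxParser().getClass().getName());
        System.out.println("DOM builder: " + newDomBuilder().getClass().getName());
    }
}
```

The application code only ever names the abstract JAXP types; which implementation it gets is decided by the factory.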
SAX was defined by the XML-DEV group; DOM is defined by the W3C. Let's take a look at these API libraries.
javax.xml.parsers: the JAXP API, which defines a common interface to SAX and DOM parsers.
org.w3c.dom: defines all the components of the DOM.
org.xml.sax: defines the SAX API.
javax.xml.transform: defines the XSLT API, which you can use to transform XML into other forms, such as a page for display.
SAX is an "event-driven" processing method: it walks through the XML file piece by piece and reports each piece to your code as it goes. Because of this, it is fast, which makes it suitable for server-side use or anywhere speed is a special requirement.
By comparison, DOM is easier to use. It reads all the XML data into memory and organizes it as a tree structure, so the user can operate on any part of the XML data.
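To make the contrast concrete, here is a minimal sketch that counts Customer elements both ways. The class name SaxVersusDom and the inline XML string are assumptions made for this illustration and are not part of the article's own example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// SaxVersusDom is a hypothetical class name used only for this illustration.
class SaxVersusDom {
    // Hypothetical sample data.
    static final String XML =
            "<Customers><Customer><id>#001</id></Customer>"
            + "<Customer><id>#002</id></Customer></Customers>";

    // SAX: count elements as the events stream by; no tree is kept in memory.
    static int countWithSax(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("Customer".equals(qName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return count[0];
    }

    // DOM: build the whole tree first, then query it.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("Customer").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SAX count: " + countWithSax(XML));
        System.out.println("DOM count: " + countWithDom(XML));
    }
}
```

Both methods give the same answer; the difference is that the SAX version never holds the document in memory, while the DOM version can revisit any node after parsing.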
As for XSLT, we will not go into it here. If you are interested, please refer to the appropriate documentation. Let's look at SAX first.
SAX
SAX framework outline
Processing begins with the parser instance generated by SAXParserFactory. The parser contains a SAX reader (an org.xml.sax.XMLReader object). When the parser's parse method is called, the reader invokes callback methods. Where are these methods defined? In the ContentHandler, ErrorHandler, DTDHandler and EntityResolver interfaces.
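The callback flow can be made visible with a small handler that records each event it receives. This is only a sketch: CallbackTrace and its trace method are hypothetical names, and the input string in main is made-up sample data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// CallbackTrace is a hypothetical class name used only for this illustration.
class CallbackTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        events.add("startElement:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser is allowed to deliver one run of text in several calls.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("characters:" + text);
        }
    }

    // Parses the given XML string and returns the recorded callbacks in order.
    static List<String> trace(String xml) {
        CallbackTrace handler = new CallbackTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<a><b>hi</b></a>")) {
            System.out.println(event);
        }
    }
}
```

Running it shows that the application never drives the parse itself: it hands over a handler and the reader calls back into it as it moves through the document.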
The following is an overview of the SAX API library:
SAXParserFactory: an object that generates a parser instance according to the system properties.
SAXParser: defines several kinds of parse() methods. In general, you pass the XML data and a DefaultHandler to the parser, and the parser calls the appropriate handler methods as it processes the XML file. This is the simplest way to process XML.
SAXReader: SAXParser wraps a SAX reader (org.xml.sax.XMLReader). Usually you do not need to care about it, but when you need to configure it, you can obtain it with SAXParser's getXMLReader() method. In short, the reader is what communicates with the SAX event handlers, so you can use your own custom handlers.
DefaultHandler: implements the ContentHandler, ErrorHandler, DTDHandler and EntityResolver interfaces (with empty methods, of course), so in your program you can override only the methods you are interested in.
ContentHandler: its startDocument, endDocument, startElement and endElement methods are called when the parser reads the corresponding parts of the XML. The interface also defines the characters and processingInstruction methods, which are called when the parser encounters the text inside an XML element or an inline processing instruction, respectively.
ErrorHandler: its error, fatalError and warning methods are called when the parser encounters the corresponding kinds of errors.
DTDHandler: the methods defined by this interface are used only when processing DTD information.
EntityResolver: called only when the parser encounters data identified by a URI.
For more detail, please refer to the official SAX API documentation.
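As a sketch of how the reader and ErrorHandler fit together, the following obtains the XMLReader through getXMLReader(), installs a custom ErrorHandler, and reports the first fatal error. The class name ErrorReport, the method firstFatalError, and the message format are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// ErrorReport is a hypothetical class name used only for this illustration.
class ErrorReport {

    // Returns a description of the first fatal error, or null if none was reported.
    static String firstFatalError(String xml) {
        final String[] message = {null};
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler());
            reader.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable errors are ignored in this sketch.
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    message[0] = "line " + e.getLineNumber() + ": " + e.getMessage();
                    throw e; // stop parsing
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Parsing stopped; fatalError has already recorded the message.
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the parser", e);
        }
        return message[0];
    }

    public static void main(String[] args) {
        // The first input is deliberately not well-formed.
        System.out.println(firstFatalError("<a><b></a>"));
        System.out.println(firstFatalError("<a><b/></a>"));
    }
}
```

A document that is not well-formed triggers fatalError, while a well-formed one leaves the result null.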
Example:
In our example, we process an XML file and load its contents into objects. This is a very common usage. Here is the XML file we need to process.
Test.xml
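<?xml version="1.0"?>
<Customers>
<Customer><id>#001</id><name>micke</name><address>najing</address></Customer>
<Customer><id>#002</id><name>car</name><address>suzhou</address></Customer>
<Customer><id>#003</id><name>jimmy</name><address>chengdu</address></Customer>
<Customer><id>#004</id><name>Henry</name><address>xi'an</address></Customer>
</Customers>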
This is a very simple XML file: Customers contains several Customer elements, and each Customer contains three child elements, id, name and address. Based on this XML file, we define the data objects as follows.
/*
 * Customers.java
 * Created @ 2004-4-27 22:04:45
 * by jiabo
 */
import java.util.*;

/**
 * Customers: holds the Customer objects read from the XML file.
 * Created @ 2004-4-27 22:04:45
 * by jiabo
 */
public class Customers {
    private Vector customers;

    public Customers() {
        customers = new Vector();
    }

    public void addCustomer(Customer customer) {
        customers.add(customer);
    }

    public String toString() {
        String newline = System.getProperty("line.separator");
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < customers.size(); i++) {
            buf.append(customers.elementAt(i)).append(newline);
        }
        return buf.toString();
    }
}

class Customer {
    private String id;
    private String name;
    private String address;

    /**
     * @return the address
     */
    public String getAddress() {
        return address;
    }

    /**
     * @return the id
     */
    public String getId() {
        return id;
    }

    /**
     * @return the name
     */
    public String getName() {
        return name;
    }

    /**
     * @param string the address
     */
    public void setAddress(String string) {
        address = string;
    }

    /**
     * @param string the id
     */
    public void setId(String string) {
        id = string;
    }

    /**
     * @param string the name
     */
    public void setName(String string) {
        name = string;
    }

    public String toString() {
        return "Customer: ID='" + id + "' Name='" + name + "' Address='" + address + "'";
    }
}

Next is the handler that processes the XML.

/*
 * Unmarshaller.java
 * Created on 2004-4-10
 * by jiabo
 */
import java.util.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Unmarshaller: builds Customers and Customer objects from SAX events.
 * Created on 2004-4-10 19:20:27
 * by jiabo
 */
public class Unmarshaller extends DefaultHandler {
    private Customers customers;
    private Stack stack;
    private boolean isStackReadyForText;
    private Locator locator;

    /**
     * init
     */
    public Unmarshaller() {
        stack = new Stack();
        isStackReadyForText = false;
    }

    /**
     * @return customers
     */
    public Customers getCustomers() {
        return customers;
    }

    /**
     * callbacks
     */
    public void setDocumentLocator(Locator rhs) {
        locator = rhs;
    }

    //==========================================
    // SAX DocumentHandler methods
    //==========================================
    public void startElement(String uri, String sName, String qName, Attributes attrs) {
        // sName is empty when the parser is not namespace-aware, so fall back to qName.
        String name = "".equals(sName) ? qName : sName;
        isStackReadyForText = false;
        if (name.equals("Customers")) {
            stack.push(new Customers());
        } else if (name.equals("Customer")) {
            stack.push(new Customer());
        } else if (name.equals("id") || name.equals("name") || name.equals("address")) {
            stack.push(new StringBuffer());
            isStackReadyForText = true;
        } else {
            // do nothing
        }
    }

    public void endElement(String uri, String sName, String qName) {
        String name = "".equals(sName) ? qName : sName;
        isStackReadyForText = false;
        // Assumes every element in the file is one of the names pushed in startElement.
        Object temp = stack.pop();
        if (name.equals("Customers")) {
            customers = (Customers) temp;
        } else if (name.equals("Customer")) {
            ((Customers) stack.peek()).addCustomer((Customer) temp);
        } else if (name.equals("id")) {
            ((Customer) stack.peek()).setId(temp.toString());
        } else if (name.equals("name")) {
            ((Customer) stack.peek()).setName(temp.toString());
        } else if (name.equals("address")) {
            ((Customer) stack.peek()).setAddress(temp.toString());
        }
    }

    public void characters(char[] data, int start, int length) {
        if (isStackReadyForText) {
            ((StringBuffer) stack.peek()).append(data, start, length);
        } else {
            // do nothing
        }
    }
}

The idea behind this handler is very simple: use a stack. When a start tag signals the beginning of an element, we check whether its name matches one of our data objects. If it does, we create a new object and push it onto the stack; if it does not, we do nothing and the SAX framework moves on to the next element. When we encounter an end tag, we again check whether its name matches one of the data objects, and if it does, we pop the object and set it on its parent. Repeating this handles the simple XML file above. That is all we need to do; everything else is handled by the framework, which calls back the handler's startElement, endElement and other methods. The following is the entry point of the program:

/*
 * Main.java
 * Created @ 2004-4-27 22:18:41
 * by jiabo
 */
import java.io.*;
import javax.xml.parsers.*;
import org.xml.sax.*;

/**
 * Main
 * Created @ 2004-4-27 22:18:41
 * by jiabo
 */
public class Main {
    public static void main(String args[]) {
        Customers customers = null;
        if (args.length != 1) {
            System.err.println("Usage: cmd filename");
            System.exit(1);
        }
        try {
            Unmarshaller handler = new Unmarshaller();
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            File file = new File(args[0]);
            InputSource src = new InputSource(new FileInputStream(file));
            saxParser.parse(src, handler);
            customers = handler.getCustomers();
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.out.println(customers);
    }
}

As mentioned earlier, we obtain a SAXParser instance through the factory method and then parse the XML file. This gives the following result:

Customer: ID='#001' Name='micke' Address='najing'
Customer: ID='#002' Name='car' Address='suzhou'
Customer: ID='#003' Name='jimmy' Address='chengdu'
Customer: ID='#004' Name='Henry' Address='xi'an'

The SAX framework offers other methods as well. Readers may wish to try them out, since they are very convenient when processing real XML files.

DOM
DOM framework outline
DOM API overview
In general, we use javax.xml.parsers.DocumentBuilderFactory to get a DocumentBuilder instance and use it to produce a Document from XML. You can also use it to get an empty Document object that implements the org.w3c.dom.Document interface.
DocumentBuilderFactory: generates a builder instance based on the system properties.
DocumentBuilder: used to produce a Document.
For more detail, please refer to the official DOM API documentation.
So we can simply write:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse("Test.xml");

and we have a Document.

Example:
We again process Test.xml. As with SAX, there is a parser. In fact, the idea is very simple and clear.
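One detail is worth seeing in isolation when walking a DOM tree: the child list of an element can contain whitespace text nodes as well as elements, which is why tree-walking code checks the node type. The following is a minimal sketch; the class name ChildNodeWalk and the inline XML string are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// ChildNodeWalk is a hypothetical class name used only for this illustration.
class ChildNodeWalk {
    // Hypothetical sample data with line breaks and indentation between elements.
    static final String XML =
            "<Customers>\n  <Customer/>\n  <Customer/>\n</Customers>";

    // Returns every child node of the root element, text nodes included.
    static NodeList children(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getChildNodes();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // Keeps only the element children and returns their names.
    static List<String> childElementNames(String xml) {
        List<String> names = new ArrayList<>();
        NodeList nodes = children(xml);
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("All child nodes: " + children(XML).getLength());
        System.out.println("Element children: " + childElementNames(XML));
    }
}
```

The total child count is larger than the number of elements because the line breaks and indentation between tags are kept as text nodes.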
As already mentioned, DOM reads the whole XML file into memory and organizes it as a tree, so walking the nodes is the key to solving the problem. The code is as follows:

/*
 * Unmarshaller.java
 * Created on 2004-4-10
 * by jiabo
 */
import org.w3c.dom.*;

/**
 * Unmarshaller: builds Customers and Customer objects from a DOM tree.
 * Created on 2004-4-10 19:20:27
 * by jiabo
 */
public class Unmarshaller {
    public Unmarshaller() {
    }

    public Customers unmarshallCustomers(Node rootNode) {
        Customers customers = new Customers();
        Node n;
        NodeList nodes = rootNode.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            n = nodes.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                if ("Customer".equals(n.getNodeName())) {
                    customers.addCustomer(unmarshallCustomer(n));
                } else {
                    // unexpected element: ignore
                }
            }
        }
        return customers;
    }

    public Customer unmarshallCustomer(Node customerNode) {
        Customer customer = new Customer();
        Node n;
        NodeList nodes = customerNode.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            n = nodes.item(i);
            if ("id".equals(n.getNodeName())) {
                customer.setId(unmarshallText(n));
            } else if ("name".equals(n.getNodeName())) {
                customer.setName(unmarshallText(n));
            } else if ("address".equals(n.getNodeName())) {
                customer.setAddress(unmarshallText(n));
            }
        }
        return customer;
    }

    public String unmarshallText(Node textNode) {
        StringBuffer buf = new StringBuffer();
        Node n;
        NodeList nodes = textNode.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            n = nodes.item(i);
            if (n.getNodeType() == Node.TEXT_NODE) {
                buf.append(n.getNodeValue());
            } else {
                // not a text node: ignore
            }
        }
        return buf.toString();
    }
}

Here is how to drive DOM to process the XML file. As before, we first obtain a DocumentBuilderFactory, use it to create a DocumentBuilder instance, and then parse the XML file by calling the parse method.

/*
 * Main.java
 * Created @ 2004-4-27 22:18:41
 * by jiabo
 */
import java.io.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;

/**
 * Main
 * Created @ 2004-4-27 22:18:41
 * by jiabo
 */
public class Main {
    public static void main(String args[]) {
        Customers customers = null;
        Document doc = null;
        if (args.length != 1) {
            System.err.println("Usage: cmd filename");
            System.exit(1);
        }
        try {
            Unmarshaller handler = new Unmarshaller();
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            doc = builder.parse(new File(args[0]));
            customers = handler.unmarshallCustomers(doc.getDocumentElement());
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.out.println(customers);
    }
}

Summary:
This article is a brief introduction to XML processing. It aims to be concise and clear so that readers can get started as quickly as possible, so it does not use every method in the library. XML processing is the foundation of web services, and SAX and DOM are the foundation of XML processing. My writing is clumsy; I ask for the reader's indulgence.

Reference:
http://java.sun.com/xml/jaxp/docs.html