A Brief Introduction to Java XML Programming
Author: He Shuangjiang (yarshray)
As I understand it, XML is a standard format for storing data. XML and HTML are completely different things; what they share is only the use of tags to describe a document, which is why people like to compare them. For me personally, XML is a simple and convenient data file. It differs from an ordinary relational database, which views data as two-dimensional tables and retrieves it through basic relational operations. XML data exists only as a document, and the data is obtained by parsing that document. So, to operate on an XML document you only need a parser that understands XML: the parser turns the content of the document into data the program can use, and the data can be written back when the operation is complete. This article therefore introduces the main XML parsing approaches and their Java APIs: DOM, SAX, and JDOM.
For convenience, we need a simple XML file as the example for this article. Its main purpose is to store information about my books: the title, the author, the book number, and other basic information. The XML example is as follows:
    <?xml version="1.0" encoding="GB2312"?>
This XML records three books and their related information. Working with an XML document can be divided into three steps: parsing the document, operating on the data, and writing the result back.
Types of XML parser. Parsers can basically be divided into:
• Validating and non-validating parsers
• Parsers that support one or more XML Schema languages
• Parsers that support the Document Object Model (DOM)
• Parsers that support the Simple API for XML (SAX)
This article mainly introduces the latter two.
Document Object Model (DOM) parser: The DOM is a standard parser API officially defined by the W3C. Any parser that complies with this programming interface can be used to operate on XML. The specification currently has three levels: Level 1, Level 2, and Level 3; this article only goes as far as Level 2. The DOM model converts the data in an XML file into a tree in memory. The tree is built from objects such as Document, NodeList, and Element, and the program uses the DOM to walk the structure of this tree and thereby work with the XML document. It is worth noting that the parser does not preserve certain lexical details. For example, the attribute forms id="1", id = "1", and id='1' are exactly the same to the DOM, because the DOM does not interpret the formatting of the document. It only converts the data in the document into a tree, so the original formatting of the document does not appear in the DOM.
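The point about attribute formatting can be checked with a small sketch. It uses the standard JAXP DocumentBuilder; the class name AttributeQuoteDemo, the helper readId, and the one-element documents are made up for illustration and are not part of the original article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttributeQuoteDemo {

    // Parses a small XML string and returns the "id" attribute of its root element.
    // (Hypothetical helper, used only for this illustration.)
    static String readId(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("id");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Three lexical forms of the same attribute.
        System.out.println(readId("<Book id=\"1\"/>"));
        System.out.println(readId("<Book id = \"1\"/>"));
        System.out.println(readId("<Book id='1'/>"));
    }
}
```

All three calls return the same string, "1": the quote style and the spaces around the equals sign are not part of the tree.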
Simple API for XML (SAX) parser: The SAX parser can be described as a programming model that leaves the interpretation of the document to the programmer. Unlike the DOM, it does not load the entire XML document into memory. Instead, it parses the document step by step and notifies the program through events; the program then handles these notifications, much like event-driven code. SAX is therefore suited to situations where memory use and parsing efficiency matter. DOM and SAX each have their own applications in specific environments. The main SAX events are startDocument for the start of the document, startElement and endElement for elements, characters for text, and endDocument for the end of the document.
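To make the event model concrete, here is a minimal sketch that records the events a SAX parser reports for a tiny document. The class SaxEventTrace, the method trace, and the sample input are assumptions made for this illustration; only the JAXP and org.xml.sax calls are standard.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {

    // Returns the sequence of SAX events reported while parsing the given XML string.
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void endDocument() { events.add("endDocument"); }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes attributes) {
                events.add("startElement:" + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text so the trace stays readable.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) events.add("characters:" + text);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<Book><bookName>J2EE</bookName></Book>")) {
            System.out.println(event);
        }
    }
}
```

The trace starts with startDocument and ends with endDocument, with the element and text events nested in document order between them. No tree is ever built; the list exists only because the handler chose to keep it.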
The JDOM API: Although DOM and SAX provide very rich features, they also place a heavy burden on developers because they are so complicated. For this reason Jason Hunter and Brett McLaughlin, Java experts from the open source community, launched JDOM as a relatively simple API. JDOM provides an adapter for selecting a specific XML parser. Like the DOM, it offers a tree structure for processing XML, but JDOM's tree is simpler. In this way it takes both DOM and SAX into account, although some flexibility is lost. If JDOM meets your needs, it is also a good choice.
Three APIs
DOM model: suitable when the document structure needs to be modified, or when the document has to be handled as a whole.
SAX model: suitable when memory is limited, or when the program only needs to read some of the elements in the XML document.
Introduction to the XML Document Object Model (DOM): The DOM was introduced by the W3C to provide a well-defined structure for operating on the markup in XML. The model is similar to the one used for HTML. It provides a set of interfaces that each specific platform can implement. An application written against the programming interface of any DOM-compliant parser can therefore be ported easily to other platforms that implement the DOM interface. A standard DOM provides interfaces for the objects that describe the tree structure of an XML document, and methods for operating on the document through those objects. When an application calls these objects and methods, it is in effect operating on the XML file, which simplifies working with XML: knowing the programming interface is enough. To become familiar with the DOM programming model, the first task is to get to know the objects and methods the DOM provides.
The main objects provided by the DOM are as follows.
Document: This object describes the entire XML document. The tree is represented as one overall object containing many nodes, and these nodes are collectively referred to as Node.
Node: A node is a relatively abstract concept, so Node is defined as an interface in Java. Several sub-interfaces extend it:
• Element: represents an element in the document
• Attr: represents an attribute in the XML file
• Text: represents the text of a node
Other node types include Comment, which represents a comment in the XML file; ProcessingInstruction, which represents a processing instruction; and CDATASection, which represents a CDATA section. Interested readers can refer to the relevant documentation; these types are not explained in detail here.
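The node types listed above can be observed directly. The following sketch walks the children of a root element and names the DOM interface each child corresponds to; the class NodeKindsDemo, its helpers, and the sample document are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeKindsDemo {

    // Maps the numeric node type to the name of the matching DOM interface.
    static String kindOf(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: return "Element";
            case Node.TEXT_NODE: return "Text";
            case Node.CDATA_SECTION_NODE: return "CDATASection";
            case Node.COMMENT_NODE: return "Comment";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            default: return "Other";
        }
    }

    // Lists the kinds of the direct children of the root element, in document order.
    static List<String> childKinds(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> kinds = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                kinds.add(kindOf(n));
            }
            return kinds;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Books><!--my books--><Book id=\"1\">J2EE</Book>"
                + "<?app hint?><![CDATA[a < b]]></Books>";
        System.out.println(childKinds(xml));
    }
}
```

This relies on the factory defaults, which keep comments and do not merge CDATA sections into neighbouring text nodes.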
The following methods are often used when working with the DOM:
• Document.getDocumentElement(): returns the root of the DOM tree. (This is a method of the Document interface and is not defined on the other Node subtypes.)
• Node.getFirstChild() and Node.getLastChild(): return the first and last child of a given Node.
• Node.getNextSibling() and Node.getPreviousSibling(): return the next and previous sibling of a given Node.
• Element.getAttribute(String attrName): returns the value of the attribute named attrName for a given Element. If you need the value of the "id" attribute, you can use Element.getAttribute("id"). If the attribute does not exist, the method returns an empty string ("").
Let's look at a basic DOM example, which simply displays the contents of the BookXml.xml document above. This is also the most basic thing to do.
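Before the full program, the navigation methods listed above can be tried on a small in-memory document. This is only a sketch: the class DomNavigationDemo, the helpers rootOf and bookIds, and the two-book document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNavigationDemo {

    // Parses the XML string and returns the root element via getDocumentElement().
    static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Walks the children with getFirstChild()/getNextSibling() and collects the
    // "id" attribute of every child element, skipping whitespace text nodes.
    static List<String> bookIds(Element root) {
        List<String> ids = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                ids.add(((Element) n).getAttribute("id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Element root = rootOf("<Books>\n  <Book id=\"1\"/>\n  <Book id=\"2\"/>\n</Books>");
        System.out.println(root.getTagName());
        System.out.println(bookIds(root));
        // A missing attribute yields an empty string, not null.
        System.out.println(root.getAttribute("missing").isEmpty());
    }
}
```

The element check matters because the line breaks and indentation between the Book elements are themselves Text nodes in the tree.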
The code for the BookXml.xml example is as follows:

    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    public class DomXml {
        public static void main(String[] args) throws Exception {
            // Obtain a parser factory, create a builder, and load the document into a tree
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse("BookXml.xml");
            NodeList nl = doc.getElementsByTagName("Book");
            // The original listing is cut off inside this loop; the body below is a
            // reconstruction that prints each book's id and its text content.
            for (int i = 0; i < nl.getLength(); i++) {
                Element book = (Element) nl.item(i);
                System.out.println(book.getAttribute("id"));
                System.out.println(book.getTextContent());
            }
        }
    }

The code is very simple; it just prints out the results. The core of the program is these three statements:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse("BookXml.xml");

The DOM model provides a factory, and the factory loads an XML parser. This frees the programmer from the specific programming environment: there is no need to know anything about the parser itself. DocumentBuilder provides the means to create an XML document object. With it, the program does not need to deal with the underlying I/O system, and the builder produces the correct XML document tree through the parser. At this point the XML has been loaded into memory as a tree. The second step is to read the document's content, which means walking the tree: first traverse to the required nodes, then read their data. In this way the entire content of the XML document can be obtained through the DOM. The steps are:
1. Obtain an XML parser through DocumentBuilderFactory.
2. Create a DocumentBuilder that can load and generate XML through the parser.
3. Use the DocumentBuilder to load the document and generate an XML tree, then read the contents of the relevant nodes.
The classes that load the parser and the builder are in the javax.xml.parsers package. Besides the DOM classes, this package also contains a SAX parser and its factory, so SAX is discussed further below.

Creating an XML document via the DOM model
In addition to reading data from an XML document, you can also create an XML document with the DOM programming interface.
Because the objects provided by the DOM can build a tree structure describing an XML document, the tree generated in memory can be written to any output stream. In this example that is done with the write method of XmlDocument, which writes an XML document to an output stream. Note that XmlDocument belongs to the Crimson parser, so the cast below works only when Crimson is the DOM implementation in use. Example code:

    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import org.apache.crimson.tree.XmlDocument;
    import java.io.*;

    public class Doml {
        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.newDocument();
            // Build the tree: a Books root containing one Book element
            Element books = doc.createElement("Books");
            Element book = doc.createElement("Book");
            book.setAttribute("id", "001");
            book.appendChild(doc.createTextNode("J2EE Book"));
            books.appendChild(book);
            doc.appendChild(books);
            // Write the tree to a file and to the screen.
            // The file name is garbled in the source; "Books.xml" is a placeholder.
            ((XmlDocument) doc).write(new FileOutputStream("Books.xml"));
            ((XmlDocument) doc).write(System.out);
        }
    }

The purpose of the program is to create an XML document, build its structure, and then output that structure. Two streams are used: one is the output stream for the screen, and the other is the output stream for the file.

JAXP also allows the XML tree produced by the DOM to be used as a source, so that the modified source can be written back to the XML document via an output stream. In this way the DOM tree parsed from an XML file can be kept in sync with the file. The process relies first on the parser for the XML file: the parser analyzes the XML document and produces a tree structure, the application operates on this tree, and the result is then written back to the XML file as a source. A DOMSource object wraps the tree described by the Document as a source object, and a Transformer writes that source back to the XML file.
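The DOMSource and Transformer step can be sketched in isolation, using only classes from the JDK. The sketch below builds the same small Books tree and serializes it to a string; the class name DomSerializeDemo and the decision to omit the XML declaration (to keep the output short) are choices made for this illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomSerializeDemo {

    // Builds a small DOM tree in memory and returns it serialized as XML text.
    static String buildAndSerialize() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element books = doc.createElement("Books");
            Element book = doc.createElement("Book");
            book.setAttribute("id", "001");
            book.appendChild(doc.createTextNode("J2EE Book"));
            books.appendChild(book);
            doc.appendChild(books);

            // An identity Transformer copies the DOMSource to the StreamResult unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildAndSerialize());
    }
}
```

Replacing the StringWriter with a FileOutputStream or a File in the StreamResult writes the same tree to disk, which is the "write back" step described above.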
The example operates on the BookXml.xml file we are already familiar with. The example code is as follows:

    import javax.xml.parsers.*;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.*;
    import org.w3c.dom.*;

    public class ModifyXml {
        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse("BookXml.xml");
            Element book;
            Element bookName;
            Element bookAuthor;
            Element bookISBN;
            Element bookPrice;

            // insert
            book = doc.createElement("Book");
            book.setAttribute("id", "4");
            bookName = doc.createElement("bookName");
            bookName.appendChild(doc.createTextNode("J2EE Programme begin Book"));
            book.appendChild(bookName);
            bookAuthor = doc.createElement("bookAuthor");
            bookAuthor.appendChild(doc.createTextNode("Hesj"));
            book.appendChild(bookAuthor);
            bookISBN = doc.createElement("bookISBN");
            bookISBN.appendChild(doc.createTextNode("7-145-10241-3"));
            book.appendChild(bookISBN);
            bookPrice = doc.createElement("bookPrice");
            bookPrice.appendChild(doc.createTextNode("77.8"));
            book.appendChild(bookPrice);
            Node books = doc.getElementsByTagName("Books").item(0);
            books.appendChild(book);

            // delete
            // books.removeChild(book);

            // modify
            NodeList allBook = doc.getElementsByTagName("Book");
            Element opBook = null;
            for (int i = 0; i < allBook.getLength(); i++) {
                opBook = (Element) allBook.item(i);
                // The body of this loop is missing from the source listing;
                // the change to opBook would be made here.
            }

            // Wrap the tree as a source and write it out through a Transformer
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer tr = tf.newTransformer();
            DOMSource ds = new DOMSource(doc);
            StreamResult sr = new StreamResult(System.out);
            tr.transform(ds, sr);
        }
    }

Here we first create a Book node and add it to the Document, and then show how to delete and modify nodes.

Simple API for XML (SAX)
To solve the problem that the DOM parser needs a large amount of memory, a scheme was needed that does not load the entire XML document and generate a tree.
SAX was proposed as such a scheme: it reads the XML document one segment at a time and throws an event for each piece it recognizes. Because SAX does not build a tree structure in memory or create any objects to describe the XML, it saves memory; it simply parses the XML file. To program with SAX it is necessary to understand the events that occur. Throwing the events is the parser's job and handling them is the application's job, so implementing the event interface and filling in the event code is up to us. An event can be handled or ignored. However, since SAX does not create any objects, no state is kept between events; if state must be maintained across several events, that is also the application's responsibility.

SAX event description: Events are the core of SAX programming. Five events are commonly used:
• startDocument(): indicates that the parser has begun to parse the document. It passes no parameters, so there is no XML document data to process.
• endDocument(): indicates that the parser has finished parsing the XML document. These first two events therefore do little actual work.
• startElement(...): tells you that the parser has found a start tag. The event gives you the name of the element, the names and values of all its attributes, and some namespace information.
• characters(...): tells you that the parser has found some text. You receive a character array, an offset into the array, and a length; with these three values you can access the text the parser found.
• endElement(...): tells you that the parser has found an end tag. The event gives you the name of the element and the related namespace information.
startElement(), characters(), and endElement() are the most important events, and most of the work is done in these three.
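Since the parser keeps no state between events, the handler has to. A minimal sketch of that idea follows: it keeps a stack of currently open elements so that each startElement event knows where it is in the document. The class SaxPathTracker and the path notation are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPathTracker extends DefaultHandler {

    // State kept by the application: the elements that are currently open.
    private final Deque<String> openElements = new ArrayDeque<>();
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        openElements.addLast(qName);
        paths.add("/" + String.join("/", openElements));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();
    }

    // Returns the path of every element in the document, in document order.
    static List<String> pathsOf(String xml) {
        SaxPathTracker tracker = new SaxPathTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), tracker);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return tracker.paths;
    }

    public static void main(String[] args) {
        System.out.println(pathsOf("<Books><Book><bookName>J2EE</bookName></Book></Books>"));
    }
}
```

The stack grows in startElement and shrinks in endElement, so its contents always describe the nesting at the moment an event arrives.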
All of these events belong to the ContentHandler interface. The other SAX interfaces handle errors, entities, and other content, and are rarely used.

The startElement() event
The startElement() event tells you that the SAX parser has found the start tag of an element. The event has four parameters:
• String uri: the namespace URI. Since my XML document does not use namespaces, I do not discuss its meaning here and it can be ignored. For more about namespaces, refer to other XML material.
• String localName: the name of the element without the namespace prefix.
• String qualifiedName: the qualified name of the element, that is, the combination of the namespace prefix and the element's local name.
• org.xml.sax.Attributes attributes: a collection containing all attributes of this element. This object provides several ways to get the names and values of the attributes, as well as the number of attributes the element has.
If your XML application is looking for the content of a particular element, the startElement() event tells you when that element begins.

The characters() event
The characters() event contains the characters found in the source file. In the interest of minimizing memory use, this event delivers a character array, which is much more lightweight than a Java String object. The parameters of the characters() event are:
• char[] characters: the character array found by the parser.
• int start: the index in the characters array of the first character that belongs to this event.
• int length: the number of characters in this event.
If your XML application needs to store the content of a particular element, you can put the code that stores that content in the characters() event handler.

The endElement() event
The endElement() event tells you that the parser has found the end tag of an element. It has three parameters: String uri, String localName, and String qualifiedName, with the same meaning as in startElement(). The typical response to this event is to update state information in the XML application. For example, my program prints the XML data it has read to the screen.
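The parameters described above can be inspected with a short sketch. Unlike the article's example, it switches the factory to namespace-aware mode so that uri and localName are filled in, and it appends every characters() slice to a buffer, because a parser is allowed to deliver one run of text in several calls. The class SaxParametersDemo, the namespace urn:example:books, and the sample element are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParametersDemo extends DefaultHandler {

    final List<String> starts = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Record the four startElement parameters; only the "id" attribute is read here.
        starts.add(uri + "|" + localName + "|" + qName + "|" + attributes.getValue("id"));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) of the array belongs to this event.
        text.append(ch, start, length);
    }

    static SaxParametersDemo parse(String xml) {
        SaxParametersDemo handler = new SaxParametersDemo();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for uri and localName to be reported
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        SaxParametersDemo result =
                parse("<b:Book xmlns:b=\"urn:example:books\" id=\"1\">Java &amp; XML</b:Book>");
        System.out.println(result.starts);
        System.out.println(result.text);
    }
}
```

Appending in characters() and reading the buffer later avoids losing text when the parser splits it, for instance around an entity reference such as &amp;.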
Next we write a SAX application. Like the DOM example, it prints the information parsed from the XML document. The code is as follows:

    import java.io.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;

    public class SaxXml extends DefaultHandler {
        private String elementName;   // name of the element currently being read
        private int id;
        private String bookName;
        private String bookAuthor;
        private String bookISBN;
        private String bookPrice;

        public SaxXml() {
            this.elementName = "";
            this.id = 0;
            this.bookName = "";
            this.bookAuthor = "";
            this.bookISBN = "";
            this.bookPrice = "";
        }

        public void startDocument() {
            System.out.println("document begin");
        }

        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // Remember which element we are in; print the id of each Book
            this.elementName = qName;
            if (qName.equals("Book"))
                System.out.println(attributes.getValue(0));
        }

        public void characters(char[] ch, int start, int length) {
            String str = new String(ch, start, length);
            // Store non-blank text in the field that matches the current element
            if (this.elementName.equals("bookName") && !str.trim().equals("")) {
                this.bookName = str;
            }
            if (this.elementName.equals("bookAuthor") && !str.trim().equals("")) {
                this.bookAuthor = str;
            }
            if (this.elementName.equals("bookISBN") && !str.trim().equals("")) {
                this.bookISBN = str;
            }
            if (this.elementName.equals("bookPrice") && !str.trim().equals("")) {
                this.bookPrice = str;
            }
        }

        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("bookName")) {
                System.out.println(this.bookName);
            }
            if (qName.equals("bookAuthor")) {
                System.out.println(this.bookAuthor);
            }
            if (qName.equals("bookISBN")) {
                System.out.println(this.bookISBN);
            }
            if (qName.equals("bookPrice")) {
                System.out.println(this.bookPrice);
            }
            if (qName.equals("Book")) {
                System.out.println();
            }
            this.elementName = "";
        }

        public void endDocument() {
            System.out.println("Document End");
            System.exit(0);
        }

        public static void main(String[] args) throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(new File("BookXml.xml"), new SaxXml());
        }
    }

This code reads an XML document through events. Pay attention to the handling of whitespace: I solved it with the trim method of String, though other approaches are possible. The program records the current element in startElement, and characters acts on that state, mainly by storing the text it reads in the object's fields. Those fields are then read in the endElement event.

The development steps for SAX are as follows:
1. Implement the ContentHandler interface and fill in the event code (here I extend the DefaultHandler class, which implements ContentHandler).
2. Create a SAX parser factory.
3. Create a SAX parser through the factory.
4. Use the SAX parser to load the XML document, passing it an instance of the class that implements the ContentHandler interface.
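As a variation on the four steps, the handler can collect the data instead of printing it, which makes the result easier to reuse. The sketch below is one possible shape, not the article's program: the class SaxBookReader, the use of a map per book, and the in-memory sample are assumptions, while the element names follow the BookXml.xml structure used throughout.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Step 1: implement the handler by extending DefaultHandler.
public class SaxBookReader extends DefaultHandler {

    private final List<Map<String, String>> books = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current; // the Book being read, or null outside a Book

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for this element
        if ("Book".equals(qName)) {
            current = new LinkedHashMap<>();
            current.put("id", attributes.getValue("id"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Book".equals(qName)) {
            books.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim()); // e.g. bookName, bookAuthor
        }
    }

    static List<Map<String, String>> read(String xml) {
        SaxBookReader handler = new SaxBookReader();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();   // step 2
            SAXParser parser = factory.newSAXParser();                   // step 3
            parser.parse(new InputSource(new StringReader(xml)), handler); // step 4
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.books;
    }

    public static void main(String[] args) {
        System.out.println(read("<Books><Book id=\"1\"><bookName>A</bookName></Book>"
                + "<Book id=\"2\"><bookName>B</bookName></Book></Books>"));
    }
}
```

Trimming happens once, in endElement, after all the text for the element has arrived. This design assumes simple elements that contain either text or child elements, not both.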