Dom4j Introduction

Author: ice clouds icecloud (AT) sina.com

Time: 2003.12.15

Copyright Notice:

This article was written by ice clouds and first published on 9CBS. It may not be used commercially without permission.

The code samples are adapted from the dom4j documentation.

Reprints are welcome, but please keep the article and this copyright notice intact.

To get in touch, send an email to icecloud (at) sina.com

dom4j is an open source XML parsing package from dom4j.org. Its website describes it as follows:

dom4j is an easy to use, open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework and with full support for DOM, SAX and JAXP.

dom4j is very simple to use. As long as you understand the basic XML DOM model, you can use it. Its own guide is only a single short HTML page, although it covers a good deal. Chinese-language material is scarce, so I wrote this short tutorial for convenience. It covers only basic usage; for anything deeper, explore on your own or look for other material.

I once read an article on IBM developerWorks (see the references) that compared the performance of several XML parsing packages. dom4j performed very well and ranked near the top in many of the tests. (The official dom4j documentation also cites this comparison.) So for this project I used dom4j as the XML parsing tool.

In China, JDOM is the more popular parser. Both are excellent, but dom4j's most distinctive feature is its heavy use of interfaces, which is the main reason it is considered more flexible than JDOM. Didn't the masters tell us to "program to an interface"? More and more people are using dom4j. If you are comfortable with JDOM, keep using it and read this article just for understanding and comparison; if you are about to choose a parser, you may as well pick dom4j.

Its main interfaces are defined in the org.dom4j package:

    Attribute              defines an XML attribute
    Branch                 defines common behavior for nodes that can contain child nodes, such as XML elements (Element) and documents (Document)
    CDATA                  defines an XML CDATA section
    CharacterData          a marker interface that identifies character-based nodes, such as CDATA, Comment and Text
    Comment                defines the behavior of an XML comment
    Document               defines an XML document
    DocumentType           defines an XML DOCTYPE declaration
    Element                defines an XML element
    ElementHandler         defines a handler for Element objects
    ElementPath            used by ElementHandler to obtain the path information of the level currently being processed
    Entity                 defines an XML entity
    Node                   defines the polymorphic behavior of all XML nodes in dom4j
    NodeFilter             defines the behavior of a filter or predicate produced by dom4j nodes
    ProcessingInstruction  defines an XML processing instruction
    Text                   defines an XML text node
    Visitor                used to implement the Visitor pattern
    XPath                  provides an XPath expression after parsing a string

The names alone give a rough idea of what each one is for.

To understand these interfaces, the key is to understand their inheritance hierarchy:

    interface java.lang.Cloneable
        interface org.dom4j.Node
            interface org.dom4j.Attribute
            interface org.dom4j.Branch
                interface org.dom4j.Document
                interface org.dom4j.Element
            interface org.dom4j.CharacterData
                interface org.dom4j.CDATA
                interface org.dom4j.Comment
                interface org.dom4j.Text
            interface org.dom4j.DocumentType
            interface org.dom4j.Entity
            interface org.dom4j.ProcessingInstruction

At a glance, many things become clear: most of the interfaces extend Node. Once you know these relationships, you are much less likely to run into a ClassCastException.
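As a rough illustration of why the hierarchy matters, the sketch below walks the direct children of a root element and checks each node's interface with instanceof before casting. The class name NodeKinds and the sample XML are made up for this example, and it assumes dom4j is on the classpath.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.Text;

// Hypothetical helper, not part of dom4j: labels each child node by its interface.
public class NodeKinds {

    /** Returns one letter per direct child of the root: E = Element, T = Text, ? = anything else. */
    public static String describe(String xml) throws DocumentException {
        Document doc = DocumentHelper.parseText(xml);
        Element root = doc.getRootElement();
        StringBuilder kinds = new StringBuilder();
        for (int i = 0, size = root.nodeCount(); i < size; i++) {
            Node node = root.node(i);
            if (node instanceof Element) {
                // Safe to cast: the node really is an Element
                Element child = (Element) node;
                kinds.append('E');
            } else if (node instanceof Text) {
                kinds.append('T');
            } else {
                // Comment, CDATA, ProcessingInstruction, Entity, ...
                kinds.append('?');
            }
        }
        return kinds.toString();
    }

    public static void main(String[] args) throws DocumentException {
        System.out.println(describe("<root><a/>hello<b/></root>"));
    }
}
```

For the sample input the children are an element, a text node and another element, so the result is "ETE". Casting the middle node to Element without the check is exactly the kind of mistake that produces a ClassCastException.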

Some examples follow (partly taken from dom4j's own documentation) to show briefly how to use it.

1. Read and parse the XML document:

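Reading and writing XML documents relies mainly on the org.dom4j.io package, which provides two different readers, DOMReader and SAXReader. DOMReader converts an existing W3C DOM tree and SAXReader parses from a SAX source, but both are called in the same way and both return a dom4j Document. This is the benefit of programming to interfaces.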

    // Read from a file: takes a file name and returns an XML document
    public Document read(String fileName) throws MalformedURLException, DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File(fileName));
        return document;
    }

The reader's read method is overloaded and can read from several kinds of input, such as an InputStream, a File or a URL. The resulting Document object represents the entire XML document. In my experience, the character encoding used when reading follows the encoding declared in the XML file header. If you run into garbled text, make sure the encoding name is consistent everywhere.
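The encoding remark can be checked with a small experiment. The sketch below, which assumes dom4j is on the classpath and uses a made-up class name EncodingDemo, encodes the same text under two different charsets, declares the matching encoding in each XML header, and reads both through the InputStream overload of read.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Hypothetical demo class, not part of dom4j.
public class EncodingDemo {

    /** Builds a tiny XML document whose header declares the charset used for the bytes. */
    public static byte[] encode(String text, Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>"
                + "<greeting>" + text + "</greeting>";
        return xml.getBytes(charset);
    }

    /** Parses raw bytes; the parser decodes them using the encoding named in the header. */
    public static String rootText(byte[] xmlBytes) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new ByteArrayInputStream(xmlBytes));
        return document.getRootElement().getText();
    }

    public static void main(String[] args) throws DocumentException {
        String text = "caf\u00e9"; // contains one non-ASCII character
        System.out.println(rootText(encode(text, StandardCharsets.UTF_8)).equals(text));
        System.out.println(rootText(encode(text, StandardCharsets.ISO_8859_1)).equals(text));
    }
}
```

Both reads recover the original text because the declared encoding matches the bytes. If the header named one encoding while the bytes were written in another, the parser would either report an error or produce garbled characters, which is the situation the paragraph above warns about.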

2. Get the root node

The second step after reading is to get the root element. Anyone familiar with XML knows that all XML processing starts from the root element.

    public Element getRootElement(Document doc) {
        return doc.getRootElement();
    }

3. Traverse the XML tree

dom4j provides at least three ways to traverse nodes:

1) Enumeration (Iterator)

    // Iterate over all child elements
    for (Iterator i = root.elementIterator(); i.hasNext();) {
        Element element = (Element) i.next();
        // do something
    }

    // Iterate over the child elements named "foo"
    for (Iterator i = root.elementIterator("foo"); i.hasNext();) {
        Element foo = (Element) i.next();
        // do something
    }

    // Iterate over the attributes
    for (Iterator i = root.attributeIterator(); i.hasNext();) {
        Attribute attribute = (Attribute) i.next();
        // do something
    }

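2) Recursion

Recursion can also use an Iterator for enumeration, but the documentation offers another approach:

    public void treeWalk(Document document) {
        treeWalk(document.getRootElement());
    }

    public void treeWalk(Element element) {
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                treeWalk((Element) node);
            } else {
                // do something
            }
        }
    }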

3) Visitor pattern

The most exciting thing is that dom4j supports the Visitor pattern, which can greatly reduce the amount of code and is easy to follow. Anyone who knows design patterns knows that Visitor is one of the GoF patterns. Its main principle is that two kinds of classes hold references to each other: one acts as the Visitor and visits many Visitable objects. Let's look at the Visitor pattern in dom4j (the quick-start documentation does not cover it).

Just write a class that implements the Visitor interface.

    public class MyVisitor extends VisitorSupport {
        public void visit(Element element) {
            System.out.println(element.getName());
        }

        public void visit(Attribute attr) {
            System.out.println(attr.getName());
        }
    }

    Call: root.accept(new MyVisitor());

The Visitor interface provides several visit() overloads, and each kind of XML object is visited through its own overload. The example above gives simple implementations for Element and Attribute, which are the ones most commonly used. VisitorSupport is the default adapter that dom4j provides for the Visitor interface (the Default Adapter pattern); it supplies empty implementations of all the visit(*) methods to simplify your code.

Note that the Visitor automatically traverses all child nodes. With root.accept(myVisitor), every node under root is visited. The first time I used it, I assumed I had to do the traversal myself and called the Visitor inside a recursive walk; you can imagine the result.
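The automatic traversal is easy to confirm with a visitor that only counts what it sees. This is a minimal sketch, assuming dom4j is on the classpath; the class name CountingVisitor and the sample XML are invented for illustration.

```java
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.VisitorSupport;

// Hypothetical visitor that counts the elements and attributes it is shown.
public class CountingVisitor extends VisitorSupport {
    private int elements;
    private int attributes;

    @Override
    public void visit(Element element) {
        elements++;
    }

    @Override
    public void visit(Attribute attribute) {
        attributes++;
    }

    public int getElements() {
        return elements;
    }

    public int getAttributes() {
        return attributes;
    }

    /** Parses the XML and makes a single accept() call on the root element. */
    public static CountingVisitor count(String xml) throws DocumentException {
        Document doc = DocumentHelper.parseText(xml);
        CountingVisitor visitor = new CountingVisitor();
        doc.getRootElement().accept(visitor); // no manual recursion needed
        return visitor;
    }

    public static void main(String[] args) throws DocumentException {
        CountingVisitor v = count("<root id=\"1\"><a><b/></a><c x=\"y\"/></root>");
        System.out.println(v.getElements() + " elements, " + v.getAttributes() + " attributes");
    }
}
```

A single accept() call on the root reaches root, a, b and c, so the visitor counts 4 elements and 2 attributes. Wrapping the same call in a hand-written recursive walk would visit the nested nodes more than once.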

4. XPath support

dom4j has good XPath support. For example, to reach a node you can select it directly with an XPath expression.

    public void bar(Document document) {
        List list = document.selectNodes("//foo/bar");
        Node node = document.selectSingleNode("//foo/bar/author");
        String name = node.valueOf("@name");
    }

For example, to find all the hyperlinks in an XHTML document, the following code does the job:

    public void findLinks(Document document) throws DocumentException {
        List list = document.selectNodes("//a/@href");
        for (Iterator iter = list.iterator(); iter.hasNext();) {
            Attribute attribute = (Attribute) iter.next();
            String url = attribute.getValue();
        }
    }
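To see the hyperlink query return something concrete, here is a small self-contained sketch. It assumes dom4j and its XPath engine (the jaxen library) are on the classpath; the class name XPathDemo and the sample markup are made up. The sample deliberately has no XML namespace: in a real XHTML document with a default namespace, an unprefixed step such as //a would not match, and the namespace would have to be bound to a prefix first.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Hypothetical demo class, not part of dom4j.
public class XPathDemo {

    /** Collects the value of every href attribute found on an a element. */
    public static List<String> hrefs(String markup) throws DocumentException {
        Document document = DocumentHelper.parseText(markup);
        List<String> urls = new ArrayList<>();
        // "//a/@href" selects attribute nodes, so each result is an Attribute
        for (Object result : document.selectNodes("//a/@href")) {
            urls.add(((Attribute) result).getValue());
        }
        return urls;
    }

    public static void main(String[] args) throws DocumentException {
        String page = "<html><body>"
                + "<a href=\"http://a.example/\">A</a>"
                + "<p><a href=\"b.html\">B</a></p>"
                + "</body></html>";
        System.out.println(hrefs(page));
    }
}
```

The results come back in document order, so the sample page yields http://a.example/ followed by b.html.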

5. Converting between strings and XML

You often need to convert a string to XML, or the other way around:

    // XML to string
    Document document = ...;
    String text = document.asXML();

    // String to XML
    String text = "<person> <name>James</name> </person>";
    Document document = DocumentHelper.parseText(text);
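The two conversions can be chained into a round trip. The sketch below assumes dom4j is on the classpath; the class name RoundTrip is invented for the example. It parses a string, serializes the document with asXML(), parses the result again, and reads a value back.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Hypothetical demo class, not part of dom4j.
public class RoundTrip {

    /** Parses, serializes, re-parses, and returns the text of the root's name child. */
    public static String nameAfterRoundTrip(String xml) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);   // string -> XML
        String serialized = document.asXML();                // XML -> string
        Document reparsed = DocumentHelper.parseText(serialized);
        return reparsed.getRootElement().elementText("name");
    }

    public static void main(String[] args) throws DocumentException {
        System.out.println(nameAfterRoundTrip("<person> <name>James</name> </person>"));
    }
}
```

The serialized string is not necessarily identical to the input (asXML() on a Document also emits an XML declaration), which is why the check compares parsed content rather than raw text.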

6. Transform XML with XSLT

    public Document styleDocument(Document document, String stylesheet) throws Exception {
        // load the transformer using JAXP
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));

        // now style the given document
        DocumentSource source = new DocumentSource(document);
        DocumentResult result = new DocumentResult();
        transformer.transform(source, result);

        // return the transformed document
        Document transformedDoc = result.getDocument();
        return transformedDoc;
    }

7. Create XML

Creating XML is usually the step before writing a file, and it is as easy as using a StringBuffer.

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");

        Element author1 = root.addElement("author")
            .addAttribute("name", "James")
            .addAttribute("location", "UK")
            .addText("James Strachan");

        Element author2 = root.addElement("author")
            .addAttribute("name", "Bob")
            .addAttribute("location", "US")
            .addText("Bob McWhirter");

        return document;
    }
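As a quick check that a document built this way contains what was intended, the following sketch builds the same two authors and then reads them back with the iterator approach from section 3. It assumes dom4j is on the classpath; the class name BuildAndInspect and the summary format are made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class, not part of dom4j.
public class BuildAndInspect {

    /** Builds a root element with two author children using chained add calls. */
    public static Document build() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");
        root.addElement("author")
            .addAttribute("name", "James")
            .addAttribute("location", "UK")
            .addText("James Strachan");
        root.addElement("author")
            .addAttribute("name", "Bob")
            .addAttribute("location", "US")
            .addText("Bob McWhirter");
        return document;
    }

    /** Returns one summary line per author element, in document order. */
    public static List<String> authors(Document document) {
        List<String> lines = new ArrayList<>();
        Element root = document.getRootElement();
        for (Iterator<?> i = root.elementIterator("author"); i.hasNext();) {
            Element author = (Element) i.next();
            lines.add(author.attributeValue("name")
                    + " (" + author.attributeValue("location") + "): "
                    + author.getText());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : authors(build())) {
            System.out.println(line);
        }
    }
}
```

Each add call returns the element it was invoked on, which is what makes the chained style work.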

8. Document output

The simplest way to produce output is to write a Document, or any Node, with its write method.

    FileWriter out = new FileWriter("foo.xml");
    document.write(out);
    out.close(); // flush and release the file

To change the output format, for example to pretty-printed or compact output, use the XMLWriter class:

    public void write(Document document) throws IOException {
        // write to the specified file
        XMLWriter writer = new XMLWriter(new FileWriter("output.xml"));
        writer.write(document);
        writer.close();

        // pretty-printed format
        OutputFormat format = OutputFormat.createPrettyPrint();
        writer = new XMLWriter(System.out, format);
        writer.write(document);

        // compact format
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter(System.out, format);
        writer.write(document);
    }

As you can see, dom4j is simple enough. Of course, there are more complex uses that are not covered here, such as ElementHandler. If this has caught your interest, give dom4j a try.

dom4j official website:

http://www.dom4j.org

dom4j download (SourceForge); the latest version at the time of writing is 1.4

http://sourceforge.net/projects/dom4j

Reference:

DOM4J document

XML in Java: Document Models, Part 1: Performance

http://www-900.ibm.com/developerWorks/cn/xml/x-injava/index.shtml

XML in Java: Usage of Java Document Models

http://www-900.ibm.com/developerWorks/cn/xml/x-injava2/index.shtml

A Casual Talk on Java XML APIs, by Robbin

http://www.hibernate.org.cn:8000/137.html

When reprinting, please cite the original address: https://www.9cbs.com/read-4954.html
