Four Common Ways to Update XML Documents in Java

xiaoxiao2021-03-06  34

This article briefly discusses four common methods for updating XML documents in Java programs and analyzes their advantages and disadvantages. It also explains how to control the format of the XML document that a Java program outputs.

JAXP stands for Java API for XML Processing, a Java programming interface for processing XML documents. JAXP supports DOM, SAX, XSLT and other standards. To make JAXP more flexible, its designers gave it a pluggability layer. With this layer, JAXP can work with any XML parser that implements the DOM and SAX APIs (for example Apache Xerces) and with any XSLT processor that implements the XSLT standard (for example Apache Xalan). The benefit of the pluggability layer is that we only need to know the programming interfaces that JAXP defines, without needing a deep understanding of the specific XML parser or XSLT processor in use. For example, suppose a Java program uses the Apache Crimson XML parser through JAXP to process XML documents. If we want to switch to another XML parser (such as Apache Xerces) to improve performance, the program code may not need to change at all: just add the JAR file containing Apache Xerces to the CLASSPATH environment variable and remove the JAR file containing Apache Crimson from it.
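The pluggability layer can be observed directly. The following is a minimal sketch, assuming a JDK that bundles a JAXP implementation; the class name ParserLookup is hypothetical. The code touches only JAXP factory interfaces and prints whichever concrete classes were actually loaded.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical demo class: shows which concrete implementations JAXP resolved.
class ParserLookup {

    // Returns the class name of the DOM parser factory that JAXP selected.
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Returns the class name of the XSLT processor factory that JAXP selected.
    static String xsltFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM parser factory: " + domFactoryName());
        System.out.println("XSLT factory:       " + xsltFactoryName());
    }
}
```

The printed names depend on what is installed. Besides changing the class path, the parser can also be selected with the javax.xml.parsers.DocumentBuilderFactory system property, for example by pointing it at org.apache.xerces.jaxp.DocumentBuilderFactoryImpl when the Xerces JAR is present.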

JAXP is now very widely used and can fairly be called the standard API for processing XML documents in Java. Beginners learning JAXP often run into this problem: the program updates the DOM tree, but when it exits the original XML document has not changed and looks just as it did before. How can the original XML document be kept in sync with the DOM tree? At first glance JAXP seems to offer no interface, method or class for this, which puzzles many beginners. The main purpose of this article is to solve this problem by briefly introducing several commonly used ways to update the original XML document in sync with the DOM tree. To narrow the scope, the only XML parsers covered here are Apache Crimson and Apache Xerces, and the only XSLT processor is Apache Xalan.
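The beginner's problem is easy to reproduce. The sketch below assumes a modern JDK (Java 8 or later); the class name StaleFileDemo, the temporary file and the element name fancy are illustrative only. It parses a file, changes the DOM tree, and then reads the file back to show that nothing was written.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: a DOM edit changes memory only, never the file.
class StaleFileDemo {

    // Parses a temporary file, edits the tree, and returns the file's text afterwards.
    static String fileTextAfterDomEdit(String xml) throws Exception {
        File file = File.createTempFile("user", ".xml");
        file.deleteOnExit();
        Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(file);
        // This change lives only in the in-memory DOM tree.
        doc.getDocumentElement().appendChild(doc.createElement("fancy"));

        // Re-read the file: it still holds the original text.
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fileTextAfterDomEdit("<users><user>tom</user></users>"));
    }
}
```

The DOM tree is a copy built by the parser, so some explicit step must serialize it back to the file. The four methods in this article are different ways of performing that step.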

Method 1: Reading and writing the XML document directly

This may be the clumsiest and most primitive approach. After the program obtains the DOM tree and updates it through the methods of the DOM Node interface, the next step is to update the original XML document. We can traverse the whole DOM tree, either recursively or with the TreeWalker class, and write every node and element to the original XML document, which has been opened for writing beforehand. When the traversal finishes, the DOM tree and the original XML document are in sync. In practice this method is rarely used, but if you are writing your own XML parser it may still be useful.
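The recursive variant of this approach might look like the sketch below. It is an illustration rather than a complete serializer: the class name NaiveDomWriter and its methods are hypothetical, and only document, element, attribute and text nodes are handled. Comments, CDATA sections, processing instructions, the DOCTYPE and the XML declaration are skipped, which is part of why this method is error-prone.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: writes a DOM tree by hand with a recursive walk.
class NaiveDomWriter {

    // Writes the node and, recursively, all of its children to the writer.
    static void write(Node node, Writer out) throws IOException {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                write(((Document) node).getDocumentElement(), out);
                break;
            case Node.ELEMENT_NODE:
                out.write("<" + node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    out.write(" " + a.getNodeName() + "=\"" + escape(a.getNodeValue()) + "\"");
                }
                out.write(">");
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    write(c, out);
                }
                out.write("</" + node.getNodeName() + ">");
                break;
            case Node.TEXT_NODE:
                out.write(escape(node.getNodeValue()));
                break;
            default:
                break; // other node types are skipped in this sketch
        }
    }

    // Escapes the characters that are unsafe in text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Parses an XML string, appends an empty child element, and serializes by hand.
    static String appendAndSerialize(String xml, String childName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().appendChild(doc.createElement(childName));
        StringWriter out = new StringWriter();
        write(doc, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(appendAndSerialize(
                "<users><user id=\"1\">Tom &amp; Ann</user></users>", "fancy"));
    }
}
```

To update a real file, the StringWriter would be replaced by a Writer opened on the original document. Every detail the sketch omits would have to be written and tested by hand.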

Method 2: Using the XmlDocument class

Use the XmlDocument class? JAXP has no such class! Is the author mistaken? No. We really do use the XmlDocument class, or more precisely its write() method.

As mentioned above, JAXP can be combined with a variety of XML parsers; this time the parser we choose is Apache Crimson. XmlDocument (org.apache.crimson.tree.XmlDocument) is an Apache Crimson class. It is not part of standard JAXP, which is why you cannot find it in the JAXP documentation. So how do we use the XmlDocument class to update the XML document? The class provides the following three write() methods (as of the latest Crimson release, Apache Crimson 1.1.3):

public void write(OutputStream out) throws IOException

public void write(Writer out) throws IOException

public void write(Writer out, String encoding) throws IOException

These three write() methods output the content of the DOM tree to a given output medium, such as a file output stream or the application console. How are they used? Please see the following Java code snippet:

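import java.io.File;
import java.io.FileOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.crimson.tree.XmlDocument;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

String name = "fancy";
DocumentBuilder parser;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try
{
    parser = factory.newDocumentBuilder();
    // Parse user.xml into a DOM tree.
    Document doc = parser.parse("user.xml");
    // Create a new element and append it to the root element.
    Element newlink = doc.createElement(name);
    doc.getDocumentElement().appendChild(newlink);
    // Cast to Crimson's XmlDocument to reach write(); this only works
    // when Crimson is the parser behind JAXP.
    ((XmlDocument) doc).write(new FileOutputStream(new File("xuser.xml")));
}
catch (Exception e)
{
    // to log it
}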

The code above first creates a Document object doc to obtain the complete DOM tree, then uses the appendChild() method of the Node interface to add a new node at the end of the tree, and finally calls the write(OutputStream out) method of the XmlDocument class to output the content of the DOM tree to xuser.xml. (To actually update the original document you would output to user.xml; here we output to xuser.xml so that the two files are easy to compare.) Note that write() cannot be called directly on the Document object doc, because JAXP's Document interface does not define any write() method. You must first cast doc from Document to XmlDocument and then call write(). The code above uses write(OutputStream out), which outputs the DOM tree to the output medium in the default UTF-8 encoding. If the DOM tree contains Chinese characters, the result may appear garbled; this is the so-called Chinese character problem. The solution is to use the write(Writer out, String encoding) method and specify the output encoding explicitly, for example by setting the second parameter to GB2312. The Chinese character problem then disappears and the output displays Chinese characters correctly.

For a complete example, see the following files: AddRecord.java (see attachment), user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To run AddRecord.java, download Apache Crimson from http://xml.apache.org/dist/crimson/ and add the resulting crimson.jar file to the CLASSPATH environment variable.

Note:

The predecessor of Apache Crimson is the Sun Project X parser. For reasons unknown to the author, the X parser later evolved into Apache Crimson, and much of Crimson's code was carried over directly from it. The XmlDocument class used above, for example, was com.sun.xml.tree.XmlDocument in the X parser and became org.apache.crimson.tree.XmlDocument in Apache Crimson. The code of the two is essentially the same; the differences are probably limited to the package statement, the import statements and the header at the top of the file. Early versions of JAXP were bundled with the X parser, so some old programs use the com.sun.xml packages. If you recompile such a program today and it fails, this is almost certainly the reason. Later versions of JAXP, such as JAXP 1.1, were bundled with Apache Crimson, so with JAXP 1.1 you can compile the example (AddRecord.java) without downloading Apache Crimson separately. The latest JAXP 1.2 EA (Early Access) changes course and uses the better-performing Apache Xalan and Apache Xerces as its XSLT processor and XML parser, and it does not directly support Apache Crimson. If your development environment uses JAXP 1.2 EA or the Java XML Pack (which includes JAXP 1.2 EA), you cannot compile the example (AddRecord.java) directly; you need to download and install Apache Crimson first.

Method 3: Using the TransformerFactory and Transformer classes

The standard way within JAXP to update the original XML document is to invoke the XSLT engine, that is, to use the TransformerFactory and Transformer classes. Please see the Java code snippet below:

import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

// First create a DOMSource object; its constructor accepts a Document object.
// doc is the modified DOM tree.
DOMSource doms = new DOMSource(doc);
// Create a File object representing the output medium for the data in the
// DOM tree; here it is an XML file.
File f = new File("xmloutput.xml");
// Create a StreamResult object; its constructor accepts a File object.
StreamResult sr = new StreamResult(f);
// Next, call JAXP's XSLT engine to write the data in the DOM tree to the XML file.
// The engine's input is the DOMSource object and its output is the StreamResult object.
try
{
    // First create a TransformerFactory object, then create a Transformer object.
    // The Transformer class acts as an XSLT engine. Usually we use it to process
    // XSL files, but here we use it to output an XML document.
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    // The key step: call the transform() method of the Transformer object
    // (the XSLT engine). The first parameter is the DOMSource object and the
    // second parameter is the StreamResult object.
    t.transform(doms, sr);
}
catch (TransformerConfigurationException tce)
{
    System.out.println("Transformer Configuration Exception\n-----");
    tce.printStackTrace();
}
catch (TransformerException te)
{
    System.out.println("Transformer Exception\n---------");
    te.printStackTrace();
}

In a real application, we use the ordinary DOM API to obtain the DOM tree from the XML document, perform whatever operations are needed on the tree, and end up with the final Document object. We then create a DOMSource object from this Document object, and the rest is simply the code above. After the program runs, xmloutput.xml contains the result you need. (You can of course change the argument of the StreamResult constructor to specify a different output medium; it does not have to be an XML file.)
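Put together, the whole round trip can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached example program: the class name TransformerSaveDemo is hypothetical, and the document is parsed from a string and written to a StringWriter so that the sketch runs without any files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: parse, modify, then serialize with the XSLT engine.
class TransformerSaveDemo {

    // Appends an empty child element to the root and returns the serialized document.
    static String appendAndSerialize(String xml, String childName) throws Exception {
        // Step 1: obtain the DOM tree with the ordinary DOM API.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Step 2: modify the tree as needed.
        doc.getDocumentElement().appendChild(doc.createElement(childName));

        // Step 3: a Transformer created without a stylesheet copies its input
        // to its output unchanged, which serializes the DOM tree.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(appendAndSerialize("<users><user>tom</user></users>", "fancy"));
    }
}
```

To update a file instead, pass new StreamResult(new File("xmloutput.xml")) as the second argument of transform(), as in the snippet above.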

The greatest advantage of this method is that you can control the format in which the content of the DOM tree is written to the output medium. The TransformerFactory and Transformer classes cannot do this on their own, however; they need the help of the OutputKeys class. For a complete example, see the following files: AddRecord2.java (see attachment), user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To run it, download and install JAXP 1.1 or the Java XML Pack (which includes JAXP) from http://java.sun.com.

OutputKeys class

The javax.xml.transform.OutputKeys class works together with the java.util.Properties class to control the format of the XML document that JAXP's XSLT engine (the Transformer class) outputs. Please see the following code snippet:

// First create a TransformerFactory object, then create a Transformer object.
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
// Get the output properties of the Transformer object, that is, the default
// output properties of the XSLT engine. This is a java.util.Properties object.
Properties properties = t.getOutputProperties();
// Set a new output property: the output character encoding is GB2312, which
// supports Chinese characters. If the XML document output by the XSLT engine
// contains Chinese characters, they display correctly, without the so-called
// Chinese character problem.
// Note the string constant OutputKeys.ENCODING of the OutputKeys class.
properties.setProperty(OutputKeys.ENCODING, "GB2312");
// Update the output properties of the XSLT engine.
t.setOutputProperties(properties);
// Call the XSLT engine to output the content of the DOM tree to the output
// medium according to the output properties set above.
t.transform(DOMSource_Object, StreamResult_Object);

As the code above shows, setting the output properties of the XSLT engine (the Transformer class) lets us control the format in which the content of the DOM tree is output, which is very helpful for customizing the output. So which output properties does JAXP's XSLT engine (the Transformer class) offer? The javax.xml.transform.OutputKeys class defines a number of string constants, each of which is an output property that can be set freely. The commonly used ones are as follows:

public static final java.lang.String METHOD

Sets the output method; it can be set to values such as xml, html and text.

public static final java.lang.String VERSION

Sets the version number of the output format. If METHOD is set to xml, the value should be 1.0; if METHOD is set to html, the value should be 4.0; if METHOD is set to text, this output property is ignored.

public static final java.lang.String ENCODING

Sets the encoding used for the output, such as GB2312 or UTF-8. Setting it to GB2312 solves the so-called Chinese character problem.

public static final java.lang.String OMIT_XML_DECLARATION

Sets whether the XML declaration, that is, a line such as <?xml version="1.0" encoding="UTF-8"?>, is omitted when the XML document is output. Its value may be yes or no.

public static final java.lang.String INDENT

Sets whether the XSLT engine indents the output when it writes an XML document. Its value may be yes or no.

public static final java.lang.String MEDIA_TYPE

Sets the MIME type of the output document.
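The effect of individual properties is easiest to see by changing one at a time. The sketch below is illustrative: the class name OutputKeysDemo and its helper method are hypothetical, and it uses the standard JAXP method Transformer.setOutputProperty(name, value), which sets a single property, rather than a Properties object.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: serializes a document with one output property set.
class OutputKeysDemo {

    // Parses the XML string and serializes it with the given output property applied.
    static String serialize(String xml, String key, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(key, value);
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users><user>tom</user></users>";
        // Without the XML declaration: output starts at the root element.
        System.out.println(serialize(xml, OutputKeys.OMIT_XML_DECLARATION, "yes"));
        // With the text method: only the character data is written.
        System.out.println(serialize(xml, OutputKeys.METHOD, "text"));
        // With indentation: the exact layout depends on the XSLT processor.
        System.out.println(serialize(xml, OutputKeys.INDENT, "yes"));
    }
}
```

The amount of whitespace produced by INDENT is left to the XSLT processor, so different processors or versions may lay out the same document differently.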

How do you set the output properties of the XSLT engine? To summarize:

First, get the set of default output properties of the XSLT engine (the Transformer class) by calling the getOutputProperties() method of the Transformer class. The return value is a java.util.Properties object.

Properties properties = transformer.getOutputProperties();

Then set the new output properties, for example:

properties.setProperty(OutputKeys.ENCODING, "GB2312");
properties.setProperty(OutputKeys.METHOD, "html");
properties.setProperty(OutputKeys.VERSION, "4.0");
...

Finally, update the output properties of the XSLT engine (the Transformer class) by calling the setOutputProperties() method of the Transformer class, whose parameter is a java.util.Properties object.
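The three steps can be combined into one runnable sketch. The class name EncodingDemo and its helper are hypothetical. The output goes to a byte array so that the chosen encoding really determines the bytes, and the main method falls back to UTF-8 on the assumption that some Java runtimes may not ship the GB2312 charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: get the properties, change them, set them back.
class EncodingDemo {

    // Serializes the XML string to bytes in the requested encoding.
    static byte[] serialize(String xml, String encoding) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer t = TransformerFactory.newInstance().newTransformer();

        // Step 1: get the engine's default output properties.
        Properties props = t.getOutputProperties();
        // Step 2: set the new output property.
        props.setProperty(OutputKeys.ENCODING, encoding);
        // Step 3: hand the updated set back to the engine.
        t.setOutputProperties(props);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // The two escaped characters below are Chinese text.
        String xml = "<users><user>\u4e2d\u6587</user></users>";
        // GB2312 is the article's choice; fall back if this runtime lacks it.
        String encoding = Charset.isSupported("GB2312") ? "GB2312" : "UTF-8";
        byte[] bytes = serialize(xml, encoding);
        System.out.println(new String(bytes, encoding));
    }
}
```

Whoever reads the file later must decode it with the encoding named in the XML declaration. A garbled display usually means that the two do not match.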

We wrote a new program that uses the OutputKeys class to control the output properties of the XSLT engine. Its structure is roughly the same as that of the previous program (AddRecord2.java), but its output is slightly different. For the complete code, see the following files: AddRecord3.java (see attachment), user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To run it, download and install JAXP 1.1 or the Java XML Pack (which includes JAXP) from http://java.sun.com.

Method 4: Using Xalan XML Serializer

Method 4 is really a variant of Method 3. It requires Apache Xalan and Apache Xerces in order to run. The example code is as follows:

import java.io.File;
import java.io.FileOutputStream;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
// Assumed package names for the Xalan 2.x releases of that period; later
// Xalan releases moved the serializer to org.apache.xml.serializer.
import org.apache.xalan.serialize.Serializer;
import org.apache.xalan.serialize.SerializerFactory;
import org.apache.xalan.templates.OutputProperties;

// First create a DOMSource object; its constructor accepts a Document object.
// doc is the modified DOM tree.
DOMSource domSource = new DOMSource(doc);
// Create a DOMResult object to hold the output of the XSLT engine temporarily.
DOMResult domResult = new DOMResult();
// Next, call JAXP's XSLT engine. Its input is the DOMSource object and its
// output is the DOMResult object.
try
{
    // First create a TransformerFactory object, then create a Transformer object.
    // The Transformer class acts as an XSLT engine. Usually we use it to process
    // XSL files, but here we use it to output an XML document.
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    // Set the properties of the XSLT engine (essential; otherwise the Chinese
    // character problem appears).
    Properties properties = t.getOutputProperties();
    properties.setProperty(OutputKeys.ENCODING, "GB2312");
    t.setOutputProperties(properties);
    // The key step: call the transform() method of the Transformer object
    // (the XSLT engine). The first parameter is the DOMSource object and the
    // second parameter is the DOMResult object.
    t.transform(domSource, domResult);
    // Create the default Xalan XML Serializer and use it to write the content
    // held in the DOMResult object to the output medium as an output stream.
    Serializer serializer = SerializerFactory.getSerializer(
        OutputProperties.getDefaultMethodProperties("xml"));
    // Set the output properties of the Xalan XML Serializer. This step is
    // essential; otherwise the Chinese character problem may also appear.
    Properties prop = serializer.getOutputFormat();
    prop.setProperty("encoding", "GB2312");
    serializer.setOutputFormat(prop);
    // Create a File object representing the output medium for the data in the
    // DOM tree; here it is an XML file.
    File f = new File("xuser3.xml");
    // Create a file output stream object fos; note the constructor argument.
    FileOutputStream fos = new FileOutputStream(f);
    // Set the output stream of the Xalan XML Serializer.
    serializer.setOutputStream(fos);
    // Serialize the result.
    serializer.asDOMSerializer().serialize(domResult.getNode());
    // Close the stream so that the file is flushed and released.
    fos.close();
}
catch (Exception tce)
{
    tce.printStackTrace();
}

This method is not commonly used and feels a little like gilding the lily, so we do not discuss it further. For a complete example, see the following files: AddRecord4.java (see attachment), user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To compile AddRecord4.java, download and install Apache Xalan and Apache Xerces from http://xml.apache.org/dist/.

Alternatively, download and install the Java XML Pack from http://java.sun.com/xml/download.html, because the latest Java XML Pack (Winter 01) includes Apache Xalan and Apache Xerces.

Conclusion:

This article has discussed four ways to update XML documents in Java programs. The first method reads and writes the XML file directly. It is cumbersome and error-prone, so do not use it unless you need to develop your own XML parser. The second method uses the XmlDocument class of Apache Crimson. It is simple and easy to use, and it is a reasonable choice if Apache Crimson is your XML parser. However, it does not appear to be very efficient (a consequence of the relatively inefficient Apache Crimson), and later versions of JAXP, the Java XML Pack and the JWSDP do not directly support Apache Crimson, so the method is not universal. The third method uses JAXP's XSLT engine (the Transformer class) to output the XML document. It is arguably the standard method and is very flexible, especially because the output format can be controlled. This is the method we recommend. The fourth method is a variant of the third. It uses the Xalan XML Serializer and introduces a serialization step, which gives it an advantage when modifying or outputting large numbers of documents. Unfortunately, the output properties have to be set twice, once for the XSLT engine and once for the XML Serializer, which is a nuisance, and the dependence on Apache Xalan and Apache Xerces makes the method somewhat less general. Besides the four methods discussed above, other APIs (such as JDOM, Castor, XML4J and Oracle XML Parser V2) offer many more ways to update XML documents. For reasons of space they are not discussed here.

