Updating XML Documents in Java Programs

xiaoxiao2021-03-14  214

Author: Huang Li

This article briefly discusses four common methods for updating XML documents from a Java program and analyzes the advantages and disadvantages of each. It also explains how to control the format of the XML document that a Java program outputs.

JAXP stands for Java API for XML Processing. It is the programming interface, written in Java, for processing XML documents. JAXP supports DOM, SAX, XSLT and other standards. To make JAXP more flexible, its developers designed a Pluggability Layer for it. With the Pluggability Layer, JAXP can work with any XML parser that implements the DOM API or the SAX API (for example, Apache Xerces) and with any XSLT processor that implements the XSLT standard (for example, Apache Xalan). The benefit of the Pluggability Layer is that we only need to know JAXP's own programming interfaces; we do not need a deep understanding of the specific XML parser or XSLT processor in use. For example, suppose a Java program calls the XML parser Apache Crimson through JAXP to process XML documents. If we want to switch to another XML parser (such as Apache Xerces) to improve the program's performance, the program code may not need to change at all: it is enough to add the JAR file containing Apache Xerces to the CLASSPATH environment variable and remove the JAR file containing Apache Crimson from it. JAXP is now very widely used and can be regarded as the standard API for processing XML documents in Java.
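To make the Pluggability Layer concrete, the following minimal sketch prints which parser and XSLT processor JAXP has actually loaded. The class name JaxpProviders and its two helper methods are hypothetical and are not part of the article; only standard JAXP factory calls are used.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerFactory;

/** Hypothetical helper (not from the article): reports which implementations JAXP plugged in. */
class JaxpProviders {

    /** Name of the concrete DOM parser class hidden behind the JAXP DocumentBuilder type. */
    static String domParserClass() throws ParserConfigurationException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().getClass().getName();
    }

    /** Name of the concrete XSLT processor class hidden behind the JAXP TransformerFactory type. */
    static String xsltProcessorClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The application code names only JAXP types; the classes printed here
        // depend on which implementation JAXP finds at run time.
        System.out.println("DOM parser:     " + domParserClass());
        System.out.println("XSLT processor: " + xsltProcessorClass());
    }
}
```

The printed class names depend on the environment. Changing the JAR files on the classpath, or setting the javax.xml.parsers.DocumentBuilderFactory system property, should change the first line without any change to the program itself.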
Beginners learning JAXP often run into this problem: the program updates the DOM tree, but when the program exits the original XML document is unchanged and still looks the way it did before. How can the original XML document be kept in sync with the DOM tree? At first glance JAXP seems to offer no corresponding interface, method, or class, and this confuses many beginners. The main purpose of this article is to solve this problem by briefly introducing several common ways to synchronize the original XML document with the DOM tree. To narrow the scope of the discussion, the XML parsers covered here are limited to Apache Crimson and Apache Xerces, and the only XSLT processor used is Apache Xalan.

Method 1: Reading and writing the XML document directly

This is probably the clumsiest and most primitive method. After the program obtains the DOM tree and updates it through the methods of the DOM Node interface, the next step is to update the original XML document. We can traverse the entire DOM tree, either recursively or with the TreeWalker class, and write each node/element to the original XML document, which has been opened in advance. When the traversal is complete, the DOM tree and the original XML document are in sync. In practice this method is rarely used, but it may still be useful if you are writing your own XML parser.

Method 2: Using the XmlDocument class

Use the XmlDocument class? There is no such class in JAXP! Has the author made a mistake? No. We really do use the XmlDocument class, or more precisely its write() method. As mentioned above, JAXP can be combined with a variety of XML parsers; this time the parser we choose is Apache Crimson. XmlDocument (org.apache.crimson.tree.XmlDocument) is a class of Apache Crimson. It is not part of standard JAXP, which is why it cannot be found in the JAXP documentation.
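Returning to Method 1 for a moment: the recursive traversal it describes can be sketched as follows. This is only an illustration under simplifying assumptions, not a complete serializer. The class SimpleDomWriter and its methods are hypothetical; namespaces are not treated specially, CDATA sections are written as escaped text, and comments, processing instructions, and the DOCTYPE are dropped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper (not from the article): writes a DOM tree by recursive traversal. */
class SimpleDomWriter {

    /** Writes the given node and, recursively, all of its descendants. */
    static void write(Node node, Writer out) throws IOException {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                out.write("<?xml version=\"1.0\"?>\n");
                writeChildren(node, out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.write("<" + node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.write(" " + attr.getNodeName() + "=\"" + escape(attr.getNodeValue()) + "\"");
                }
                if (node.hasChildNodes()) {
                    out.write(">");
                    writeChildren(node, out);
                    out.write("</" + node.getNodeName() + ">");
                } else {
                    out.write("/>");
                }
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.write(escape(node.getNodeValue()));
                break;
            default:
                // Simplification: comments, processing instructions and the DOCTYPE are dropped.
                break;
        }
    }

    private static void writeChildren(Node parent, Writer out) throws IOException {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            write(child, out);
        }
    }

    /** Escapes the characters that must not appear literally in text or attribute values. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users><user id=\"1\">a &amp; b</user></users>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().appendChild(doc.createElement("fancy")); // update the tree

        // To update a real document, a Writer over the original file (opened with
        // an explicit encoding such as UTF-8) would take the place of the StringWriter.
        StringWriter out = new StringWriter();
        write(doc, out);
        System.out.println(out);
    }
}
```

Even this small sketch has to decide how to escape characters and which node types to handle, which suggests why the article calls this method primitive and why the library-based methods are usually preferred.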

So how do we use the XmlDocument class to update the XML document? The XmlDocument class provides the following three write() methods (according to the latest version of Crimson, Apache Crimson 1.1.3):

    public void write(OutputStream out) throws IOException
    public void write(Writer out) throws IOException
    public void write(Writer out, String encoding) throws IOException

The main purpose of these three write() methods is to output the contents of the DOM tree to a specific output medium, such as a file output stream or the application console. How are they used? Look at the following Java code fragment:

    String name = "fancy";
    DocumentBuilder parser;
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    try {
        parser = factory.newDocumentBuilder();
        Document doc = parser.parse("user.xml");
        Element newlink = doc.createElement(name);
        doc.getDocumentElement().appendChild(newlink);
        // Document has no write() method, so cast to Crimson's XmlDocument first.
        ((XmlDocument) doc).write(new FileOutputStream(new File("xuser1.xml")));
    } catch (Exception e) {
        // to log it
    }

The code first creates a Document object doc to obtain the complete DOM tree, then uses the appendChild() method of the Node interface to append a new node (fancy) at the end of the root element, and finally calls the write(OutputStream out) method of the XmlDocument class to output the contents of the DOM tree to xuser1.xml. (We could just as well write to user.xml and so update the original XML document; the output goes to xuser1.xml here only to make comparison easier.)

Note that write() cannot be called directly on the Document object doc, because JAXP's Document interface does not define any write() method. doc must first be cast from Document to XmlDocument, and only then can write() be called. The code above uses write(OutputStream out), which writes the DOM tree to the output medium in the default UTF-8 encoding. If the DOM tree contains Chinese characters, the output may be garbled; this is the so-called "Chinese character problem". The solution is to use the write(Writer out, String encoding) method and specify the output encoding explicitly, for example by setting the second parameter to "GB2312". The "Chinese character problem" then disappears and the output displays Chinese characters correctly.

For a complete example, please refer to the following files: addRecord.java (see attachment), user.xml (see attachment). The operating environment of this example is: Windows XP Professional, JDK 1.3.1. To run addRecord.java, you need to download Apache Crimson from http://xml.apache.org/dist/crimson/ and add the crimson.jar file to the CLASSPATH environment variable.

Note:

The predecessor of Apache Crimson is Sun Project X Parser. Later, for reasons I do not know, X Parser evolved into Apache Crimson, and much of Apache Crimson's code was carried over directly from X Parser. For example, the XmlDocument class used above was com.sun.xml.tree.XmlDocument in X Parser and became org.apache.crimson.tree.XmlDocument in Apache Crimson. The code of the two is in fact largely the same; probably only the package and import statements and the header at the beginning of the file differ. Early versions of JAXP were bundled with X Parser, so some old programs use the com.sun.xml package. If you recompile them now and the compilation fails, this is certainly the reason. Later versions of JAXP, such as JAXP 1.1, were bundled with Apache Crimson; if you use JAXP 1.1, you do not need to download Apache Crimson separately and can compile and run the example above (addRecord.java) as is. The latest JAXP 1.2 EA (Early Access) changes course: it uses the better-performing Apache Xalan and Apache Xerces as its XSLT processor and XML parser and no longer supports Apache Crimson directly. So if your development environment uses JAXP 1.2 EA or the Java XML Pack (which includes JAXP 1.2 EA), you will not be able to compile and run the example above (addRecord.java) directly; you need to download and install Apache Crimson separately.

Method 3: Using the TransformerFactory and Transformer classes

The standard way to update the original XML document in JAXP is to call the XSLT engine, that is, to use the TransformerFactory and Transformer classes. Look at the following Java code fragment:

    // First create a DOMSource object. The constructor's parameter can be a Document object;
    // doc represents the modified DOM tree.
    DOMSource doms = new DOMSource(doc);

    // Create a File object that represents the output medium for the data in the DOM tree:
    // an XML file.
    File f = new File("XMLoutput.xml");

    // Create a StreamResult object. The constructor's parameter can be a File object.
    StreamResult sr = new StreamResult(f);

    // Now call JAXP's XSLT engine to output the data in the DOM tree to the XML file.
    // The engine's input is the DOMSource object and its output is the StreamResult object.
    try {
        // First create a TransformerFactory object, then create a Transformer object. The
        // Transformer class is equivalent to an XSLT engine. Usually we use it to process
        // XSL files, but here we use it to output an XML document.
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();

        // The key step: call the transform() method of the Transformer object (the XSLT engine).
        // Its first parameter is the DOMSource object and its second is the StreamResult object.
        t.transform(doms, sr);
    } catch (TransformerConfigurationException tce) {
        System.out.println("Transformer Configuration Exception\n-----");
        tce.printStackTrace();
    } catch (TransformerException te) {
        System.out.println("Transformer Exception\n---------");
        te.printStackTrace();
    }

In a real application, we can use the traditional DOM API to obtain the DOM tree from the XML document, perform the required operations on it to get the final Document object, and then create a DOMSource object from that Document object. The rest is to copy the code above. After the program runs, XMLoutput.xml contains the result you need (of course, you can change the parameter of the StreamResult constructor to specify a different output medium; it need not always be an XML file).

The greatest advantage of this method is that you can control the format in which the contents of the DOM tree are written to the output medium. The TransformerFactory and Transformer classes alone cannot do this, however; the help of the OutputKeys class is also needed.

For a complete example, please refer to the following files: addRecord2.java (see attachment), user.xml (see attachment). The operating environment of this example is: Windows XP Professional, JDK 1.3.1. To run the example, you need to download and install JAXP 1.1 or the Java XML Pack (which contains JAXP) from http://java.sun.com.

The OutputKeys class

The javax.xml.transform.OutputKeys class, together with the java.util.Properties class, lets you control the format in which JAXP's XSLT engine (the Transformer class) outputs an XML document. Look at the following code fragment:

    // First create a TransformerFactory object, then create a Transformer object.
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();

    // Get the output properties of the Transformer object, that is, the default output
    // properties of the XSLT engine. This is a java.util.Properties object.
    Properties properties = t.getOutputProperties();

    // Set a new output property: the output character encoding is GB2312, which supports
    // Chinese characters. If the XML document output by the XSLT engine contains Chinese
    // characters, they are displayed correctly and the so-called "Chinese character problem"
    // does not appear. Note the string constant OutputKeys.ENCODING of the OutputKeys class.
    properties.setProperty(OutputKeys.ENCODING, "GB2312");

    // Update the output properties of the XSLT engine.
    t.setOutputProperties(properties);

    // Call the XSLT engine to output the contents of the DOM tree to the output medium
    // according to the settings in the output properties. Both arguments are placeholders
    // for a DOMSource and a StreamResult created as in the previous fragment.
    t.transform(DOMSource_Object, StreamResult_Object);

From the code above it is easy to see that by setting the output properties of the XSLT engine (the Transformer class) we can control the output format of the DOM tree's contents, which is very helpful for customizing the output.
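Putting Method 3 and the encoding property together, a complete round trip might look like the sketch below. It is not the author's addRecord2.java or addRecord3.java, which are attachments not reproduced here. The class name AddRecordSketch and the method appendAndSave are hypothetical, and the sketch opens and closes the output stream itself rather than passing a File to StreamResult.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

/** Hypothetical stand-in for the attached examples, which are not reproduced in the article. */
class AddRecordSketch {

    /** Parses in, appends an empty element to the root element, and saves the tree to out. */
    static void appendAndSave(File in, File out, String elementName, String encoding)
            throws Exception {
        // 1. Obtain the DOM tree and update it through the DOM API.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        doc.getDocumentElement().appendChild(doc.createElement(elementName));

        // 2. Create the XSLT engine (no stylesheet, so it copies its input) and
        //    set the output encoding through its output properties.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        Properties props = t.getOutputProperties();
        props.setProperty(OutputKeys.ENCODING, encoding);
        t.setOutputProperties(props);

        // 3. Write the tree. The stream is opened and closed here, so the file
        //    handle is released as soon as the transformation finishes.
        try (OutputStream os = new FileOutputStream(out)) {
            t.transform(new DOMSource(doc), new StreamResult(os));
        }
    }

    public static void main(String[] args) throws Exception {
        // File names follow the article; user.xml must exist in the working directory.
        appendAndSave(new File("user.xml"), new File("XMLoutput.xml"), "fancy", "GB2312");
    }
}
```

Because the input is parsed completely before the output stream is opened, passing the same File for both arguments should overwrite the original document with the updated tree, which is the synchronization this article is about.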

So which output properties does JAXP's XSLT engine (the Transformer class) provide? The javax.xml.transform.OutputKeys class defines a number of string constants, each of which is an output property that can be set freely. The commonly used ones are as follows:

    public static final java.lang.String METHOD
        Can be set to "xml", "html", "text", and so on.

    public static final java.lang.String VERSION
        The version number of the specification that the output follows. If METHOD is set to
        "xml", its value should be "1.0"; if METHOD is set to "html", its value should be
        "4.0"; if METHOD is set to "text", this output property is ignored.

    public static final java.lang.String ENCODING
        The encoding used for the output, such as "GB2312" or "UTF-8". Setting it to "GB2312"
        solves the so-called "Chinese character problem".

    public static final java.lang.String OMIT_XML_DECLARATION
        Whether to omit the XML declaration when outputting an XML document, that is, the
        line of the form <?xml version="1.0" ... ?> at the top of the document. Its possible
        values are "yes" and "no".

    public static final java.lang.String INDENT
        Whether the XSLT engine automatically adds extra whitespace when outputting an XML
        document. Its possible values are "yes" and "no".

    public static final java.lang.String MEDIA_TYPE
        The MIME type of the output document.

How do we set the output properties of the XSLT engine? To summarize: first, get the set of default output properties of the XSLT engine (the Transformer class) with the getOutputProperties() method of the Transformer class. Its return value is a java.util.Properties object.

    Properties properties = transformer.getOutputProperties();

Then set the new output properties, for example:

    properties.setProperty(OutputKeys.ENCODING, "GB2312");
    properties.setProperty(OutputKeys.METHOD, "html");
    properties.setProperty(OutputKeys.VERSION, "4.0");
    ......

Finally, update the output properties of the XSLT engine with the setOutputProperties() method of the Transformer class, whose parameter is a java.util.Properties object.

We wrote a new program that uses the OutputKeys class to control the output properties of the XSLT engine. Its structure is roughly the same as that of the previous program (addRecord2.java), but the output is slightly different. For the complete code, please refer to the following files: addRecord3.java (see attachment), user.xml (see attachment). The operating environment of this example is: Windows XP Professional, JDK 1.3.1.
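The three steps summarized above (get the default properties, change them, hand them back) can be tried out with a small self-contained program such as the following sketch. The class OutputFormatSketch and its serialize method are hypothetical names chosen for illustration, and the document is built in memory and written to a string so that no files are needed.

```java
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper (not from the article): serializes a DOM tree with chosen output properties. */
class OutputFormatSketch {

    static String serialize(Document doc, boolean omitDeclaration, boolean indent)
            throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer();

        Properties props = t.getOutputProperties();          // step 1: get the defaults
        props.setProperty(OutputKeys.METHOD, "xml");        // step 2: change what we need
        props.setProperty(OutputKeys.OMIT_XML_DECLARATION, omitDeclaration ? "yes" : "no");
        props.setProperty(OutputKeys.INDENT, indent ? "yes" : "no");
        t.setOutputProperties(props);                        // step 3: hand them back

        StringWriter sw = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny document in memory: <users><user/></users>
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("users");
        root.appendChild(doc.createElement("user"));
        doc.appendChild(root);

        System.out.println(serialize(doc, false, false)); // with the XML declaration
        System.out.println(serialize(doc, true, true));   // no declaration, indented
    }
}
```

The exact whitespace produced when INDENT is "yes" varies between XSLT processors and versions, so the indented output is best compared by eye rather than character by character.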
To run addRecord3.java, you need to download and install JAXP 1.1 or the Java XML Pack (which contains JAXP) from http://java.sun.com.

Method 4: Using the Xalan XML Serializer

This method is really a variant of Method 3. It requires Apache Xalan and Apache Xerces to run. The example code is as follows:

    // First create a DOMSource object. The constructor's parameter can be a Document object;
    // doc represents the modified DOM tree.
    DOMSource domSource = new DOMSource(doc);

    // Create a DOMResult object to hold the output of the XSLT engine temporarily.
    DOMResult domResult = new DOMResult();

    // Now call JAXP's XSLT engine. Its input is the DOMSource object and its output
    // is the DOMResult object.
    try {
        // First create a TransformerFactory object, then create a Transformer object. The
        // Transformer class is equivalent to an XSLT engine. Usually we use it to process
        // XSL files, but here we use it to output an XML document.
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();

        // Set the output properties of the XSLT engine (required; otherwise the
        // "Chinese character problem" appears).
        Properties properties = t.getOutputProperties();
        properties.setProperty(OutputKeys.ENCODING, "GB2312");
        t.setOutputProperties(properties);

        // The key step: call the transform() method of the Transformer object (the XSLT engine).
        // Its first parameter is the DOMSource object and its second is the DOMResult object.
        t.transform(domSource, domResult);

        // Create the default Xalan XML Serializer and use it to write the contents of the
        // DOMResult object to the output medium as an output stream.
        // (Assumed reconstruction: the argument is garbled in the source text. In Xalan 2.x,
        // getSerializer takes a Properties object such as the one returned by
        // OutputProperties.getDefaultMethodProperties.)
        Serializer serializer = SerializerFactory.getSerializer(
                OutputProperties.getDefaultMethodProperties("xml"));

        // Set the output properties of the Xalan XML Serializer. This step is required;
        // otherwise the so-called "Chinese character problem" may also appear.
        Properties prop = serializer.getOutputFormat();
        prop.setProperty("encoding", "GB2312");
        serializer.setOutputFormat(prop);

        // Create a File object that represents the output medium for the data in the DOM tree:
        // an XML file.
        File f = new File("xuser3.xml");

        // Create the file output stream object fos. Note the constructor's parameter.
        FileOutputStream fos = new FileOutputStream(f);

        // Set the output stream of the Xalan XML Serializer.
        serializer.setOutputStream(fos);

        // Serialize the result held in the DOMResult object.
        serializer.asDOMSerializer().serialize(domResult.getNode());
    } catch (Exception tce) {
        tce.printStackTrace();
    }

This method is not common and seems somewhat superfluous, so we do not discuss it further.

For a complete example, please refer to the following files: addRecord4.java (see attachment), user.xml (see attachment). The operating environment of this example is: Windows XP Professional, JDK 1.3.1. To compile addRecord4.java, you need to download and install Apache Xalan and Apache Xerces from http://xml.apache.org/dist/, or download and install the Java XML Pack from http://java.sun.com/xml/download.html, since the latest Java XML Pack (Winter 01) includes Apache Xalan and Apache Xerces.

Conclusion:

