Java XML tutorial

xiaoxiao, 2021-03-06

Chapter 1 Introduction

About this tutorial

In this tutorial, we discuss how to use an XML parser to:

Process an XML document

Create an XML document

Manipulate an XML document

We also cover some useful XML parser features. Best of all, every tool we discuss is available free of charge from IBM's alphaWorks site (www.alphaworks.ibm.com) or from other websites.

Not discussed:

Some important programming concepts are not covered here:

1 Using a visual tool to build an XML application

2 Converting an XML document from one form to another

3 Creating interfaces to end users or other processes, and interfaces to back-end data stores

All of these concepts matter when you build an XML application. We are preparing new tutorials that cover them, so visit our website often!

XML application architecture

An XML application is typically built around an XML parser. It has an interface to its users and an interface to the back-end data store.

This tutorial focuses on writing Java code that uses an XML parser to manipulate XML documents. As shown in the picture below, that is the middle layer of the architecture.

Chapter 2 Parser basics

Basics

An XML parser is a piece of code that reads a document and analyzes its structure. In this chapter, we discuss how to use an XML parser to read an XML document. We also cover the different types of parsers and when to use each.

Later chapters of this tutorial discuss what you get back from the parser and how to use those results.

How to use a parser

We cover this in detail in later chapters, but in general you use a parser as follows:

1 Create a parser object

2 Pass your XML document to the parser

3 Process the results

Building an XML application obviously involves far more than this, but a typical XML application follows these steps.
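The three steps above can be sketched in a few lines of Java. This is only an illustration, not the tutorial's own code: it assumes the JAXP classes bundled with the JDK (DocumentBuilderFactory and DocumentBuilder) instead of XML4J, and the class name ThreeSteps and the inline sample document are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ThreeSteps {
    // Hypothetical sample document, kept inline so the sketch is self-contained.
    static final String XML = "<greeting lang=\"en\">Hello, XML</greeting>";

    static String rootName(String xml) throws Exception {
        // 1. Create a parser object (JAXP's DocumentBuilder plays that role here).
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // 2. Pass the XML document to the parser.
        Document doc = parser.parse(new InputSource(new StringReader(xml)));
        // 3. Process the results; here we only read the name of the root element.
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Root element: " + rootName(XML));
    }
}
```

A real application would do far more in step 3, but the overall shape stays the same.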

Parser types

There are several ways to classify parsers:

Validating versus non-validating parsers

Parsers that support the Document Object Model (DOM)

Parsers that support the Simple API for XML (SAX)

Parsers written in a particular language (Java, C++, Perl, etc.)

Validating versus non-validating parsers

As we mentioned in our first tutorial, an XML document that uses a DTD and follows the rules defined in that DTD is called a valid document. An XML document that follows the basic tagging rules is called a well-formed document.

The XML specification requires every parser to report an error when it finds that a document is not well-formed. Validation is a separate matter. A validating parser validates an XML document as it parses it. A non-validating parser ignores validation errors. In other words, if an XML document is well-formed, a non-validating parser does not care whether the document follows the rules in its DTD (if it has one).

Why use a non-validating parser?

Speed and efficiency. It takes considerable overhead for an XML parser to process a DTD and make sure that every element in the XML document follows the rules in that DTD. If you are sure that an XML document is valid (perhaps because it comes from a known data source), there is no point in validating it again.

Likewise, sometimes all you need is to find the XML tags in a document. Once you have the tags, you can extract the data from them and process it. If that is all you need, a non-validating parser is the right choice.
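To make the difference concrete, here is a small sketch that parses the same well-formed but invalid document twice, once without validation and once with it. It assumes the JDK's JAXP parser (not XML4J), where validation is switched on with DocumentBuilderFactory.setValidating; the class name ValidationDemo and the sample document with its inline DTD are invented for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidationDemo {
    // Well-formed, but not valid: the DTD allows only <to> inside <note>.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (to)>\n"
      + "  <!ELEMENT to (#PCDATA)>\n"
      + "]>\n"
      + "<note><from>oops</from></note>";

    // Parses XML and returns the validation errors the parser reported.
    static List<String> parse(boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity problems arrive here as recoverable errors.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness problems are fatal for every parser.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(XML)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("non-validating errors: " + parse(false).size());
        System.out.println("validating errors:     " + parse(true).size());
    }
}
```

The non-validating run reports nothing because the document is well-formed; the validating run reports that the content does not match the DTD.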

Document Object Model (DOM)

The Document Object Model is an official recommendation of the World Wide Web Consortium (W3C). It defines an interface that lets programs access and update the style, structure, and content of XML documents. XML parsers that support the DOM implement that interface. The first version of the specification, DOM Level 1, is available at http://www.w3.org/tr/rec-dom-level-1, if you enjoy reading specifications.

What does a DOM parser provide?

When you parse an XML document with a DOM parser, you get back a tree structure that contains all of the elements of your document. The DOM provides a variety of functions for examining the contents and structure of the document.

A word about standards

Now that we are getting into developing XML applications, a word about the XML standards is in order. Officially, XML is a trademark of MIT and a product of the World Wide Web Consortium (W3C).

The XML specification, an official W3C recommendation, is available at www.w3.org/tr/rec-xml. The W3C site hosts the specifications for XML, the DOM, and many other XML-related standards.

Simple API for XML (SAX)

The SAX API is an alternative way of working with the contents of XML documents. A de facto standard, it was developed by David Megginson and other members of the XML-DEV mailing list.

To see the complete SAX standard, go to www.megginson.com/sax/. To join the XML-DEV mailing list, send an email to Majordomo@ic.ac.uk containing the following: subscribe XML-dev.

What does a SAX parser provide?

When you parse an XML document with a SAX parser, the parser generates events at various points in the document. It is up to you to decide what to do with each event.

A SAX parser generates events at the start and end of the document, at the start and end of each element, when it finds characters inside an element, and at several other points. You write the Java code that handles each event and decides what to do with the information the parser delivers.
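The event model described above can be illustrated with a handler that simply records each event as it arrives. This sketch assumes the JAXP SAX classes shipped with the JDK (SAXParserFactory, DefaultHandler) rather than XML4J; the class name SaxEvents and the sample input are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEvents {
    // Parses the given XML and returns a log of the events the parser fired.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("startDocument"); }
            @Override public void endDocument() { log.add("endDocument"); }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                log.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only runs to keep the log short.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters " + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<title>Sonnet 130</title>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is why SAX suits large documents.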

When to use SAX? When to use DOM?

We discuss this question in more detail in a later chapter, but in general you should use a DOM parser in the following cases:

You need to know a lot about the structure of the document

You need to manipulate parts of the document (for example, you may want to sort some elements)

You need to use the information in the document more than once

Use a SAX parser when you only need to extract a few elements from an XML document. A SAX parser is also the better choice when you do not have much memory to work with, or when you only need to use the information in the document once (as opposed to parsing the document once and then using it many times).
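As a rough illustration of the kind of task that favors the DOM, the sketch below sorts some elements in place and then reads the same tree a second time. It is not the sorting code the tutorial presents later; it assumes the JDK's JAXP parser and DOM Level 3 getTextContent, and the class name SortWithDom and the sample element names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SortWithDom {
    // Sorts the <name> children of the root element and returns their text in tree order.
    static List<String> sortedNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();

        // Copy the live NodeList into a plain list before moving nodes around.
        NodeList found = root.getElementsByTagName("name");
        List<Element> names = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            names.add((Element) found.item(i));
        }
        names.sort(Comparator.comparing(Element::getTextContent));

        // appendChild moves a node that is already in the tree,
        // so re-appending in sorted order reorders the children.
        for (Element name : names) {
            root.appendChild(name);
        }

        // Second pass over the same in-memory tree; no re-parsing needed.
        List<String> result = new ArrayList<>();
        NodeList again = root.getElementsByTagName("name");
        for (int i = 0; i < again.getLength(); i++) {
            result.add(again.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><name>Carol</name><name>Alice</name><name>Bob</name></people>";
        System.out.println(sortedNames(xml));
    }
}
```

With SAX, the same job would require you to build and sort your own data structure from the events.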

XML parsers in different languages

Most languages used on the Web have XML parsers and libraries available, including Java, C++, Perl, and Python. The links below point to parsers from IBM and other vendors.

Most examples in this tutorial use IBM's XML4J parser. All of the code we discuss uses standard interfaces. In the final chapter of this tutorial, we show how to write code that works with different parsers.

Java

IBM's parser, XML4J, is available at www.alphaworks.ibm.com/tech/xml4j.

James Clark's parser, XP, is available at www.jclark.com/xml/xp.

Sun's XML parser can be downloaded from developer.java.sun.com/developer/products/xml/ (you must be a member of the Java Developer Connection).

DataChannel's XJParser is available at xdev.dataachannel.com/downloads/xjparser/.

C++

IBM's XML4C parser is available at www.alphaworks.ibm.com/tech/xml4c.

James Clark's C parser, expat, is available at www.jclark.com/xml/expat.html.

Perl

Several XML parsers are available for Perl. For more information, see www.perlxml.com/faq/perl-xml-faq.html.

Python

For information on XML parsers for Python, see www.python.org/topics/xml/.

Summary

The heart of any XML application is an XML parser. To process an XML document, your application creates a parser object, passes it an XML document, and then processes the results that come back from the parser object.

We have discussed the different kinds of XML parsers and why you might choose one over another. We classified parsers in several ways:

Validating versus non-validating parsers

Parsers that support the Document Object Model (DOM)

Parsers that support the Simple API for XML (SAX)

Parsers written in a particular language (Java, C++, Perl, etc.)

In the next chapter, we explore DOM parsers and how to use them.

Chapter 3 DOM (Document Object Model)

DOM, DOM, DOM, DOM, DOM,

DOOBIE, DOOBIE,

DOM, DOM, DOM, DOM, DOM ...

The DOM is a common interface for manipulating document structures. One of its design goals is that Java code written for one DOM-compliant parser should run on any other DOM-compliant parser without changes. (We demonstrate this later.)

As we mentioned earlier, a DOM parser returns the structure of your entire document as a tree.

Sample code

Before we go any further, please download our sample XML applications. Unzip the file xmljava.zip and you are ready to go! (Blueski: *** or see the appendix of this tutorial)

DOM interface

The DOM defines several Java interfaces. These are the most common:

Node: The basic data type of the DOM.

Element: The objects you will deal with most often are Elements.

Attr: Represents an attribute of an element.

Text: The actual content of an Element or Attr.

Document: Represents the entire XML document. A Document object is often referred to as a DOM tree.

Common DOM methods

When you work with the DOM, these are the methods you will use most often:

Document.getDocumentElement()

Returns the root element of the document.

Node.getFirstChild() and Node.getLastChild()

Return the first or last child of a given Node.

Node.getNextSibling() and Node.getPreviousSibling()

These methods delete everything in the DOM tree, format your hard drive, and send a rude message to everyone in your address book. (Not really. They return the next or previous sibling of a given Node.)

Element.getAttribute(attrName)

Returns the value of the attribute with the given name for a given Element. For example, to get the value of the attribute named id, call getAttribute("id").
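The navigation methods listed above can be tried out on a tiny document. This is a sketch under a few assumptions: it uses the JDK's JAXP parser rather than XML4J, the class name DomMethods is invented, and the sample document is written on a single line so that no whitespace Text nodes appear between the elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomMethods {
    // Hypothetical sample; one line, so the root's children are all elements.
    static final String XML =
        "<sonnet type=\"Shakespearean\"><author>Shakespeare</author>"
      + "<title>Sonnet 130</title></sonnet>";

    // Parses the XML and returns the root element via getDocumentElement().
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(XML);
        Node first = root.getFirstChild();   // the <author> element
        Node last = root.getLastChild();     // the <title> element
        System.out.println("root: " + root.getTagName());
        System.out.println("first child: " + first.getNodeName());
        System.out.println("last child: " + last.getNodeName());
        System.out.println("next sibling of first: " + first.getNextSibling().getNodeName());
        System.out.println("previous sibling of last: " + last.getPreviousSibling().getNodeName());
        // getAttribute is defined on Element and returns the attribute value as a String.
        System.out.println("type attribute: " + root.getAttribute("type"));
    }
}
```

If the sample were indented across several lines, getFirstChild() would return a whitespace Text node instead of the author element.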

Our first DOM application!

We have covered a lot of concepts, so let's move on. Our first application simply reads an XML document and writes its contents to standard output.

In a command-line window, run the following command:

java DomOne sonnet.xml

This command loads our application and tells it to parse the file sonnet.xml. If everything works, you will see the contents of the XML document written to standard output.

    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>
        <first-name>William</first-name>
        <nationality>British</nationality>
        <year-of-birth>1564</year-of-birth>
        <year-of-death>1616</year-of-death>
      </author>
      <title>Sonnet 130</title>
      <line>
        <line>My mistress' eyes are ...

DomOne overview

The source code for DomOne is straightforward. We create a new class, DomOne, that has two methods, parseAndPrint and printDomTree.

In the main method, we process the command line, create a DomOne object, and pass the file name to that object. The DomOne object creates a parser object, parses the document, and then processes the DOM tree (that is, the Document object) through the printDomTree method.

We look at each of these steps in detail.

    public class DomOne
    {
      public void parseAndPrint(String uri)
      ...
      public void printDomTree(Node node)
      ...
      public static void main(String argv[])
      ...

Handling the command line

The code that processes the command line is shown below. We check whether the user entered anything on the command line. If not, we print a usage message and exit; otherwise, we assume the first argument on the command line (argv[0] in Java) is the name of the document. We ignore any other arguments the user may have entered.

We use command-line arguments to keep our example simple. In most cases, an XML application would be built from servlets, JavaBeans, and other kinds of components, and command-line arguments would not be an issue.

    public static void main(String argv[])
    {
      if (argv.length == 0)
      {
        System.out.println("Usage: ...");
        ...
        System.exit(1);
      }
      DomOne d1 = new DomOne();
      d1.parseAndPrint(argv[0]);
    }

Create a DomOne object

In our sample code, we create a separate class called DomOne. To parse the file and print the results, we create an instance of the DomOne class and then tell that DomOne object to parse and print the XML document.

Why do we do this? Because we want to use a recursive method to walk the DOM tree and print the results. That is awkward to do from a static method such as main, so we created a separate class to handle it.

    ...
    DomOne d1 = new DomOne();
    d1.parseAndPrint(argv[0]);

Create a parser object

Now that we have asked our DomOne instance to parse and process our XML document, its first order of business is to create a new parser object. In this case, we use a DOMParser object, a Java class that implements the DOM interfaces. The XML4J package contains other parser objects as well, such as SAXParser, ValidatingSAXParser, and NonValidatingDOMParser.

Notice that we put this code inside a try block. The parser throws an exception in a number of circumstances, including an invalid URI, a DTD that cannot be found, or an XML document that is not valid or not well-formed. To handle this gracefully, we need to catch the exception.

    try
    {
      DOMParser parser = new DOMParser();
      parser.parse(uri);
      doc = parser.getDocument();
    }

Parse the XML document

Parsing the document takes a single line of code. When the parse is done, we get the Document object created by the parser.

If the Document object is not null (it will be null if something went wrong during the parse), we pass it to the printDomTree method.

    try
    {
      DOMParser parser = new DOMParser();
      parser.parse(uri);
      doc = parser.getDocument();
      ...
      if (doc != null)
        printDomTree(doc);
    }

Process the DOM tree

Now that parsing is done, we walk the DOM tree. Notice that this code is recursive. For each node, we process the node itself, then call the printDomTree method recursively for each of the node's children. The recursive calls are shown below.

Keep in mind that while some XML documents are very large, they tend not to have many levels of tags. A Shanghai phone book, for example, might have millions of records, but its markup probably would not go more than a few levels deep. For that reason, the stack depth of the recursive algorithm is not a concern.

    public void printDomTree(Node node)
    {
      int nodeType = node.getNodeType();
      switch (nodeType)
      {
        case Node.DOCUMENT_NODE:
          printDomTree(((Document) node).
                       getDocumentElement());
          ...
        case Node.ELEMENT_NODE:
          ...
          NodeList children =
            node.getChildNodes();
          if (children != null)
          {
            for (int i = 0;
                 i < children.getLength();
                 i++)
              printDomTree(children.item(i));
          }
          ...
      }
    }

Lots of Nodes

If you look at sonnet.xml, there are twenty-four tags. You might think that means twenty-four nodes. However, that is not the case. There are 69 nodes in sonnet.xml: one document node, 23 element nodes, and 45 text nodes. Running java DomCounter sonnet.xml produces the results shown below.

DomCounter.java

This code parses an XML document, then walks the DOM tree to gather statistics about the document. Once the statistics are gathered, it writes them to standard output.

    Statistics for sonnet.xml:
    =====================================
    Document Nodes: 1
    Element Nodes: 23
    Entity Reference Nodes: 0
    CDATA Sections: 0
    Text Nodes: 45
    Processing Instructions: 0
    ------------
    Total: 69 Nodes

Listing the nodes

For the following fragment,

    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>

these are the nodes returned by the parser:

The Document node

The Element node corresponding to the <sonnet> tag

A Text node containing the carriage return after the <sonnet> tag and the two spaces before the <author> tag

The Element node corresponding to the <author> tag

A Text node containing the carriage return after the <author> tag and the four spaces before the <last-name> tag

The Element node corresponding to the <last-name> tag

A Text node containing the characters "Shakespeare"

If you look at all the whitespace between the tags, you can see why we get so many more nodes than you might expect.

All those text nodes

If you go through all of the nodes returned by the parser, you will find that most of them are useless. The spaces at the start of each line become Text nodes that contain nothing but ignorable whitespace.

Note that we would not get these useless nodes if we had put all of the tags on a single line. We added the line breaks and spaces to make the document easier to read.

If you are building an XML document that does not need to be readable by humans, you can leave out the line breaks and spaces. That makes your document smaller, and the parser does not have to build those useless nodes when it processes your document.

    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>
        <first-name>William</first-name>
        <nationality>British</nationality>
        <year-of-birth>1564</year-of-birth>
        <year-of-death>1616</year-of-death>
      </author>
      <title>Sonnet 130</title>
      <line>
        <line>My mistress' eyes are nothing like the sun,</line>

Know your Nodes

The last point to make about processing the Nodes in the DOM tree is that we have to check the type of each Node before we work with it. Certain methods, such as getAttributes, return null for some node types. If you do not check the node type, you will get incorrect results (at best) and exceptions (at worst).

The switch statement shown here is common in code that uses a DOM parser.

    switch (nodeType)
    {
      case Node.DOCUMENT_NODE:
        ...
      case Node.ELEMENT_NODE:
        ...
      case Node.TEXT_NODE:
        ...
    }

Summary

Believe it or not, that is about all you need to know to work with DOM objects. Our DomOne code did the following:

Created a parser object

Passed an XML document to the parser to be parsed

Got the Document object from the parser and examined it

In the final chapter of this tutorial, we discuss how to build a DOM tree without an XML source file, and show how to sort the elements in an XML document. Both build on the concepts covered here.

Before we move on to those more advanced applications, we explore the SAX API in detail. We use a similar set of examples to show the differences between SAX and DOM.
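To tie the chapter together, here is a minimal sketch of a recursive walk that switches on the node type and counts what it finds, in the spirit of the node-counting program described above. It is not the tutorial's source code: it assumes the JDK's JAXP parser instead of XML4J's DOMParser, the class name NodeTally is invented, and the input is the three-element fragment from the node listing, closed off so that it is well-formed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeTally {
    // Fragment from the node listing, with closing tags added (an assumption).
    static final String XML =
        "<sonnet type=\"Shakespearean\">\n"
      + "  <author>\n"
      + "    <last-name>Shakespeare</last-name>\n"
      + "  </author>\n"
      + "</sonnet>";

    int documents, elements, texts, others;

    // Check the node type first, then recurse into the children.
    void count(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: documents++; break;
            case Node.ELEMENT_NODE:  elements++;  break;
            case Node.TEXT_NODE:     texts++;     break;
            default:                 others++;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count(children.item(i));
        }
    }

    static NodeTally tally(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeTally result = new NodeTally();
        result.count(doc);
        return result;
    }

    public static void main(String[] args) throws Exception {
        NodeTally t = tally(XML);
        System.out.println("Document Nodes: " + t.documents);
        System.out.println("Element Nodes: " + t.elements);
        System.out.println("Text Nodes: " + t.texts);
        System.out.println("Total: " + (t.documents + t.elements + t.texts + t.others) + " Nodes");
    }
}
```

For this fragment the walk finds one document node, three element nodes, and five text nodes. Four of the five text nodes hold only the line breaks and indentation, which is the whitespace effect discussed in the chapter.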