Learning and Using Jakarta Digester
by Philipp K. Janert, Ph.D.
10/23/2002

Turning an XML document into a corresponding hierarchy of Java bean objects is a fairly common task. In a previous article, I described how to accomplish this using the standard SAX and DOM APIs.

Although powerful and flexible, both APIs are, in effect, too low-level for the specific task at hand. Furthermore, the unmarshalling procedure itself requires a fair amount of coding: a parse stack must be maintained when using SAX, and the DOM tree must be navigated when using DOM.
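
To make the SAX bookkeeping concrete, here is a minimal sketch of a hand-written handler that maintains its own parse stack. It uses only the JDK; the class name SaxStackSketch, the simplified Book bean with public fields, and the inline test document are assumptions made for illustration and are not taken from the previous article.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStackSketch {

    // Hypothetical stand-in bean, kept deliberately small.
    public static class Book {
        public String author;
        public String title;
    }

    // Hand-written handler: all stack bookkeeping is the programmer's job.
    static class Handler extends DefaultHandler {
        final List<Book> books = new ArrayList<>();
        private final Deque<Object> stack = new ArrayDeque<>(); // the parse stack
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0);                 // start collecting fresh character data
            if (qName.equals("book")) {
                stack.push(new Book());        // a new bean goes on top of the stack
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (qName.equals("author")) {
                ((Book) stack.peek()).author = text.toString().trim();
            } else if (qName.equals("title")) {
                ((Book) stack.peek()).title = text.toString().trim();
            } else if (qName.equals("book")) {
                books.add((Book) stack.pop()); // the finished bean leaves the stack
            }
        }
    }

    // Parses the given XML text and returns the books found in it.
    public static List<Book> parse(String xml) {
        Handler handler = new Handler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.books;
    }

    public static void main(String[] args) {
        String xml = "<catalog><book><author>A</author><title>T</title></book></catalog>";
        for (Book b : parse(xml)) {
            System.out.println(b.author + " / " + b.title);
        }
    }
}
```

Every additional element type needs another branch in each callback, which is the kind of repetitive code a rule-based approach is meant to remove.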

This is where the Apache Jakarta Commons Digester framework comes in.

The Jakarta Digester Framework

The Jakarta Digester framework grew out of the Jakarta Struts Web toolkit. Originally developed to process the central struts-config.xml configuration file, the framework was soon recognized to be more generally useful and was moved to the Jakarta Commons project, the stated goal of which is to provide a "repository of reusable Java components." The most recent version, Digester 1.3, was released on August 13, 2002.

The Digester class lets the application programmer specify a set of actions to be performed whenever the parser encounters certain simple patterns in the XML document. The Digester framework comes with 10 prepackaged "rules," which cover most of the required tasks when unmarshalling XML (such as creating a bean or setting a bean property), but each user is free to define and implement his or her own rules, as necessary.

The Example Document and Beans

In this example, we will unmarshall the same XML document that we used in the previous article:

   <?xml version="1.0"?>
   ...
   </catalog>

The bean classes are also the same, except for one important change: in the previous article, I had declared these classes to have package scope, primarily so that I could define all of them in the same source file! Using the Digester framework, this is no longer possible; the classes need to be declared as public (as is required for classes conforming to the JavaBeans specification):

   import java.util.Vector;

   public class Catalog {
      private Vector books;
      private Vector magazines;

      public Catalog() {
         books = new Vector();
         magazines = new Vector();
      }

      public void addBook( Book rhs ) {
         books.addElement( rhs );
      }

      public void addMagazine( Magazine rhs ) {
         magazines.addElement( rhs );
      }

      public String toString() {
         String newline = System.getProperty( "line.separator" );
         StringBuffer buf = new StringBuffer();

         buf.append( "--- Books ---" ).append( newline );
         for( int i=0; i<books.size(); i++ ) {
            buf.append( books.elementAt(i) ).append( newline );
         }

         buf.append( "--- Magazines ---" ).append( newline );
         for( int i=0; i<magazines.size(); i++ ) {
            buf.append( magazines.elementAt(i) ).append( newline );
         }

         return buf.toString();
      }
   }

--------------------------------------------------------------------------------

   public class Book {
      private String author;
      private String title;

      public Book() {}

      public void setAuthor( String rhs ) { author = rhs; }
      public void setTitle( String rhs ) { title = rhs; }

      public String toString() {
         return "Book: Author='" + author + "' Title='" + title + "'";
      }
   }

--------------------------------------------------------------------------------

   import java.util.Vector;

   public class Magazine {
      private String name;
      private Vector articles;

      public Magazine() {
         articles = new Vector();
      }

      public void setName( String rhs ) { name = rhs; }

      public void addArticle( Article a ) {
         articles.addElement( a );
      }

      public String toString() {
         StringBuffer buf = new StringBuffer( "Magazine: Name='" + name + "' " );
         for( int i=0; i<articles.size(); i++ ) {
            buf.append( articles.elementAt(i).toString() );
         }
         return buf.toString();
      }
   }

--------------------------------------------------------------------------------

   public class Article {
      private String headline;
      private String page;

      public Article() {}

      public void setHeadline( String rhs ) { headline = rhs; }
      public void setPage( String rhs ) { page = rhs; }

      public String toString() {
         return "Article: Headline='" + headline + "' on page='" + page + "' ";
      }
   }

Specifying Patterns and Rules

The Digester class processes the input XML document based on patterns and rules. The patterns match XML elements, based on their name and location in the document tree. The syntax used to describe the matching patterns resembles the XPath match patterns, a little: the pattern catalog matches the top-level <catalog> element, and the pattern catalog/book matches a <book> element nested directly inside a <catalog> element.

All patterns are absolute: the entire path from the root element on down has to be specified. The only exception are patterns containing the wildcard operator *: the pattern */name will match a <name> element anywhere in the document.
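
The matching behavior just described can be pictured with a few lines of plain Java. This is a simplified sketch and not Digester's own matching code, which also has to choose between several candidate patterns; the class name PatternMatchSketch and the matches() helper are hypothetical, and only exact patterns and a leading wildcard are handled.

```java
import java.util.Arrays;
import java.util.List;

public class PatternMatchSketch {

    // Returns true if the element path (for example "catalog/book/author")
    // is matched by the pattern. Simplified: handles exact patterns and
    // patterns that start with the wildcard followed by a slash.
    public static boolean matches(String pattern, String path) {
        if (pattern.startsWith("*/")) {
            String tail = pattern.substring(2);   // the part after the wildcard
            return path.equals(tail) || path.endsWith("/" + tail);
        }
        // Absolute pattern: the whole path from the root must be identical.
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "catalog", "catalog/book", "catalog/magazine/name", "book");
        for (String path : paths) {
            System.out.println(path
                + "  catalog/book: " + matches("catalog/book", path)
                + "  */name: " + matches("*/name", path));
        }
    }
}
```

Note that the path book does not match the pattern catalog/book: without the wildcard, a pattern never matches on the element name alone.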

Whenever the Digester encounters one of the specified patterns, it performs the actions that have been associated with it. In this, the Digester framework is of course related to a SAX parser (and in fact, the Digester class implements org.xml.sax.ContentHandler and maintains the parse stack). All rules to be used with the Digester must extend org.apache.commons.digester.Rule, which in itself exposes methods similar to the SAX ContentHandler callbacks: begin() and end() are called when the opening and closing tags of the matched element are encountered; body() is called for the content nested inside the matched element; and finally, there is a finish() method, which is called once processing of the closing tag is complete, to provide a hook to do possible final clean-up chores. Most application developers will not have to concern themselves with these functions, however, since the standard rules that ship with the framework are likely to provide all desired functionality.

To unmarshal a document, then, create an instance of the org.apache.commons.digester.Digester class, configure it if necessary, specify the required patterns and rules, and finally, pass a reference to the XML file to the parse() method. This is demonstrated in the DigesterDriver class below. (The filename of the input XML document must be specified on the command line.)

   import org.apache.commons.digester.*;

   import java.io.*;
   import java.util.*;

   public class DigesterDriver {

      public static void main( String[] args ) {

         try {
            Digester digester = new Digester();
            digester.setValidating( false );

            digester.addObjectCreate( "catalog", Catalog.class );

            digester.addObjectCreate( "catalog/book", Book.class );
            digester.addBeanPropertySetter( "catalog/book/author", "author" );
            digester.addBeanPropertySetter( "catalog/book/title", "title" );
            digester.addSetNext( "catalog/book", "addBook" );

            digester.addObjectCreate( "catalog/magazine", Magazine.class );
            digester.addBeanPropertySetter( "catalog/magazine/name", "name" );

            digester.addObjectCreate( "catalog/magazine/article", Article.class );
            digester.addSetProperties( "catalog/magazine/article", "page", "page" );
            digester.addBeanPropertySetter( "catalog/magazine/article/headline" );
            digester.addSetNext( "catalog/magazine/article", "addArticle" );

            digester.addSetNext( "catalog/magazine", "addMagazine" );

            File input = new File( args[0] );
            Catalog c = (Catalog)digester.parse( input );

            System.out.println( c.toString() );

         } catch( Exception exc ) {
            exc.printStackTrace();
         }
      }
   }

After instantiating the Digester, we specify that it should not validate the XML document against a DTD, because we did not define one for our simple catalog document. Then we specify the patterns and the associated rules: the ObjectCreateRule creates an instance of the specified class and pushes it onto the parse stack. The SetPropertiesRule sets a bean property to the value of an XML attribute of the current element: the first argument to the rule is the name of the attribute, the second, the name of the property. Whereas SetPropertiesRule takes the value from an attribute, BeanPropertySetterRule takes the value from the raw character data nested inside of the current element. It is not necessary to specify the name of the property to set when using BeanPropertySetterRule: it defaults to the name of the current XML element.
In the example above, this default is being used in the rule definition matching the catalog/magazine/article/headline pattern. Finally, the SetNextRule takes the object on top of the parse stack and passes it to the named method on the object below it; it is commonly used to insert a finished bean into its parent.

Note that it is possible to register several rules for the same pattern. If this occurs, the rules are executed in the order in which they were added to the Digester. For instance, to deal with the element at catalog/magazine/article, we first create the appropriate article bean, then set the page property, and finally pop the completed article bean and insert it into its magazine parent.

Invoking Arbitrary Functions

It is not only possible to set bean properties, but also to invoke arbitrary methods on objects in the stack. This is accomplished using the CallMethodRule to specify the method name and, optionally, the number and type of arguments passed to it. Subsequent specifications of the CallParamRule define the parameter values to be passed to the invoked functions. The values can be taken either from named attributes of the current XML element, or from the raw character data contained by the current element. For instance, rather than using the BeanPropertySetterRule in the DigesterDriver implementation above, we could have achieved the same effect by calling the property setter explicitly, and passing the data as parameter:

   digester.addCallMethod( "catalog/book/author", "setAuthor", 1 );
   digester.addCallParam( "catalog/book/author", 0 );

The first line gives the name of the method to call (setAuthor()) and the expected number of parameters (1). The second line says to take the value of the function parameter from the character data contained in the <author> element and to pass it as the first argument (index 0). Had we also specified an attribute name (e.g., digester.addCallParam( "catalog/book/author", 0, "author" );), the value would have been taken from the respective attribute of the current element instead.
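
What such a method-call rule amounts to can be sketched with plain reflection. The following is an illustration of the idea only, not Digester's implementation; the class CallMethodSketch, its callMethod() helper, and the stand-in Book bean (including its getter) are assumptions introduced for this example, and all parameters are treated as String values.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class CallMethodSketch {

    // Hypothetical stand-in for the article's Book bean.
    public static class Book {
        private String author;
        public void setAuthor(String rhs) { author = rhs; }
        public String getAuthor() { return author; }
    }

    // Looks up methodName on the object on top of the stack and invokes it
    // with the collected parameter values, all passed as Strings.
    public static void callMethod(Deque<Object> stack, String methodName, String... params) {
        Object top = stack.peek();
        Class<?>[] types = new Class<?>[params.length];
        Arrays.fill(types, String.class);
        try {
            Method method = top.getClass().getMethod(methodName, types);
            method.invoke(top, (Object[]) params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Deque<Object> stack = new ArrayDeque<>();
        Book book = new Book();
        stack.push(book);
        // The value as it would arrive from the character data of <author>.
        callMethod(stack, "setAuthor", "Author 1");
        System.out.println(book.getAuthor()); // prints "Author 1"
    }
}
```

The parameter values are collected first and the method is invoked only once they are all available, which is why the parameter rules are specified separately from the method-call rule.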

One important caveat: confusingly, digester.addCallMethod( "pattern", "methodName", 0 ); does not specify a call to a method taking no arguments. Instead, it specifies a call to a method taking one argument, the value of which is taken from the character data of the current XML element! We therefore have yet another way to implement a replacement for BeanPropertySetterRule:

   digester.addCallMethod( "catalog/book/author", "setAuthor", 0 );

To call a method that truly takes no parameters, use digester.addCallMethod( "pattern", "methodName" );.

Summary of Standard Rules

Below are brief descriptions of all of the standard rules.

Creational

ObjectCreateRule: Creates an object of the specified class using its default constructor and pushes it onto the stack; it is popped when the element completes. The class to instantiate can be given through a class object or the fully qualified class name.

FactoryCreateRule: Creates an object using a specified factory class and pushes it onto the stack. This can be useful for classes that do not provide a default constructor. The factory class must implement the org.apache.commons.digester.ObjectCreationFactory interface.

Property Setters

SetPropertiesRule: Sets one or several named properties in the top-level bean (the object on top of the stack) using the values of named XML element attributes. Attribute names and property names are passed to this rule in String[] arrays. (Typically used to handle XML elements whose data is given in attributes.)

BeanPropertySetterRule: Sets a named property on the top-level bean to the character data enclosed by the current XML element.

SetPropertyRule: Sets a property on the top-level bean. Both the property name and the value to which this property will be set are given as attributes to the current XML element.

Parent/Child Management

SetNextRule: Takes the object on top of the stack and passes it to a named method on the object immediately below. Typically used to insert a completed bean into its parent.

SetTopRule: Passes the second-to-top object on the stack to the top-level object. This is useful if the child object exposes a setParent method, rather than the other way around.

SetRootRule: Calls a method on the object at the bottom of the stack, passing the object on top of the stack as argument.

Arbitrary Method Calls

CallMethodRule: Calls an arbitrary named method on the top-level bean. The method may take an arbitrary set of parameters. The parameter values are supplied by subsequent CallParamRule specifications.

CallParamRule: Represents the value of a method parameter. The value of the parameter is either taken from a named XML element attribute, or from the raw character data enclosed by the current element. This rule requires that its position in the parameter list be specified by an integer index.

Specifying Rules in XML: Using the xmlrules Package

So far, we have specified the patterns and rules programmatically, at compile time. While conceptually simple and straightforward, this feels a bit odd: the entire framework is about recognizing and handling structure and data at run time, but here we go fixing the behavior at compile time! Large numbers of fixed strings in source code typically indicate that something is being configured (rather than programmed), which could be (and probably should be) done at run time instead.

The org.apache.commons.digester.xmlrules package addresses this issue. It provides the DigesterLoader class, which reads the pattern/rule pairs from an XML document and returns a Digester already configured accordingly. The XML document configuring the Digester must comply with the digester-rules.dtd, which is part of the xmlrules package.

Below are the contents of the configuration file (named rules.xml) for the example application. I want to point out several things here.

Patterns can be specified in two different ways: either as attributes to each XML element representing a rule, or using a <pattern> element, in which case the pattern of each contained rule is appended to the pattern defined in the enclosing <pattern> element.

Finally, using the current release of the Digester package, it is not possible to specify the BeanPropertySetterRule in the configuration file. Instead, we are using the CallMethodRule to achieve the same effect, as explained above.

   <?xml version="1.0"?>
   ...

To run the XmlRulesDriver class below, specify the catalog document as the first command-line argument and the rules.xml file as the second. (Confusingly, the DigesterLoader will not read the rules.xml file from a File or an org.xml.sax.InputSource, but requires a URL; the File reference in the code below is therefore transformed into an equivalent URL.)

   import org.apache.commons.digester.*;
   import org.apache.commons.digester.xmlrules.*;

   import java.io.*;
   import java.util.*;

   public class XmlRulesDriver {

      public static void main( String[] args ) {

         try {
            File input = new File( args[0] );
            File rules = new File( args[1] );

            Digester digester = DigesterLoader.createDigester( rules.toURL() );

            Catalog catalog = (Catalog)digester.parse( input );
            System.out.println( catalog.toString() );

         } catch( Exception exc ) {
            exc.printStackTrace();
         }
      }
   }

Conclusion

This concludes our brief overview of the Jakarta Commons Digester package. Of course, there is more. One topic ignored in this introduction is XML namespaces: Digester allows you to specify rules that only act on elements defined within a certain namespace.

We mentioned briefly the possibility of developing custom rules by extending the Rule class. The Digester class exposes the customary push(), peek(), and pop() methods, giving the individual developer freedom to manipulate the parse stack directly.

Lastly, note that there is an additional package providing a Digester implementation that deals with RSS (Rich-Site-Summary)-formatted newsfeeds. The Javadoc tells the full story.

References

Jakarta Commons Digester Homepage
"Simple XML Parsing with SAX and DOM"
Jakarta Struts Homepage
"Java & XML Data Binding" - XML data binding addresses the general problem of making XML data available in applications.
Programming Jakarta Struts - The upcoming book on Jakarta Struts at O'Reilly.

Philipp K. Janert, Ph.D. is a software project consultant, server programmer, and architect.