Transparently cache XSL transformations with JAXP
Build caching into the transformer factory to improve performance while preserving ease of use
Summary
When the same XSLT (Extensible Stylesheet Language Transformations) stylesheets are used repeatedly, a stylesheet cache can substantially improve the performance of Web applications. With plain JAXP (Java API for XML Processing), however, stylesheet caching is often inconvenient to use. This article shows how to add caching to the transformer factory so that the cache works completely transparently. (May 2, 2003)
Author: Alexey Valikov
XSLT is without doubt a powerful technology, and many XML applications are built on it. Web developers in particular use XSLT in the presentation layer to make Web applications easier to use and more scalable. That advantage comes at the cost of more memory and a higher CPU load, which is why developers who use XSLT pay close attention to optimization and caching. Caching matters most in a Web environment, where many concurrent threads share the same stylesheets.
In these cases, proper caching of transformations is known to improve performance considerably. The most common recommendation for JAXP is to load a transformation into a Templates object and then use that object to create Transformer objects, rather than instantiating Transformer objects directly from the factory. The Templates object can be reused later to create more transformers, which saves parsing and compiling the stylesheet each time. In "Top Ten Java and XSLT Tips," Eric Burke lists the following code as tip 1:
    Source xsltSource = new StreamSource(xsltFile);
    TransformerFactory transFact = TransformerFactory.newInstance();
    Templates cachedXSLT = transFact.newTemplates(xsltSource);
    Transformer trans = cachedXSLT.newTransformer();
In this example, the transformation in xsltFile is first loaded into the Templates object cachedXSLT. That object is then used to create a new Transformer object, trans. The benefit is that when we need another Transformer object, the parsing and compilation steps can be skipped:
    Transformer anotherTrans = cachedXSLT.newTransformer();
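To see the reuse pattern end to end, the following is a minimal, self-contained sketch rather than code from the tip itself. The class name TemplatesReuseDemo, the inline stylesheet, and the greeting input documents are all assumptions made for illustration; only the JAXP calls (newTemplates, newTransformer, transform) come from the standard API. The stylesheet is compiled once in the constructor, and every call to render() takes a fresh Transformer from the same Templates object.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatesReuseDemo {
    // Hypothetical stylesheet, inlined so that the example needs no files.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>"
      + "Hello, <xsl:value-of select='@name'/>!"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parsed and compiled exactly once, then shared by all render() calls.
    private final Templates cachedXSLT;

    TemplatesReuseDemo() {
        try {
            TransformerFactory transFact = TransformerFactory.newInstance();
            cachedXSLT = transFact.newTemplates(
                new StreamSource(new StringReader(XSLT)));
        } catch (TransformerException e) {
            throw new IllegalStateException("Stylesheet did not compile", e);
        }
    }

    /** Transforms one XML document using a fresh Transformer. */
    String render(String xml) {
        try {
            // Cheap: no stylesheet parsing or compilation happens here.
            Transformer trans = cachedXSLT.newTransformer();
            StringWriter out = new StringWriter();
            trans.transform(new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

A Transformer must not be used by several threads at once, whereas the JAXP documentation requires a Templates object to be thread-safe, so handing out a new Transformer per call is the usual arrangement.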
Although this technique certainly improves performance (especially when the same stylesheets are used repeatedly, as in a Web application), it is not convenient for developers. Besides creating Transformer instances from Templates objects, you must also track each stylesheet's last modification time, reload outdated transformations, provide safe and efficient access to the cache from multiple threads, and handle many other details. Even wrapping all of this in a separate transformer cache implementation does not help a developer who uses third-party modules. A good example is the JSTL x:transform tag: the current implementation, in the classes org.apache.taglibs.standard.tag.common.xml.TransformSupport and org.apache.taglibs.standard.tag.el.xml.TransformTag, calls the TransformerFactory's newTransformer(...) method directly. Clearly, x:transform cannot benefit from an external cache implementation. There is still a simple, elegant way to solve this problem. JAXP lets us replace its transformer factory, so why not write a factory with caching built in?
The idea is not difficult to implement. We can extend any suitable TransformerFactory (I use Michael Kay's Saxon 7.3) and override the parent class's newTransformer(...) method so that transformations loaded from file-based StreamSources are cached and served from the cache, provided that the transformation has not been modified since it was loaded. Here is the new version of the newTransformer(...) method:
    // Requires java.io.File, java.net.URI, java.net.URISyntaxException,
    // javax.xml.transform.* and javax.xml.transform.stream.StreamSource
    public Transformer newTransformer(final Source source)
        throws TransformerConfigurationException {
      // Check that source is a StreamSource with a system ID
      // (a source built from a stream or reader may have none)
      if (source instanceof StreamSource && source.getSystemId() != null) {
        try {
          // Create URI of the source
          final URI uri = new URI(source.getSystemId());
          // If URI points to a file, load transformer from the file
          if ("file".equalsIgnoreCase(uri.getScheme())) {
            return newTransformer(new File(uri));
          }
        } catch (URISyntaxException urise) {
          throw new TransformerConfigurationException(urise);
        }
      }
      return super.newTransformer(source);
    }
As you can see, if the transformer's source is not a StreamSource, or does not point to a file, the transformer is simply returned by the parent class's newTransformer(...) method. If the source is a file-based StreamSource, we can use caching to load the transformation more intelligently.
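The decision described above depends on whether a Source can be traced back to a local file. The sketch below isolates that check; the class SourceFiles and the method toLocalFile are hypothetical names, not part of the article's factory. It relies on the fact that a StreamSource constructed from a File carries a file: URI as its system ID, and it returns null for sources without a system ID, with a non-file scheme, or with a URI that java.io.File cannot represent.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

class SourceFiles {
    /**
     * Returns the local file behind a file-based StreamSource,
     * or null if the source cannot be mapped to a local file.
     */
    static File toLocalFile(Source source) {
        if (!(source instanceof StreamSource)) {
            return null; // DOM, SAX and other sources are not handled
        }
        String systemId = source.getSystemId();
        if (systemId == null) {
            return null; // e.g. a source built from a Reader only
        }
        try {
            URI uri = new URI(systemId);
            if (!"file".equalsIgnoreCase(uri.getScheme())) {
                return null; // http:, jar: and so on
            }
            // Throws IllegalArgumentException for file: URIs that
            // java.io.File cannot represent (for example, with a host).
            return new File(uri);
        } catch (URISyntaxException | IllegalArgumentException e) {
            return null; // treat unparsable IDs as "not a local file"
        }
    }
}
```

A caching factory could call such a helper first and fall back to the parent class whenever it returns null.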
The caching algorithm for file-based stylesheets is simple. For a given file, we first look for a matching Templates object in the cache. If none exists, we create a new Templates object for the file and cache it. If one exists, we compare the file's last modification time with the time recorded in the cache to check whether the file has been modified since the Templates object was loaded. If the file is newer, the Templates object must be loaded again; otherwise it is taken from the cache. Finally, the Templates object (loaded from the cache or from disk, as the case may be) is used to create a new Transformer. The following method implements this algorithm:
    protected Transformer newTransformer(final File file)
        throws TransformerConfigurationException {
      // Search the cache for the templates entry
      TemplatesCacheEntry templatesCacheEntry = read(file.getAbsolutePath());
      // If entry is found
      if (templatesCacheEntry != null) {
        // Check timestamp of modification
        if (templatesCacheEntry.lastModified ...
Java provides good synchronization facilities, so the problem is not synchronization as such but how to balance synchronization against performance. The simplest approach is full synchronization: declare the entire newTransformer(...) method as synchronized. This approach is very inefficient, however. We usually have a limited number of stylesheets that are rarely modified, so cache reads far outnumber cache writes. Full synchronization makes concurrent readers block one another, which is, first, unnecessary most of the time and, second, a potential bottleneck. On the other hand, storing the cache contents in an unsynchronized container (such as a HashMap) is dangerous: if we take no precautions, concurrent reads and writes, which are bound to happen, can leave the system in an inconsistent state.
We face a typical readers/writers problem: for a given resource, one write operation and several read operations may be attempted at the same time. This classic problem has a classic solution, described in Doug Lea's Concurrent Programming in Java. The approach is to track the execution state (the numbers of active reader and writer threads) and to allow a read only when no writer thread is active. Likewise, a write is allowed only when no reader thread (and no other writer thread) is active.
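The lookup algorithm and the readers/writers policy can be sketched together with the standard library's ReentrantReadWriteLock (java.util.concurrent.locks, added in Java 5 and therefore not available when this article was written). This is an illustration under assumptions, not the article's factory: the class TimestampedTemplatesCache, its Entry type, and the demo() helper are hypothetical names, and the library lock is swapped in for hand-written reader and writer counters.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class TimestampedTemplatesCache {
    /** Compiled stylesheet plus the file timestamp seen when it was loaded. */
    private static final class Entry {
        final Templates templates;
        final long lastModified;
        Entry(Templates templates, long lastModified) {
            this.templates = templates;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Returns Templates for the file, reloading them if the file is newer. */
    Templates get(File file) throws TransformerConfigurationException {
        String key = file.getAbsolutePath();
        Entry entry = read(key);
        if (entry != null && entry.lastModified < file.lastModified()) {
            entry = null; // stale: the file changed after it was loaded
        }
        if (entry == null) {
            long stamp = file.lastModified();
            // A fresh factory per load avoids sharing one across threads.
            Templates loaded = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(file));
            entry = new Entry(loaded, stamp);
            write(key, entry);
        }
        return entry.templates;
    }

    private Entry read(String key) {
        lock.readLock().lock(); // shared: readers do not block each other
        try {
            return entries.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    private void write(String key, Entry entry) {
        lock.writeLock().lock(); // exclusive: waits for readers and writers
        try {
            entries.put(key, entry);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns { cache hit on unchanged file, reload after timestamp change }. */
    static boolean[] demo() {
        String xslt = "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='/'><ok/></xsl:template></xsl:stylesheet>";
        try {
            File xsl = File.createTempFile("cache-demo", ".xsl");
            xsl.deleteOnExit();
            Files.write(xsl.toPath(), xslt.getBytes(StandardCharsets.UTF_8));
            TimestampedTemplatesCache cache = new TimestampedTemplatesCache();
            Templates first = cache.get(xsl);
            Templates second = cache.get(xsl); // unchanged file
            xsl.setLastModified(xsl.lastModified() + 10_000L); // simulate an edit
            Templates third = cache.get(xsl);  // newer timestamp
            return new boolean[] { first == second, first != third };
        } catch (IOException | TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch two threads that miss the cache at the same moment may both compile the same stylesheet; the later write simply replaces the earlier one, which costs some duplicated work but leaves the map consistent.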
To implement the readers/writers policy, all access to the cache contents goes through two methods, read() and write(), each bracketed by a pair of before/after methods. The before/after methods perform the steps that require thread synchronization, which keeps access safe while keeping the cache efficient:
    protected synchronized void beforeRead() {
      // Readers wait while a writer is active
      while (activeWriters > 0) {
        try {
          wait();
        } catch (InterruptedException iex) {
        }
      }
      ++activeReaders;
    }

    protected synchronized void afterRead() {
      --activeReaders;
      notifyAll();
    }

    protected synchronized void beforeWrite() {
      // Writers wait while any reader or another writer is active
      while (activeReaders > 0 || activeWriters > 0) {
        try {
          wait();
        } catch (InterruptedException iex) {
        }
      }
      ++activeWriters;
    }

    protected synchronized void afterWrite() {
      --activeWriters;
      notifyAll();
    }
With this code in place, we finally have a transformer factory that transparently adds caching for file-based stylesheets (the complete source code can be downloaded from Resources). What remains is to make the factory usable from standard JAXP programs. There are several ways to make the TransformerFactory.newInstance() method return an instance of a custom transformer factory implementation.
The most obvious way is to specify the factory's class name in the javax.xml.transform.TransformerFactory system property. The advantage of this approach is that it has the highest priority; the disadvantage is that the property must be set manually.
Another way is to specify your class name in the JRE (Java Runtime Environment) configuration file ${JRE_HOME}/lib/jaxp.properties:
    ...
    # Specifies the transformer factory
    javax.xml.transform.TransformerFactory=de.fzi.dbs.transform.CachingTransformerFactory
    ...
The last way is to supply the transformer factory name in the JAR's meta-information through the Services API. Simply create a file named javax.xml.transform.TransformerFactory in the META-INF/services directory. The file should contain a single line with the class name of the custom transformer factory.
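The system-property route can be tried out without writing a factory. The sketch below is an illustration under assumptions: FactorySelectionDemo and its methods are hypothetical names, and because the article's de.fzi.dbs.transform.CachingTransformerFactory is not on the classpath here, the class name of whatever factory JAXP already resolves stands in for it. The property is restored afterwards so that the rest of the JVM is unaffected.

```java
import javax.xml.transform.TransformerFactory;

class FactorySelectionDemo {
    static final String FACTORY_PROPERTY =
        "javax.xml.transform.TransformerFactory";

    /** Class name of the factory JAXP resolves with the current settings. */
    static String currentFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    /**
     * Sets the system property to the given class name, asks JAXP for a
     * factory, and returns the class name of what it actually created.
     * The previous property value is restored before returning.
     */
    static String resolveWith(String factoryClassName) {
        String previous = System.getProperty(FACTORY_PROPERTY);
        System.setProperty(FACTORY_PROPERTY, factoryClassName);
        try {
            // newInstance() consults the system property on each call and
            // throws TransformerFactoryConfigurationError if the named
            // class cannot be loaded or instantiated.
            return TransformerFactory.newInstance().getClass().getName();
        } finally {
            if (previous == null) {
                System.clearProperty(FACTORY_PROPERTY);
            } else {
                System.setProperty(FACTORY_PROPERTY, previous);
            }
        }
    }
}
```

In an application the same property is more commonly passed on the command line with -Djavax.xml.transform.TransformerFactory=de.fzi.dbs.transform.CachingTransformerFactory, which leaves the program's own code untouched.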
The Services API approach has one problem: another JAR may also try to set the factory class the same way. For example, if you put both your JAR and Saxon's JAR into the WEB-INF/lib directory, the factory that JAXP actually uses depends on the order in which the JAR files are loaded. Avoiding this is simple: specify the factory in the file WEB-INF/classes/META-INF/services/javax.xml.transform.TransformerFactory. For the factory in this article, the file must contain the single string de.fzi.dbs.transform.CachingTransformerFactory.
Caching made transparent
Now everything is in place and the headaches are over. You no longer need to worry about loading, caching, and reloading stylesheets. You can be sure that third-party libraries that use standard JAXP benefit from the cache as well. You can be confident that there are no concurrency conflicts and that the cache does not become a bottleneck.
This implementation still has some shortcomings, however. First, the factory caches only file-based stylesheets, because a file's last modification date is easy to check, whereas other kinds of resources offer no equivalent. Second, stylesheets can import and include other stylesheets, and modifying an imported or included stylesheet does not cause the main stylesheet to be reloaded. Finally, extending an existing factory implementation ties you to a specific XSLT processor (unless you write a caching extension for every factory you might use). In most cases, though, these drawbacks are outweighed by what factory-based caching provides: transparency, convenience, and high performance.
About the author
Alexey Valikov is a computer scientist with a broad programming background, specializing in Java and XML technologies. He currently works at the FZI (Computer Science Research Center, Karlsruhe, Germany), where he researches "efficiency in Web applications."
He works in the FZI's XML Competence Center, serves as an XML technical consultant, and has participated in European Commission projects. Alexey is the author of The Technology of XSLT, a very popular guide to applying XSLT, published in Russia.
Resources
Download the source code for this article: http://www.javaworld.com/javaworld/jw-05-2003/xsl/jw-0502- xi.zip
"Top Ten Java and XSLT Tips," Eric M. Burke (java.oreilly.com, August 2001): http://java.oreilly.com/news/javaxslt_0801.html
Concurrent Programming in Java, Second Edition: Design Principles and Patterns, Doug Lea (Addison-Wesley Pub Co., 1999; ISBN: 0201310090): http://www.amazon.com/exec/obidos/ASIN/0201310090/javaworld
Concurrent Programming in Java online supplement: http://gee.cs.oswego.edu/dl/cpj/
Saxon, an XSLT processor written by Michael Kay: http://saxon.sourceforge.net/
JAXP documentation: http://java.sun.com/xml/jaxp/index.html
JAR File Specification, Service Provider documentation: http://java.sun.com/gUide/jar/jar.html#service Provider
Log4j, a Java logging package (used in the proposed factory implementation): http://jakarta.apache.org/log4j
Apache Ant, a Java-based build tool (you may use it to build the proposed factory implementation): http://ant.apache.org/
More JavaWorld stories on XSLT:
"Generate JavaBean Classes Dynamically with XSLT," Victor Okunev (February 2002)
"Boost Struts with XSLT and XML," Julien Mercay and Gilbert Bouzeid (February 2002)
"XSLT Blooms with Java," Taylor Cowan (December 2001)
Browse the Java and XML section of JavaWorld's Topical Index: http://www.javaworld.com/channel_content/jw-xml-index.shtml
Talk more about XSLT in our XML & Java discussion: http://forums.devworld.com/webx?50@@.ee6b78f
Sign up for JavaWorld's free weekly Enterprise Java email newsletter: http://www.javaworld.com/subscribe