Transforming DocBook documents using XSLT


David Mertz, Doctoral Conversion Expert, Gnosis Software, Inc. November 2000

Contents: First things first; Approaches; Choosing an XSLT tool; Writing XSLT specifications; Matching by descent; Looping over children; The future; Resources; About the author

Using a DocBook example, David Mertz demonstrates how to transform an XML document into HTML with XSLT (Extensible Stylesheet Language Transformations). Along the way, this intrepid columnist discusses four approaches to transforming XML documents and shares his experience with some open source tools. The sample code includes a skeleton XSLT document, a simple XSLT stylesheet that produces valid HTML output for a DocBook chapter, and a brief XSLT looping example.

First things first

Welcome to the world of XML transformation! I suspect you will find the going rough. The standards are still being connected and revised, and the tools are immature, often buggy, inconsistent, and confusing. But don't panic. I can guide you along at least one path out of the maze. And matters will certainly improve over time, although often more slowly than we would wish.

The last two "XML Matters" columns described my project of converting academic articles into XML, specifically into the DocBook DTD. Those articles are a good starting point for writing your own DocBook documents, and this column picks up from there.

In this column, we assume that you already have some well-structured, well-formed, valid DocBook XML documents. Having them is a fine thing in itself, but the next step is to transform them into formats that are more convenient for end users: HTML pages, PDF files, printed pages (which readers actually read), and so on. This is exactly where I found myself after converting some archived writing into DocBook. This article presents my own solution.

My main goal, at least for now, is correct conversion to HTML. But I do not want to be limited to HTML output. There are also some secondary goals. I want to get there without a great deal of work and without having to learn many new languages and technologies. I also want to use free, cross-platform tools. Finally, I want to keep dependencies to a minimum. Even if every required component is free and cross-platform, a large web of complex dependencies is still a drawback. Basically, my ideal is a standalone executable that simply runs, runs reliably, and transforms my DocBook documents into HTML with the styling I want. A dream, perhaps, but why not dream?

Approaches

There are at least four possible approaches to transforming a DocBook document (or almost any XML document) into an end-user format. I seriously considered all four for my small project. This column discusses only the last option in detail, but it is worth keeping all of them in mind when you plan your own transformations.

1. Write custom transformation code. It is best to start with a programming language that has basic XML libraries (such as SAX and DOM). Even with the basic parsing treated as a black box, custom code can do whatever you want with each element. Ultimately, this is the most flexible and powerful approach, but it usually means more work, both up front and in maintenance.

2. Use Cascading Style Sheets with the DocBook documents. This is an attractive idea: keep the typesetting specification entirely separate from the structural markup and let the client device (such as a browser) produce good output. This may well happen, but support currently seems limited to IE 5.5, Opera 4, and some recent Mozilla developer releases. It is not an approach you can count on end users being equipped for.

3. Use Document Style Semantics and Specification Language (DSSSL) to specify transformations to the target format. On the plus side, DSSSL stylesheets already exist for DocBook (and for other formats). On the minus side, DSSSL is basically a whole new programming language to learn, a functional language similar to Lisp. To use DSSSL you need to start with Jade or OpenJade, but both tools are complex enough that people have had to write wrappers for them (such as SGML-Tools Lite). To get a working system (although it is reported to be a very good and effective one), you need to satisfy a variety of system dependencies and install a variety of tools and libraries. Despite good intentions, and perhaps for lack of sufficient effort, I have not managed to get Jade working smoothly on my system. Obviously, many other people use these systems every day, so with some work it can be done. (If you can point me to a fast, simple, standalone DSSSL processor, please tell me. I would really like to try it.) Installation difficulties aside, though, DSSSL feels like a tradition and a way of thinking that stand apart from XML technologies. In contrast, the last approach is basically pure XML and comes from an official W3C specification.

4. Use Extensible Stylesheet Language Transformations (XSLT). In a sense, XSLT is really a specification of a class of XML documents. That is, an XSLT stylesheet is itself a well-formed XML document, with some specialized content that lets you "template" the output format you want (read on to see what that means). Many tools support XSLT, at least nominally. My hunch is that this is the direction XML transformation is heading, because of, or despite, its "official" status with the W3C. XSLT can specify a transformation to any target format. But my general impression is that most developers find XSLT easier to use when the target format is another XML format (such as XHTML).

Choosing an XSLT tool

The Resources section contains links to many XSLT tools. I tried several, but found that Sablotron suited me best. It is free software (GNU). It is cross-platform. It ships standalone executables that can be run simply from the command line. Most importantly, it appears to work correctly, at least for my simple test cases.

Many of the other XSLT tools listed at xslt.com are also free software. Most of them, however, are Java programs that also depend on a variety of additional Java libraries. Users seem to speak well of some of the Java tools, so they may be a good choice for you. I chose Sablotron because compiled native code is fast and because it is simple to install and use. Norman Walsh has created a complete set of XSLT stylesheets for DocBook. Unfortunately, Sablotron crashed when run with those stylesheets, and XML Spy could not match anything in my valid DocBook document. This is probably a limitation of the tools rather than of Walsh's stylesheets, and other tools may handle them well. Still, the problem gives us an opportunity to develop custom (if far less complete) XSLT stylesheets, which is what I really wanted to do anyway (to demonstrate the technique).

Using Sablotron is very simple. The basic usage is:

Listing 1: Basic usage of Sablotron

x:/mydocs> x:/sabl/bin/sabcmd mystyle.xsl mydoc.xml mydoc.html

This says: use the rules in mystyle.xsl to transform mydoc.xml into mydoc.html. If you prefer, you can use pipes and redirection instead. Setting up Sablotron amounts to unpacking its files. (It also provides a library that can be called from programs, but it is easiest to use as a command-line utility.) Adjust the paths and file names to suit your environment.

Writing XSLT specifications

For an introduction to XSLT, read the official W3C Recommendation (see Resources). This article aims to provide the more informal details that get you working.

"XML Matters #3" and "XML Matters #4" developed a specific DocBook document (chap5.xml), which is a chapter. The example uses only a small subset of all possible DocBook tags. So for now, what we really need is a chapter.xsl file that does something useful with each tag actually used in chap5.xml. That is only a beginning, but it is easy to build on, because XSLT is open and extensible by nature. Let's take a look.

Start with a skeleton for chapter.xsl, a template for how to transform DocBook chapters into HTML:

Listing 2: Skeleton XSLT document (empty.xsl)

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/xhtml1/strict">
  <xsl:output method="html"/>
</xsl:stylesheet>

As you can see, chapter.xsl is a well-formed XML file. You will also notice that xsl: is the namespace prefix of the tags in the XSLT document. In fact, every tag that is an XSLT instruction carries it. Various other tags appear when the transformation targets an XML-like format (for example, HTML). These other tags belong to the target format and appear only inside <xsl:template> elements.

Basically, the namespace attributes (xmlns:xsl and xmlns) should be used exactly as shown. You will probably also want to keep the <xsl:output> line, although the xml or text methods can be used as well. The XSLT file above is perfectly good as a template to work from, but it probably does not do what you need. You might assume that, lacking any output specification, it produces no output. That is not quite correct: it still picks up all the text nodes and gives you a plain ASCII version of the chapter. If you really want no output, you need the XSLT document in Listing 3:

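Listing 3: Null output XSLT document (null.xsl)

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/xhtml1/strict">
  <xsl:output method="html"/>
  <xsl:template match="*"></xsl:template>
</xsl:stylesheet>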

The null output brings us closer to a useful transformation. A real stylesheet describes a series of patterns to try to match, and the template inside each <xsl:template> element provides the output content. As the example shows, "*" matches any element. Our example happens to do nothing in its template, but it still matches any element that may appear in the source XML/DocBook document.
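The text-only output of the empty skeleton comes from the built-in template rules that the XSLT 1.0 Recommendation defines for nodes that no explicit template matches. As a sketch (the file name builtin.xsl is hypothetical and not part of the article's archive), a stylesheet that spells those rules out by hand should behave much like empty.xsl, while null.xsl works by overriding the element rule with an empty template:

```xml
<?xml version="1.0"?>
<!-- builtin.xsl (hypothetical name): the XSLT 1.0 built-in rules, written out -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Root node and elements: emit nothing themselves, just descend -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their string value to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Processing instructions and comments: emit nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Because a bare <xsl:apply-templates/> selects only child nodes, the attribute half of the second rule fires only when a stylesheet selects attributes explicitly.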

Matching by descent

The power of XSLT lies mainly in its ability to descend while matching. Once an element is matched, XSLT can extend the matching to the children of that element. Let us build a stylesheet that does something meaningful by expanding on the null output. The important tag that allows descent into child elements is <xsl:apply-templates/>. In general, each template includes this tag in its body:

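Listing 4: Minimal chapter XSLT document (minimal.xsl)

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/xhtml1/strict">
  <xsl:output method="html"/>
  <xsl:template match="chapter">
    ----- Start of chapter ------
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="*">
    ##### Unmatch Element in Source #####
  </xsl:template>
</xsl:stylesheet>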

When you run the XSLT processor with this stylesheet against the DocBook chapter, the result looks something like this:

Listing 5: Output of the XSLT processor for the stylesheet and DocBook chapter

----- Start of chapter ------

##### Unmatch Element in Source #####

This output is not very useful, but it lets us see what the stylesheet is doing. The root element of the chapter is the <chapter> tag. The stylesheet matches that tag and prints "----- Start of chapter ------". Various child elements appear inside the <chapter> element. Each such child is something other than a chapter, so the "*" template matches it.
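One possible refinement of the catch-all, offered as a sketch rather than as part of the original listings, is to have the marker report which element went unmatched by using the standard XPath name() function. The file name, the text output method, and the wording of the marker are all arbitrary choices made for illustration:

```xml
<?xml version="1.0"?>
<!-- report.xsl (hypothetical name): minimal.xsl with a more informative marker -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Only plain-text markers are produced, so the text method is enough -->
  <xsl:output method="text"/>

  <xsl:template match="chapter">
    ----- Start of chapter ------
    <xsl:apply-templates/>
  </xsl:template>

  <!-- name() returns the name of the current element, for example sect1 -->
  <xsl:template match="*">
    ##### Unmatched element: <xsl:value-of select="name()"/> #####
  </xsl:template>
</xsl:stylesheet>
```

Run against a chapter whose children are, say, title and sect1 elements, each marker would then name the element it stands for instead of all markers being identical.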

When you develop your own XSLT stylesheet, providing an obvious marker like the one above for unmatched elements lets you see quickly which templates still need to be written. Listing 6 shows a version with some real templates:

Listing 6: Valid HTML output XSLT document

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/xhtml1/strict">
  <xsl:output method="html"/>
  <xsl:template match="chapter">
    <html>
    <head>
    <title>
    <xsl:value-of select="title"/>
    </title>
    </head>
    <body>
    <xsl:apply-templates/>
    </body>
    </html>
  </xsl:template>
  <xsl:template match="chapter/title">
    <hr></hr>
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
  <xsl:template match="*">
    ##### Unmatch Element in Source #####
  </xsl:template>
</xsl:stylesheet>

The HTML output stylesheet shows some real features of XSLT. The chapter template lays out the HTML document we want to generate. There is nothing special about the HTML tags inside the template; whatever text you put there appears in the output. Inside the HTML <title> element, the <xsl:value-of> instruction inserts the content of the title child element that DocBook requires inside <chapter>. Inside the HTML <body> element, control passes to other templates (for the various parts of the DocBook chapter).

The next template after chapter is chapter/title. It matches a <title> element, but only when that element appears directly inside <chapter>. If you wanted, you could match just title to specify the output format of every <title> element in the source document. But I want chapter titles formatted differently from sect1 titles, sect2 titles, and so on. The example also includes a template for para (which never actually matches, because <para> appears only inside elements that are not matched). For good measure, the stylesheet still matches "*", so you can see from the output where the stylesheet is incomplete.

Looping over children

Matching templates by descent is not the only technique XSLT offers. It is also possible to produce conditional output, to sort, to pull attributes out of the source, and to loop over children.
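As a rough illustration of three of those techniques (attribute extraction, conditional output, and sorting), here is a sketch that is not taken from the article's archive. It assumes the source uses the DocBook elements ulink (with a url attribute), sect1 (with an optional id attribute), and simplelist with member children; the choice of HTML output for each is arbitrary:

```xml
<?xml version="1.0"?>
<!-- A sketch only: the HTML chosen for each element is an assumption -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Attribute extraction: the attribute value template {@url}
       copies the url attribute of <ulink> into an HTML href -->
  <xsl:template match="ulink">
    <a href="{@url}"><xsl:apply-templates/></a>
  </xsl:template>

  <!-- Conditional output: emit a named anchor only when the
       section actually carries an id attribute -->
  <xsl:template match="sect1">
    <xsl:if test="@id">
      <a name="{@id}"></a>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Sorting: hand the members to their template in
       alphabetical order of their text content -->
  <xsl:template match="simplelist">
    <ul>
      <xsl:apply-templates select="member">
        <xsl:sort select="."/>
      </xsl:apply-templates>
    </ul>
  </xsl:template>

  <xsl:template match="member">
    <li><xsl:apply-templates/></li>
  </xsl:template>
</xsl:stylesheet>
```

Elements that have no template here fall through to the built-in rules, so their text still reaches the output.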
For now, just look at the simple loop example in Listing 7:

Listing 7: XSLT template looping over child elements

<xsl:template match="simplelist">
  <ul>
  <xsl:for-each select="member">
    <li><xsl:apply-templates/></li>
  </xsl:for-each>
  </ul>
</xsl:template>

Rather than descending into every child of simplelist, we simply assume that the children are <member> elements. <xsl:for-each> works much like a nested template, and much like a loop construct in a programming language. The content of the <xsl:for-each> element appears in the output once for each child that matches the select attribute. Inside the loop, the current <member> element becomes the active node for the <xsl:apply-templates/> tag found there. That is, each list member may contain further markup, and we pass those elements on to their own templates (text nodes are simply output as text).

The future

This column has only lifted the veil on XSLT, but it should give you some sense of stylesheets and transformations. The Resources section lists many sources for further reading. In particular, you may benefit from working through the fuller XML and XSLT examples in this article's archive file. Stay tuned: this column will return to XSLT in various ways.

Resources

- Go to the XSL home page of the World Wide Web Consortium for a full description of the Extensible Stylesheet Language (XSL).
- The W3C XSLT 1.0 Recommendation gives an overview of the XML namespace mechanism. It is also the place to find the definition of the syntax and semantics of XSL Transformations (XSLT).
- XSLT.com provides a survey of many XSLT tools.
- The Sablotron XSL transformation processor (open source) is publicly available and makes a convenient basis for multi-platform XML applications.
- Take a close look at Norman Walsh's XSL stylesheets and survey of XSLT tools.
- Joe Brockmeier's "A Gentle Guide to DocBook" is a detailed introduction to SGML-Tools Lite. It is another way of formatting DocBook documents, one that uses DSSSL rather than XSLT.
- If you want to learn more about DSSSL, James Clark's Document Style Semantics and Specification Language (DSSSL) page is a good starting point.
- OASIS provides the "official" resources for DocBook.
- IBM alphaWorks' Xeena XML editor (free 90-day license) is a generic Java application for editing valid XML documents derived from any valid DTD.
- Read David Mertz's review of XML Spy on WebReview.com.
- For commercial XML editors, check out the XML Spy home page (Icon Information Systems), SoftQuad's XMetaL home page, and Extensibility's XML Instance.
- To validate an XML document, go to the Web-based XML validation form (source code and license available).
- The best place to start learning DocBook in more detail is DocBook: The Definitive Guide, Norman Walsh & Leonard Muellner, O'Reilly, Cambridge, MA 1999. Or take a look at its electronic version.
- The Organization for the Advancement of Structured Information Standards (OASIS) is a non-profit international consortium that "creates interoperable industry specifications" based on public standards such as XML and SGML.
- Download the files used and mentioned in this article.
- In Doug Tidwell's three-part series on developerWorks, learn more about XSLT transformations; it demonstrates how documents are converted into HTML, PDF, and SVG formats.
- Try the preview edition of Ken Holman's XSLT tutorial from Crane Softwrights.
- See also David Mertz's previous columns:
- XML Matters #1 introduces the Python xml_pickle object.
- XML Matters #2 describes how to use Python's xml_objectify.
- XML Matters #3 introduces DocBook.
- XML Matters #4 continues by describing how to convert an archive of old documents using DocBook.

About the author

David Mertz must have misplaced his MacGuffin in one of his other articles. It will surely turn up soon. You can reach David at mertz@gnosis.cx; his life is described in detail at http://gnosis.cx/publish/. Comments and suggestions on past, present, or future columns are very welcome.