This article is the record of a tutorial given at the XML Europe 2002 conference. It explains in detail the characteristics and uses of the various schema languages that can define the structure of XML documents. I have translated the article in three parts; this is the first part, which covers how rule-based schemas specify XML.
1. Introduction
What is an XML schema language?
I will dwell on this point during my comparison of XML schema languages on Wednesday morning, but one thing is certain: an XML schema language is probably not what you expect. Its main purpose is not (or not always) to describe a class of XML documents, but rather to act as a filter or firewall that protects applications from the wide variety of well-formed XML documents.
Throughout this tutorial we will use the following example:
<?xml version="1.0"?>
A program that manages the library described in this document, or even an XSLT stylesheet designed to display it, may be completely thrown off if the names or contents of the elements are not what it expects. One of the main roles of an XML schema language is to provide a formal way to describe what is expected and to protect programs from the risk of errors.
2. Rule-based languages (XSLT & Schematron)
The most basic implementation of this firewall is to give a set of rules that instance documents must satisfy.
This is the approach followed by rule-based XML schema languages, whose main representative is Schematron. Before presenting Schematron itself, we will look at how XSLT may be used as an XML schema language, since this is a good exercise for understanding the basics of these schema languages.
2.1. XSLT as a rule-based XML schema language
We can use "classical" programming languages to write a rule-based XML schema: either general-purpose languages using an XML API, or XML-specific languages such as XSLT or XQuery.
To illustrate this point, let's take the following very simple snippet of our example:
<?xml version="1.0"?>
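As a point of reference for the rules discussed below, a minimal document of this shape might look like the following sketch. Only the "library" and "book" element names and the notion of a book identifier come from the text; the attribute name "id" and its values are assumptions made for illustration.

```xml
<!-- Hypothetical instance document: a library holding two books,
     each carrying an identifier that is expected to be unique. -->
<library>
  <book id="b1"/>
  <book id="b2"/>
</library>
```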
Why so simple? Because we will see that, even though XSLT can be used as a rule-based XML schema language, it is quite verbose, and I do not want to spend all the time allotted to this tutorial developing our schema!
To write this schema, we have basically two options, the same ones we have when we configure a firewall: the closed one, where everything that is not allowed is forbidden, and the open one, where everything that is not forbidden is allowed. We will implement both schemas: an open XSLT schema (section 2.1.1) and a closed XSLT schema (section 2.1.2).
The first conclusion from this simple example is that XML applications tend to forbid much more than they allow: closed schemas are often easier to write than open schemas.
On the other hand, it is easier to define user-friendly error messages in an open schema, since the context in which something is forbidden is always known.
2.1.1. Open XSLT Schema
To implement an open schema with XSLT, we will start by defining a default template that accepts anything:
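A minimal sketch of such a default template is shown here, assuming XSLT 1.0. It is an illustration written for this purpose, and the choice of text output is an assumption.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The "schema" only reports problems, so it produces no markup. -->
  <xsl:output method="text"/>

  <!-- Default template: accept any element, attribute or text node
       and keep walking down the tree without reporting anything. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>
</xsl:stylesheet>
```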
With this single template, our "schema" would accept any well-formed XML document and never raise an error; we next need to add templates to define what is forbidden.
As with the design of any XSLT transformation, we can implement the tests either as conditions in the "match" attribute of templates or within the templates, using if or choose statements. When we use if or choose statements, we can also choose the location where the test is done.
To check that the document element is "library", we can, for instance, test in the match expression (section 2.1.1.1) or test within the template (section 2.1.1.2).
Now that we have set up the background, we can generalize it; a fairly complete "schema", including the test of the uniqueness of identifiers, is the full XSLT implementation of the open schema (section 2.1.1.3).
Note that we have left a degree of openness: arbitrary element and text nodes can be added to the book element.
2.1.1.1. Test in the match expression
We can write a template to allow library as the document element, but we then also need a template to forbid other document elements:
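The two templates could be sketched as follows, assuming XSLT 1.0. The explicit priority values and the message text are assumptions: in XSLT 1.0 the patterns "/library" and "/*" have the same default priority, so the sketch ranks them explicitly rather than relying on processor recovery behavior.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default template: everything is accepted. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- Allow "library" as the document element and keep checking below it. -->
  <xsl:template match="/library" priority="2">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- Forbid any other document element. -->
  <xsl:template match="/*" priority="1">
    <xsl:message>The document element must be "library", found "<xsl:value-of select="name()"/>".</xsl:message>
  </xsl:template>
</xsl:stylesheet>
```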
Or, alternatively, we can rely on the default template and replace both templates with a single template using a slightly more complex match expression:
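One way to write such a match expression is sketched below, assuming XSLT 1.0; the predicate and the message text are illustrative choices.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default template: everything is accepted, including a "library"
       document element, which needs no template of its own. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- Only a document element that is not "library" is reported. -->
  <xsl:template match="/*[not(self::library)]">
    <xsl:message>The document element must be "library", found "<xsl:value-of select="name()"/>".</xsl:message>
  </xsl:template>
</xsl:stylesheet>
```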
2.1.1.2. Test in the template
We can also perform the test in a template for the root of the document:
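A sketch of a test located in the root template, assuming XSLT 1.0 and an illustrative message:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default template: everything is accepted. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- The context node is the root: look down for a "library" child. -->
  <xsl:template match="/">
    <xsl:if test="not(library)">
      <xsl:message>The document element must be "library".</xsl:message>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```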
Or do the same test in a template for the document element:
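The same check, moved one level down, could be sketched like this (XSLT 1.0 assumed, message text illustrative):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default template: everything is accepted. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- The context node is the document element: test its own name. -->
  <xsl:template match="/*">
    <xsl:if test="not(self::library)">
      <xsl:message>The document element must be "library", found "<xsl:value-of select="name()"/>".</xsl:message>
    </xsl:if>
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>
</xsl:stylesheet>
```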
2.1.1.3. Full XSLT implementation of the open schema
<?xml version="1.0" encoding="UTF-8"?>
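A possible open schema along these lines is sketched below, with the tests placed inside the templates. It assumes XSLT 1.0 and a hypothetical "id" attribute as the book identifier; the exact set of checks and the messages are assumptions, chosen so that the book element stays open to additional element and text content.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default template: whatever is not forbidden below is allowed. -->
  <xsl:template match="*|@*|text()">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- The document element must be "library". -->
  <xsl:template match="/">
    <xsl:if test="not(library)">
      <xsl:message>The document element must be "library".</xsl:message>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- library: no attributes, no text, only "book" children. -->
  <xsl:template match="library">
    <xsl:if test="@*">
      <xsl:message>The "library" element must not have attributes.</xsl:message>
    </xsl:if>
    <xsl:if test="text()[normalize-space()]">
      <xsl:message>The "library" element must not contain text.</xsl:message>
    </xsl:if>
    <xsl:if test="*[not(self::book)]">
      <xsl:message>The "library" element may only contain "book" elements.</xsl:message>
    </xsl:if>
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- book: a unique "id" is required and no other attribute is allowed;
       child elements and text are left unconstrained. -->
  <xsl:template match="book">
    <xsl:if test="not(@id)">
      <xsl:message>A "book" element must have an "id" attribute.</xsl:message>
    </xsl:if>
    <xsl:if test="@id = preceding::book/@id">
      <xsl:message>Duplicate book identifier: <xsl:value-of select="@id"/></xsl:message>
    </xsl:if>
    <xsl:if test="@*[name() != 'id']">
      <xsl:message>The "book" element only accepts the "id" attribute.</xsl:message>
    </xsl:if>
    <xsl:apply-templates select="*|text()"/>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc, a sketch like this could be run as `xsltproc schema.xsl library.xml` (both file names are hypothetical); a valid document produces no message.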
2.1.2. Closed XSLT schema
A closed schema works the other way round and defines default templates that forbid everything (except possible "empty" text nodes):
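Such default templates might be sketched as follows, assuming XSLT 1.0; treating whitespace-only text nodes as "empty" and the wording of the messages are assumptions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default: any element or attribute is an error. -->
  <xsl:template match="*|@*">
    <xsl:message>Forbidden element or attribute: <xsl:value-of select="name()"/></xsl:message>
  </xsl:template>

  <!-- Default: any text node with visible content is an error. -->
  <xsl:template match="text()[normalize-space()]">
    <xsl:message>Forbidden text: <xsl:value-of select="normalize-space()"/></xsl:message>
  </xsl:template>

  <!-- Whitespace-only ("empty") text nodes are silently accepted. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```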
We then define everything that is allowed, which is in fact very few things:
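Put together, a closed schema for the simple snippet could look like the sketch below. It assumes XSLT 1.0 and a hypothetical "id" attribute on book; the more specific patterns take precedence over the forbidding defaults because of their higher default priority.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Defaults: everything is forbidden except whitespace-only text. -->
  <xsl:template match="*|@*">
    <xsl:message>Forbidden element or attribute: <xsl:value-of select="name()"/></xsl:message>
  </xsl:template>
  <xsl:template match="text()[normalize-space()]">
    <xsl:message>Forbidden text: <xsl:value-of select="normalize-space()"/></xsl:message>
  </xsl:template>
  <xsl:template match="text()"/>

  <!-- Allowed: "library" as the document element. -->
  <xsl:template match="/library">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- Allowed: "book" elements directly under "library". -->
  <xsl:template match="/library/book">
    <xsl:apply-templates select="*|@*|text()"/>
  </xsl:template>

  <!-- Allowed: the "id" attribute of such a book. -->
  <xsl:template match="/library/book/@id"/>
</xsl:stylesheet>
```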
2.2. Schematron
Technically speaking, Schematron is a concise formalization of one of the examples we have seen: it generates an XSLT transformation that is an open schema (everything that has not been forbidden is allowed) with the tests inside the templates.
That being said, XSLT is totally hidden from the schema user, who only needs to know the Schematron syntax and XPath, which is used to express the rules.
A Schematron schema is composed of a set of patterns; each pattern includes one or more rules, and each rule is composed of asserts and reports. To present the syntax used by Schematron, however, we will take it "bottom up" and start with asserts and reports before seeing how they are assembled into rules, patterns and schemas.
2.2.1. Assert(s) and report(s)
The "assert" and "report" elements are where the rules are defined in a Schematron schema. Both carry a "test" attribute, which is an XPath expression, and they differ in a couple of ways:
First, they are the opposite of each other: the test must be true to pass in an "assert" element and false to pass in a "report" element. Second, assert was originally intended more for "fatal errors" and report more for things that should just be "reported", but this distinction is no longer relevant in the latest version (1.5).
There are some goodies which we will not cover in this tutorial, but the basic syntax is:
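A minimal sketch of an assert, assuming the Schematron 1.5 namespace bound to the prefix "sch" (the prefix and the message text are illustrative):

```xml
<!-- Passes when the XPath test is true for the context node. -->
<sch:assert xmlns:sch="http://www.ascc.net/xml/schematron"
    test="library">The element "library" is missing.</sch:assert>
```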
Which raises an error with the corresponding message if there is no "library" element under the context node, or:
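A matching sketch of a report, under the same assumptions (Schematron 1.5 namespace, illustrative prefix and message):

```xml
<!-- Passes when the XPath test is false for the context node. -->
<sch:report xmlns:sch="http://www.ascc.net/xml/schematron"
    test="@*">Attributes are not allowed here.</sch:report>
```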
Which raises an error if there is any attribute under the context node.
In both cases, the context node is set by the "rule" element that is the parent of the report or assert element.
2.2.2. Rule(s)
Schematron rule elements are roughly equivalent to XSLT templates and are used to define the context in which a set of assert and report elements will be evaluated.
An example of a rule (without bells and whistles) performing the tests done in our open schema on the book element could be:
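A sketch of such a rule follows, assuming the Schematron 1.5 namespace and a hypothetical "id" attribute as the book identifier. The attribute checks are written as tests on the element, since the rule context is the book element itself.

```xml
<sch:rule xmlns:sch="http://www.ascc.net/xml/schematron" context="book">
  <!-- The identifier must be present. -->
  <sch:assert test="@id">A book must have an "id" attribute.</sch:assert>
  <!-- The identifier must not repeat one used by an earlier book. -->
  <sch:report test="@id = preceding::book/@id">The "id" of a book must be unique.</sch:report>
  <!-- No attribute other than "id" is accepted. -->
  <sch:report test="@*[name() != 'id']">A book only accepts the "id" attribute.</sch:report>
</sch:rule>
```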
Some notes about rules:
The context of a rule cannot be set to an attribute. The order in which the tests (report and assert elements) are evaluated is not guaranteed, and a series of tests terminates when the first error occurs.
2.2.3. Pattern(s)
Pattern elements are sets of rules that are evaluated independently (technically, using different modes in the XSLT stylesheet generated from the schema).
An example of a pattern roughly equivalent to our open XSLT schema could be:
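A sketch of such a pattern, assuming the Schematron 1.5 namespace, its "name" attribute on pattern, and a hypothetical "id" attribute on book; the pattern name and the messages are illustrative.

```xml
<sch:pattern xmlns:sch="http://www.ascc.net/xml/schematron"
    name="Open schema for the library">
  <!-- The document element must be "library". -->
  <sch:rule context="/*">
    <sch:assert test="self::library">The document element must be "library".</sch:assert>
  </sch:rule>
  <!-- library: no attributes, only "book" children. -->
  <sch:rule context="library">
    <sch:report test="@*">The "library" element must not have attributes.</sch:report>
    <sch:report test="*[not(self::book)]">The "library" element may only contain "book" elements.</sch:report>
  </sch:rule>
  <!-- book: a unique "id" is required. -->
  <sch:rule context="book">
    <sch:assert test="@id">A book must have an "id" attribute.</sch:assert>
    <sch:report test="@id = preceding::book/@id">The "id" of a book must be unique.</sch:report>
  </sch:rule>
</sch:pattern>
```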
One of the differences from what we implemented in XSLT is that Schematron stops the evaluation of a pattern after the first error found (following the order of the source tree); if we wanted to be able to raise several errors, we would have to spread our rules across several different patterns.
2.2.4. Schema
Finally, the schema element is the document element of a Schematron schema; it mainly contains a title and one or more patterns. To implement our rules within separate patterns, we could write:
<?xml version="1.0" encoding="UTF-8"?>
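A complete schema with one pattern per group of checks might be sketched like this. It assumes the Schematron 1.5 namespace and a hypothetical "id" attribute on book; the title, pattern names and messages are illustrative.

```xml
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:title>Open schema for the library (sketch)</sch:title>

  <!-- Each pattern is evaluated independently of the others. -->
  <sch:pattern name="Document element">
    <sch:rule context="/*">
      <sch:assert test="self::library">The document element must be "library".</sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern name="Content of library">
    <sch:rule context="library">
      <sch:report test="@*">The "library" element must not have attributes.</sch:report>
      <sch:report test="*[not(self::book)]">The "library" element may only contain "book" elements.</sch:report>
    </sch:rule>
  </sch:pattern>

  <sch:pattern name="Book identifiers">
    <sch:rule context="book">
      <sch:assert test="@id">A book must have an "id" attribute.</sch:assert>
      <sch:report test="@id = preceding::book/@id">The "id" of a book must be unique.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```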
Solving the same problem with XSLT and Schematron shows the nature of Schematron, which is a subset of XSLT tailored to XML validation through open rule-based schemas.
Why do I insist so much on the openness of Schematron schemas?
Because, although the default behavior of Schematron is to be open, it is still possible to write closed (or semi-closed) schemas with Schematron, even though this is not a common practice.
The main trick when doing so is to note that the rules within a Schematron pattern are evaluated in lexical order rather than according to the priority rules defined by XSLT. The default rules, which forbid any content not described by another rule, therefore need to be located after all the other rules, such as:
<?xml version="1.0" encoding="UTF-8"?>
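A closed schema of this kind could be sketched as follows, assuming the Schematron 1.5 namespace and a hypothetical "id" attribute on book. The catch-all rule comes last so that it only applies to elements that no earlier rule has matched; the title, pattern name and messages are illustrative.

```xml
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:title>Closed schema for the library (sketch)</sch:title>
  <sch:pattern name="Closed schema">
    <!-- Allowed: "library" as the document element, without attributes. -->
    <sch:rule context="/library">
      <sch:report test="@*">The element <sch:name/> must not have attributes.</sch:report>
    </sch:rule>
    <!-- Allowed: "book" under "library", with an "id" and nothing else. -->
    <sch:rule context="/library/book">
      <sch:assert test="@id">The element <sch:name/> must have an "id" attribute.</sch:assert>
      <sch:report test="@*[name() != 'id']">The element <sch:name/> only accepts the "id" attribute.</sch:report>
    </sch:rule>
    <!-- Default rule, placed last: any other element is forbidden. -->
    <sch:rule context="*">
      <sch:report test="true()">The element <sch:name/> is not allowed here.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```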
Note also the use of the "name" element, and the fact that rules cannot be defined for attributes.