Flavored XML Schema Technology: grammar-based language

zhaozj2021-02-16  53

3. Grammar-based languages (RELAX NG)

We have seen that a schema can be described as a set of rules formalized using a language such as Schematron or XSLT (other languages such as Prolog are probably good candidates too). The fact that this is possible does not prove that it is easy, and people have developed other, more specialized schema languages that describe the structure of the documents rather than the rules to apply to validate them.

RELAX NG is the main example of such languages, which are called "grammar based" since they describe documents in the manner of a BNF adapted to describing XML trees.

Although its syntax is very different from XPath, RELAX NG is all about named patterns allowed in the structure.

Getting started
Non-XML syntax
Identifier
Patterns
More features
Complete schema
When the order is not so important
Opening our schema
Other features

3.1. Getting started

The description of our simplified library could be:
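A sketch of this schema in the XML syntax, reconstructed from the non-XML form given in section 3.2 rather than taken from the original listing, might look like the following. The only assumption is the standard RELAX NG structure namespace on the root element.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <!-- document element -->
    <element name="library">
      <!-- zero or more book elements, each carrying an id attribute -->
      <zeroOrMore>
        <element name="book">
          <attribute name="id">
            <text/>
          </attribute>
        </element>
      </zeroOrMore>
    </element>
  </start>
</grammar>
```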

This schema reads almost as plain English and describes 'a grammar starting with a document element named "library" containing zero or more elements named "book" with an attribute named "id"'. It is equivalent to our closed XSLT schema accepting only "/library", "/library/book" and "/library/book/@id", except that the restriction that IDs be unique is not (yet) captured in our RELAX NG schema.

3.2. Non-XML syntax

The XML syntax of this schema is still quite verbose, and James Clark has proposed an equivalent yet more concise non-XML syntax. Using this syntax, our schema would become:

grammar {
  start = element library { element book { attribute id { text } }* }
}

This syntax has roughly the same meaning, except that (a) it is not XML and (b) some DTD goodies are used: here the "*" means "zero or more", and we will see more of these goodies in more complete examples later on.

3.3. Identifier

We are still behind what we had implemented with our XSLT or Schematron schemas, which did test the uniqueness of the book identifiers. Although it is generally impossible to implement with a grammar-based XML schema language all the constraints that can be expressed as rules, this example has been chosen so that we can find a way to define a nearly equivalent constraint with RELAX NG.

This is done through a set of definitions created to provide a degree of compatibility with XML DTD features, combined with RELAX NG's ability to interface with datatype systems.

The datatype system to use in this case is "http://relaxng.org/ns/compatibility/datatypes/1.0" and the datatype to use is "ID", since our "id" attributes can be considered as DTD ID attributes (they are globally unique all over a document and they match the XML "Name" production).

The amended schema to express this new constraint becomes:
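A sketch of the amended schema in the XML syntax, reconstructed from the non-XML form given later in this section rather than taken from the original listing. Placing the datatypeLibrary attribute on the "attribute" element is an assumption; any ancestor of "data" would work equally well.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="library">
      <zeroOrMore>
        <element name="book">
          <!-- the datatype library is inherited by the nested data element -->
          <attribute name="id"
                     datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
            <data type="ID"/>
          </attribute>
        </element>
      </zeroOrMore>
    </element>
  </start>
</grammar>
```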

The syntax is still straightforward: the attribute is now specified as holding data of type "ID" from the datatype library "http://relaxng.org/ns/compatibility/datatypes/1.0", which is selected through the datatypeLibrary attribute of an ancestor of the "data" element.

The non-XML syntax uses a namespace prefix declaration (also available in the XML syntax) and becomes:

datatypes dtd = "http://relaxng.org/ns/compatibility/datatypes/1.0"
grammar {
  start = element library { element book { attribute id { dtd:ID } }* }
}

We will see later on that these datatypes are not without side effects: their purpose is to provide compatibility with DTDs, but this emulation of DTDs also affects the flexibility of RELAX NG.

3.4. Patterns

Throughout our brief experience with RELAX NG, we have been manipulating patterns, and it is worth going back to look at this truly fundamental concept.

The basic thing to note is that when we write something such as "element library { element book { attribute id { dtd:ID } }* }", we are not giving definitions of what the elements "library" and "book" and the attribute "id" are, but defining a pattern of nodes which may appear in the documents.

In this respect, we are much closer here to the schemas we have written with XSLT or Schematron than to the schemas we will write later on with W3C XML Schema. The meaning of the pattern defined above is "accept here an element node library with children element nodes book having an id attribute holding data of type ID".

The nodes manipulated in this pattern are always anonymous, which means that we cannot make a reference to these nodes elsewhere in the schema. What is possible, though, is to define global named patterns (akin to named templates in an XSLT transformation) and to refer to these patterns in other patterns.

The syntax to define a named pattern holding the set of book elements would be:
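A sketch of such a definition in the XML syntax, reconstructed from its non-XML counterpart rather than taken from the original listing. It assumes the fragment sits inside a "grammar" element (the namespace is repeated here only to keep the fragment well-formed on its own) and that the datatype library is declared on the "attribute" element.

```xml
<define name="bookElements" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- the named pattern holds the whole repeated set, not a single book -->
  <zeroOrMore>
    <element name="book">
      <attribute name="id"
                 datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
        <data type="ID"/>
      </attribute>
    </element>
  </zeroOrMore>
</define>
```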

In the non-XML syntax:

bookElements = element book { attribute id { dtd:ID } }*

And a reference to this pattern would be:
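A sketch of the reference in the XML syntax, reconstructed from its non-XML counterpart. It assumes a named pattern called "bookElements" is defined elsewhere in the same grammar.

```xml
<start xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="library">
    <!-- ref pulls in the content of the named pattern at this point -->
    <ref name="bookElements"/>
  </element>
</start>
```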

In the non-XML syntax:

start = element library { bookElements }

Note that there is no restriction on the "content" located in named patterns. We have chosen here to include a set of zero or more book elements, but we could also have created patterns to include a single book element or the id attribute. In every case, named patterns are containers, and even when a named pattern contains a single element, it is a pattern containing a single element rather than the definition of this element.

3.5. More features

It is now time to add some more elements to explore more features of RELAX NG. Let's describe the "author" element:

<author id="...">
  <name>Charles M. Schulz</name>
  <nickName>Sparky</nickName>
  <born>1922-11-26</born>
  <dead>2000-02-12</dead>
</author>

Since the definition of the id attribute is common to several elements, we can isolate it in a pattern:
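A sketch of this pattern in the XML syntax, reconstructed from its non-XML counterpart. The placement of the datatypeLibrary attribute on the "attribute" element is an assumption, and the fragment is meant to sit inside a "grammar" element.

```xml
<define name="idAttribute" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- shared by every element that needs a DTD-style ID -->
  <attribute name="id"
             datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
    <data type="ID"/>
  </attribute>
</define>
```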

In the non-XML syntax:

idAttribute = attribute id { dtd:ID }

The description of the author element is straightforward using the few features which we have already seen:
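A sketch of the author element in the XML syntax, reconstructed from its non-XML counterpart. It assumes a named pattern "idAttribute" is defined elsewhere in the same grammar.

```xml
<element name="author" xmlns="http://relaxng.org/ns/structure/1.0">
  <ref name="idAttribute"/>
  <!-- every sub-element accepts any text node -->
  <element name="name"><text/></element>
  <element name="nickName"><text/></element>
  <element name="born"><text/></element>
  <element name="dead"><text/></element>
</element>
```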

In the non-XML syntax:

element author {
  idAttribute,
  element name { text },
  element nickName { text },
  element born { text },
  element dead { text }
}

Note that we have defined all the sub-elements as "text", meaning that they can hold any text node. We could also use a datatype library such as the W3C XML Schema datatype library, which we can define as the default datatype library since we have defined the datatype library used for the ID attribute in the type definition itself.

The definition then involves choosing the right type for each of the elements. Here, for instance, we have been lucky enough to have dates expressed in the ISO 8601 date format supported by W3C XML Schema and can use this type in our schema. For string types, we need to distinguish between "token" and "string" depending on the behavior we want with respect to space normalization (token applies full space normalization and trimming, while string applies none). Depending on these choices, our definition might become:
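A sketch of the typed definition in the XML syntax, reconstructed from its non-XML counterpart. It assumes the W3C XML Schema datatype library is set as the default on the "author" element and that a named pattern "idAttribute", carrying its own datatype library, is defined elsewhere in the grammar.

```xml
<element name="author"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <ref name="idAttribute"/>
  <!-- token: whitespace is normalized and trimmed before comparison -->
  <element name="name"><data type="token"/></element>
  <element name="nickName"><data type="token"/></element>
  <!-- date: ISO 8601 calendar dates such as 1922-11-26 -->
  <element name="born"><data type="date"/></element>
  <element name="dead"><data type="date"/></element>
</element>
```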

In the non-XML syntax:

element author {
  idAttribute,
  element name { xs:token },
  element nickName { xs:token },
  element born { xs:date },
  element dead { xs:date }
}

3.6. Complete Schema

Writing the full schema for our example is pretty much a repetition of the same process:
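A sketch of the full schema in the XML syntax, reconstructed from its non-XML counterpart in this section rather than taken from the original listing. The layout, the default datatype library on the "grammar" element and the local override on the two id attributes are assumptions.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="library">
      <!-- any mix of books, authors and characters, repeated freely -->
      <zeroOrMore>
        <choice>
          <ref name="bookElement"/>
          <ref name="authorElement"/>
          <ref name="characterElement"/>
        </choice>
      </zeroOrMore>
    </element>
  </start>

  <define name="idAttribute">
    <attribute name="id"
               datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
      <data type="ID"/>
    </attribute>
  </define>

  <define name="idrefAttribute">
    <attribute name="id"
               datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
      <data type="IDREF"/>
    </attribute>
  </define>

  <define name="bookElement">
    <element name="book">
      <ref name="idAttribute"/>
      <element name="isbn"><data type="token"/></element>
      <element name="title"><data type="token"/></element>
      <zeroOrMore>
        <element name="author-ref"><ref name="idrefAttribute"/></element>
      </zeroOrMore>
      <zeroOrMore>
        <element name="character-ref"><ref name="idrefAttribute"/></element>
      </zeroOrMore>
    </element>
  </define>

  <define name="authorElement">
    <element name="author">
      <ref name="idAttribute"/>
      <element name="name"><data type="token"/></element>
      <element name="nickName"><data type="token"/></element>
      <element name="born"><data type="date"/></element>
      <element name="dead"><data type="date"/></element>
    </element>
  </define>

  <define name="characterElement">
    <element name="character">
      <ref name="idAttribute"/>
      <element name="name"><data type="token"/></element>
      <element name="since"><data type="date"/></element>
      <element name="qualification"><data type="string"/></element>
    </element>
  </define>
</grammar>
```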

In the non-XML syntax:

datatypes dtd = "http://relaxng.org/ns/compatibility/datatypes/1.0"
datatypes xs = "http://www.w3.org/2001/XMLSchema-datatypes"

grammar {
  start = element library { (bookElement | authorElement | characterElement)* }
  idAttribute = attribute id { dtd:ID }
  idrefAttribute = attribute id { dtd:IDREF }
  bookElement = element book {
    idAttribute,
    element isbn { xs:token },
    element title { xs:token },
    element author-ref { idrefAttribute }*,
    element character-ref { idrefAttribute }*
  }
  authorElement = element author {
    idAttribute,
    element name { xs:token },
    element nickName { xs:token },
    element born { xs:date },
    element dead { xs:date }
  }
  characterElement = element character {
    idAttribute,
    element name { xs:token },
    element since { xs:date },
    element qualification { xs:string }
  }
}

Note the use, in the definition of the "library" element, of the "choice" element (XML syntax), represented in the non-XML syntax by the "|" operator. The meaning of this compositor is to allow only one possibility within a list. Here, the choice has "zeroOrMore" (or "*" in the non-XML syntax) occurrences, which means that the choice may be repeated indefinitely.

3.7. When the order is not so important

There are cases when the relative order between elements does not matter for the application. For instance, one may wonder what the point is of constraining the order of the sub-elements of "author" and requiring document writers to write:

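<author id="...">
  <name>Charles M. Schulz</name>
  <nickName>Sparky</nickName>
  <born>1922-11-26</born>
  <dead>2000-02-12</dead>
</author>

rather than:

<author id="...">
  <name>Charles M. Schulz</name>
  <dead>2000-02-12</dead>
  <born>1922-11-26</born>
  <nickName>Sparky</nickName>
</author>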

After all, the elements have names, and it is not much more complex to write applications which will retrieve the information they need whatever the order of the sub-elements is. So why should we bother document writers with respecting a fixed order?

RELAX NG allows such definitions without any restriction through the use of the "interleave" element (XML syntax) or the "&" operator (non-XML syntax). The updated definition of the author element, removing the restriction on the order of the sub-elements, would be:
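A sketch of the unordered definition in the XML syntax, reconstructed from its non-XML counterpart. It assumes the W3C XML Schema datatype library is the default on the "author" element and that "idAttribute" is defined elsewhere in the grammar.

```xml
<element name="author"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- interleave: each child pattern must match, in any relative order -->
  <interleave>
    <ref name="idAttribute"/>
    <element name="name"><data type="token"/></element>
    <element name="nickName"><data type="token"/></element>
    <element name="born"><data type="date"/></element>
    <element name="dead"><data type="date"/></element>
  </interleave>
</element>
```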

In the non-XML syntax:

element author {
  idAttribute
  & element name { xs:token }
  & element nickName { xs:token }
  & element born { xs:date }
  & element dead { xs:date }
}

Note that this applies even when the number of occurrences of some of the sub-elements is greater than one, as for our "book" element:

bookElement = element book {
  idAttribute
  & element isbn { xs:token }
  & element title { xs:token }
  & element author-ref { idrefAttribute }*
  & element character-ref { idrefAttribute }*
}

3.8. Opening our schema

If we come back to our highly simplified example with only "library" and "book" elements, we have achieved a pretty good equivalence with the closed schemas previously developed with XSLT and Schematron, and you may wonder if we can open our schema to allow arbitrary text and element nodes within our book element, as we were able to do before.

The first step to do so is to define an open pattern accepting any element. There is no predefined pattern to do so, but that is not a big deal with all that we have seen so far and a new goodie, the "anyName" element, which implements name wildcards (or "*" in the non-XML syntax):
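A sketch of this open pattern in the XML syntax, reconstructed from its non-XML counterpart. The fragment is meant to sit inside a "grammar" element; the namespace is repeated here only to keep it well-formed on its own.

```xml
<define name="anyElement" xmlns="http://relaxng.org/ns/structure/1.0">
  <element>
    <!-- name wildcard: matches an element with any name -->
    <anyName/>
    <zeroOrMore>
      <choice>
        <!-- any attribute holding any text -->
        <attribute>
          <anyName/>
          <text/>
        </attribute>
        <text/>
        <!-- recursion: nested elements are open as well -->
        <ref name="anyElement"/>
      </choice>
    </zeroOrMore>
  </element>
</define>
```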

In the non-XML syntax:

anyElement = element * { (attribute * { text } | text | anyElement)* }

The other thing to note is that recursive patterns are allowed when the recursion happens within an element.

The surprise comes when we try to use this named pattern in our book element:

In the non-XML syntax:

element book { attribute id { dtd:ID }, anyElement* }

The schema is then detected as invalid with the following error:

Error at URL ... line number 5, column number 22: conflicting ID-types for attribute "id" of element "book"

We have been hit by a side effect of the DTD compatibility library used for our id attribute. To make sure that this is not a limitation of the RELAX NG language itself, we can just change the definition of these attributes to be plain text:

In the non-XML syntax:

element book { attribute id { text }, anyElement* }

And our schema becomes valid.

What is happening here is that, to emulate the behavior of a DTD, RELAX NG imposes that if an ID attribute is defined somewhere in an element, the same ID attribute must be defined in all the other definitions of this element. This is not the case in the definition of our "anyElement" pattern, which may, through the wildcard, include a "book" element that does not include a mandatory id attribute with the type dtd:ID.

To work around this issue, we may either avoid using the ID type as shown above or, if we want to use this type, exclude or handle separately the case of a book element included as a sub-element of the top-level book. This exclusion can be done through the "except" and "name" elements (or the "-" operator in the non-XML syntax):
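A sketch of the wildcard with an exclusion in the XML syntax, reconstructed from its non-XML counterpart. As with the other fragments, it is meant to sit inside a "grammar" element, and the namespace is repeated only to keep it well-formed on its own.

```xml
<define name="anyElement" xmlns="http://relaxng.org/ns/structure/1.0">
  <element>
    <!-- any element name except "book" -->
    <anyName>
      <except>
        <name>book</name>
      </except>
    </anyName>
    <zeroOrMore>
      <choice>
        <attribute>
          <anyName/>
          <text/>
        </attribute>
        <text/>
        <ref name="anyElement"/>
      </choice>
    </zeroOrMore>
  </element>
</define>
```

With "book" excluded from the wildcard, the only pattern that can match a book element is the one declaring its id attribute as dtd:ID, so the ID-type declarations no longer conflict.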

In the non-XML syntax:

anyElement = element * - book { (attribute * { text } | text | anyElement)* }

3.9. Other features

RELAX NG has some other nice features which we will not cover here and which are detailed in the very good tutorial available on its web site (http://relaxng.org), such as:

Schema composition and pattern redefinitions
Namespace support
Annotations
List of values
