Understanding XML Schema
Release Date: 4/13/2004 | Update Date: 4/13/2004
Aaron Skonnard
DevelopMentor
March 2003
Applies to:
Type systems
XML Schema Definition language (XSD)
Web services development
On this page: Introduction | Datatypes: Value and Lexical Spaces | Defining Types in Namespaces | Defining Simple Types | Defining Complex Types | Locating and Managing Schemas | Conclusion
Summary: XML Schema is expected to play a central role in the future of XML processing, especially in Web services, where it serves as one of the fundamental pillars on which higher levels of abstraction are built. This article explains how to use the XML Schema definition language. (22 printed pages)
Introduction
1 + 2 = ?
In software, the information you need to answer this kind of question is provided by a type system. Programming languages use type systems to simplify the task of producing quality code. A type system defines a set of types and operations that developers can use in their programs. A type defines a value space, in other words, a set of possible values. For example, if the operands above are treated as numeric types, the answer is probably 3, but if they are treated as strings, the answer might be "12", depending on how the "+" operator is defined.
One of the main benefits of a type system is that compilers can use it to detect errors before the program runs, which avoids a large class of bugs. Compilers can also use type information to generate the operation code for a given type. In addition, both compilers and runtimes rely on the type system to determine how to allocate memory when a particular type is used, which frees developers from this tedious work.
Many languages and runtimes also make type information available programmatically at runtime. This lets developers inspect types, ask questions about a type's characteristics, and make decisions based on the answers. This technique of inspecting type information at runtime is commonly referred to as reflection. Reflection plays an important role in today's mainstream programming environments (e.g., the Microsoft .NET Framework and Java), where it significantly reduces what developers have to handle in their own code. In these environments, the virtual machine (e.g., the common language runtime or the JVM) provides additional services that most programs need (e.g., security, garbage collection, serialization, remote method invocation, and even Web service integration).
Figure 1. Benefits of type information
A well-defined type system and reflection also make it possible to build better tools for working with the language. Developers have quickly become accustomed to things like Microsoft® IntelliSense®, code completion, and those red squiggles that greatly speed up the development process. Overall, a good type system offers many interesting benefits (see Figure 1), most of which are easy to take for granted but sorely missed when they are not there.
XML 1.0 is a classic example of a language that lacks a capable type system. Without a type system, the information found in an XML 1.0 document can only be treated as text. This requires developers to know the "real types" ahead of time so they can perform the necessary coercions in their code.
The XML Schema Definition language (XSD) provides a type system for XML processing environments. In a nutshell, an XML Schema describes the types you intend to use. An XML document that conforms to an XML Schema is typically referred to as an instance document, much like the traditional object-oriented (OO) relationship between classes and objects (see Figure 2). This is a conceptual shift from the way Document Type Definitions (DTDs) work, and it offers much more flexibility when mapping to traditional programming language or database type systems. In these environments, XML Schema has largely made the use of DTDs obsolete.
Figure 2. OO and XML concepts
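To make the class/object analogy concrete, here is a minimal sketch of a schema together with an instance that conforms to it. The element name note and its xsd:string type are assumptions chosen purely for illustration and do not come from the article's listings; namespaces are left out to keep the sketch small.

```xml
<!-- The schema plays the role of the class. "note" is a hypothetical element name. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="note" type="xsd:string"/>
  <!-- An instance document plays the role of the object, for example:
       <note>Call the publisher</note> -->
</xsd:schema>
```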
XML Schema makes it possible to deliver all of the benefits shown in Figure 1, only in a completely XML-centric way. A logical XML document that contains XML Schema type information is often referred to as a post-schema-validation Infoset (PSVI). The PSVI makes it possible to perform XML Schema-based reflection at runtime, just as in other programming environments. Overall, XML Schema is expected to play a central role in the future of XML processing, especially in Web services, where it serves as one of the fundamental pillars on which higher levels of abstraction are built. The remainder of this article describes how to use the XML Schema definition language in more detail.
Datatypes: Value and Lexical Spaces
XML Schema provides a list of built-in datatypes that developers can use to constrain text (see W3C XML Schema Part 2: Datatypes for reference). All of these types are found in the http://www.w3.org/2001/XMLSchema namespace. Each type has a defined value space. A type's value space is simply the set of values that may be used in an instance of the given type.
Figure 3. Byte value space
For example, XML Schema provides a built-in type named byte with a value space from -128 to 127. Another example is XML Schema's boolean type, which is very simple because it has only two values: true and false. There are 44 built-in types to choose from, each with a different value space to suit different data modeling needs.
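As a small illustration of built-in types in use, the sketch below declares two elements typed as xsd:byte and xsd:boolean. The element names offset and inPrint are hypothetical; only the two built-in types and their value spaces come from the text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element names; the types are the built-ins discussed above -->
  <xsd:element name="offset" type="xsd:byte"/>
  <xsd:element name="inPrint" type="xsd:boolean"/>
  <!-- Valid:   <offset>-128</offset>  <offset>127</offset>  <inPrint>true</inPrint> -->
  <!-- Invalid: <offset>128</offset> (outside the byte value space)
                <inPrint>yes</inPrint> (not a boolean value) -->
</xsd:schema>
```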
Figure 4 illustrates that many of the built-in types are defined as subsets of another type's value space, which is also known as derivation by restriction. For example, byte's value space is a subset of short's value space, which is a subset of int's value space, which is in turn a subset of long's value space, and so on. Basic set theory therefore tells us that an instance of a derived type is also a valid instance of any of its ancestor types. (Strictly speaking, they are all subsets of anySimpleType itself.)
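The subset relationship can be pictured as a narrowing of the parent type's range. As a sketch, the user-defined type below restricts xsd:short to the same range as the built-in byte type, so every value it accepts is also a valid xsd:short. The name ByteLike is hypothetical, and the xsd:restriction syntax it relies on is covered in the section on simple types.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- ByteLike is a hypothetical name: its value space is a subset of xsd:short -->
  <xsd:simpleType name="ByteLike">
    <xsd:restriction base="xsd:short">
      <xsd:minInclusive value="-128"/>
      <xsd:maxInclusive value="127"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```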
Although programming languages use value space information to work out how much memory is required, developers rarely need to worry about how values are represented as text. With XML, however, you cannot ignore the fact that instances will most likely be serialized to an XML 1.0 file, which requires a lexical representation of each value. If every XML Schema processor decided for itself how to do this, interoperability would quickly be lost. Hence, in addition to defining each type's value space, XML Schema also defines the lexical representations that are allowed.
Figure 4. Type subset
For example, the boolean value true may be represented as "true" or "1", and the boolean value false may be represented as "false" or "0". The double value 10 may be represented as "10", "10.0", or "10.0000", and even as "0.01E3". The date "January 1, 2003" is represented lexically as "2003-01-01". Because the lexical formats of every type (and all of their possible variations) are standardized, developers can ignore the complexity of how values are actually serialized and work directly with values in their code.
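The instance fragment below sketches these lexical forms side by side. The element names samples, inPrint, price, and published are hypothetical, and the fragment assumes they are declared as xsd:boolean, xsd:double, and xsd:date respectively.

```xml
<!-- Hypothetical instance fragment: assumes inPrint is xsd:boolean,
     price is xsd:double, and published is xsd:date -->
<samples>
  <!-- two lexical forms of the boolean value true -->
  <inPrint>true</inPrint>
  <inPrint>1</inPrint>
  <!-- four lexical forms of the double value 10 -->
  <price>10</price>
  <price>10.0</price>
  <price>10.0000</price>
  <price>0.01E3</price>
  <!-- the date January 1, 2003 -->
  <published>2003-01-01</published>
</samples>
```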
Defining Types in Namespaces
In addition to providing built-in types, most programming languages also let developers define their own types, typically referred to as user-defined types (UDTs). When defining UDTs, most programming languages let you qualify them with a namespace so they are not confused with other UDTs that happen to have the same name. For more information on how XML namespaces work, see Understanding XML Namespaces. Figure 5 shows a C# namespace definition and an equivalent XML Schema definition. As you can see, XML Schema also supports defining types in a namespace.
Figure 5. Defining types in a namespace
The xsd:schema element determines the scope of the namespace, and the targetNamespace attribute specifies the namespace name. For example, the following XML Schema template defines a new namespace named http://example.org/publishing:
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.org/publishing"
        xmlns:tns="http://example.org/publishing">
      <xsd:simpleType name="AuthorId">
        ...
      </xsd:simpleType>
      <xsd:complexType name="AuthorType">
        ...
      </xsd:complexType>
      <xsd:element name="author" type="tns:AuthorType"/>
      <xsd:attribute name="authorId" type="tns:AuthorId"/>
    </xsd:schema>
Everything placed within the xsd:schema element (as an immediate child) is considered global and is therefore automatically associated with the target namespace. In the example above, the http://example.org/publishing namespace contains four things: AuthorId, AuthorType, author, and authorId. Hence, the namespace-qualified name must be used whenever one of them is referenced in the schema. To use a namespace-qualified name, you need another namespace declaration that maps to the schema's target namespace. That is the purpose of the "tns" namespace declaration shown above. So whenever I need to refer to something defined in the schema, I can prefix its name with "tns", as this example shows.
There are two kinds of types you can define within the xsd:schema element: simple types (using xsd:simpleType) and complex types (using xsd:complexType). Simple types can only be assigned to text-only elements and attributes because they do not define structure, only value spaces. Elements with additional structure (e.g., elements that carry attributes or child elements) must be defined with complex types.
In addition to types, you can define global elements (using xsd:element) and attributes (using xsd:attribute) within the schema and assign them a type. In the example above, I defined a global element named author and a global attribute named authorId. Because these constructs are also global, they must be qualified when I use them in an instance document. The following XML document contains an instance of the author element defined above:
    <x:author xmlns:x="http://example.org/publishing">
      ...
    </x:author>
The following XML document contains the global authorId attribute:
    <... xmlns:x="http://example.org/publishing"
        x:authorId="333-33-3333" />
You can also explicitly specify the type of an element in the instance document using the type attribute from the http://www.w3.org/2001/XMLSchema-instance namespace. This namespace contains a handful of attributes that may only be used in instance documents. Using the type attribute is similar to casting between types in some programming languages. The following example explicitly assigns the AuthorId type to the genericId element (defined in the schema):
    <genericId
        xmlns:x="http://example.org/publishing"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:type="x:AuthorId">333-33-3333</genericId>
Notice that AuthorId is the same type we assigned to the global authorId attribute shown above. This illustrates that simple types can be assigned to either attributes or text-only elements to constrain their values. Note also that the xsi:type technique for specifying types applies only to elements, not to attributes.
Defining Simple Types
Most programming languages only let developers arrange the built-in types into structured types; they do not let developers define new simple types with user-defined value spaces. XML Schema is different in this respect because it lets users define their own custom simple types whose value spaces are subsets of a predefined built-in type. As shown earlier, you define a new simple type using the xsd:simpleType element. Within xsd:simpleType, you specify the base type whose value space you want to restrict (using the xsd:restriction element). Within xsd:restriction, you specify precisely how you want to restrict the base type by constraining one or more facets. For example, the following simple types constrain the value spaces of xsd:double and xsd:date using the xsd:minInclusive and xsd:maxInclusive facets:
    ...
    <xsd:simpleType name="...">
      <xsd:restriction base="xsd:double">
        <xsd:minInclusive value="..."/>
        <xsd:maxInclusive value="..."/>
      </xsd:restriction>
    </xsd:simpleType>
    <xsd:simpleType name="...">
      <xsd:restriction base="xsd:date">
        <xsd:minInclusive value="..."/>
        <xsd:maxInclusive value="..."/>
      </xsd:restriction>
    </xsd:simpleType>
    ...
The following document contains a valid instance of one of the elements defined above:
    <x:publicationDate xmlns:x="http://example.org/publishing">2003-06-01</x:publicationDate>
XML Schema defines the facets that are available for each type (see Table 1). Most facets do not apply to all types (some only make sense for certain types). Most facets constrain the type's value space, while the pattern facet constrains the type's lexical space. Constraining either the value space or the lexical space indirectly constrains the other. The previous example constrained the value space of the base types, and the next example constrains the lexical space of strings using regular expressions:
    ...
    <xsd:simpleType name="SSN">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="..."/>
      </xsd:restriction>
    </xsd:simpleType>
    <xsd:simpleType name="PublisherAssignedId">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="..."/>
      </xsd:restriction>
    </xsd:simpleType>
    <xsd:simpleType name="...">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="..."/>
      </xsd:restriction>
    </xsd:simpleType>
    ...
The following documents contain valid instances of the elements defined above:
    <x:authorId xmlns:x="http://example.org/publishing">123-45-6789</x:authorId>
    <x:pubsAuthorId xmlns:x="http://example.org/publishing">01-23456789</x:pubsAuthorId>
    <x:phone xmlns:x="http://example.org/publishing">(801) 390-4552</x:phone>
A string that matches the regular expression (specified in the pattern facet) is considered a valid instance of the given type.
    Facet element         Description
    xsd:enumeration       Specifies a fixed value that the type must match.
    xsd:fractionDigits    Specifies the maximum number of decimal digits to the right of the decimal point.
    xsd:length            Specifies the number of characters in a string-based type, the number of octets in a binary-based type, or the number of items in a list-based type.
    xsd:maxExclusive      Specifies an exclusive upper bound on the type's value space.
    xsd:maxInclusive      Specifies an inclusive upper bound on the type's value space.
    xsd:maxLength         Specifies the maximum number of characters in a string-based type, the maximum number of octets in a binary-based type, or the maximum number of items in a list-based type.
    xsd:minExclusive      Specifies an exclusive lower bound on the type's value space.
    xsd:minInclusive      Specifies an inclusive lower bound on the type's value space.
    xsd:minLength         Specifies the minimum number of characters in a string-based type, the minimum number of octets in a binary-based type, or the minimum number of items in a list-based type.
    xsd:pattern           Specifies a pattern, based on a regular expression, that the type must match.
    xsd:totalDigits       Specifies the maximum number of decimal digits for types derived from number.
    xsd:whiteSpace        Specifies rules for whitespace normalization.
Table 1. Facets
Another interesting facet is xsd:enumeration, which lets you constrain the value space to a list of enumerated values. The following example constrains the value space of xsd:NMTOKEN to four specific enumerated values:
    ...
    <xsd:simpleType name="...">
      <xsd:restriction base="xsd:NMTOKEN">
        <xsd:enumeration value="..."/>
        <xsd:enumeration value="..."/>
        <xsd:enumeration value="..."/>
        <xsd:enumeration value="Online"/>
      </xsd:restriction>
    </xsd:simpleType>
    ...
The following document contains a valid instance of the element defined above:
    <x:pubType xmlns:x="http://example.org/publishing">Online</x:pubType>
    Derivation element    Description
    xsd:restriction       The new type is a restriction of an existing type, which means it has a narrower set of legal values.
    xsd:list              The new type is a whitespace-delimited list of another simple type.
    xsd:union             The new type is a union of two or more other simple types.
Table 2. Simple type construction techniques
In addition to restricting a type's value space, you can also construct new simple types that are lists or unions of other simple types. To do this, you use the xsd:list or xsd:union element instead of xsd:restriction (see Table 2). When you use xsd:list, you are essentially defining a whitespace-delimited list of values from the specified value space. It is worth noting that using xsd:list or xsd:union does not create a derivation hierarchy the way xsd:restriction does, so type compatibility does not apply in these cases. The following example defines a new type named AuthorList as a list of SSN values:
    ...
    <xsd:simpleType name="AuthorList">
      <xsd:list itemType="tns:SSN"/>
    </xsd:simpleType>
    ...
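For readers who want something they can run through a validating processor, the sketch below fills in the pieces around AuthorList in one self-contained schema. The SSN pattern is an assumption inferred from the sample values in this article (three digits, two digits, four digits, separated by hyphens), and the article's own pattern may differ. The type names and the authors element follow the surrounding text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/publishing"
            xmlns:tns="http://example.org/publishing">
  <!-- Assumed pattern: matches values shaped like 123-45-6789 -->
  <xsd:simpleType name="SSN">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-\d{2}-\d{4}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- A whitespace-delimited list whose items must each be a valid SSN -->
  <xsd:simpleType name="AuthorList">
    <xsd:list itemType="tns:SSN"/>
  </xsd:simpleType>
  <!-- Global element that carries the list as its text content -->
  <xsd:element name="authors" type="tns:AuthorList"/>
</xsd:schema>
```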
The following document contains a valid instance of the authors element:
    <x:authors xmlns:x="http://example.org/publishing">111-11-1111 222-22-2222 333-33-3333 444-44-4444</x:authors>
With xsd:union, you create a new type that combines multiple value spaces into a new value space. An instance of a union type may be a value from any of the specified value spaces. For example, the following type, named AuthorId, combines the SSN value space with the PublisherAssignedId value space:
    ...
    <xsd:simpleType name="AuthorId">
      <xsd:union memberTypes="tns:SSN tns:PublisherAssignedId"/>
    </xsd:simpleType>
    ...
Each of the following documents shows a valid instance of the authorId element:
    <x:authorId xmlns:x="http://example.org/publishing">111-11-1111</x:authorId>
    <x:authorId xmlns:x="http://example.org/publishing">22-22222222</x:authorId>
XML Schema's support for user-defined simple types (and, more specifically, custom value and lexical spaces) is one of the more powerful aspects of the language. Since most programming languages do not offer this support, developers typically have to deal with such issues in their application code (often through property setters). The ability to define custom value and lexical spaces that meet your exact needs takes much of the difficulty out of error-handling and validation code.
Defining Complex Types
XML Schema lets you arrange different simple types (or value spaces) into a structure, also known as a complex type. You use the xsd:complexType element to define a new complex type in the schema's target namespace, as shown here:
    ...
    <xsd:complexType name="AuthorType">
      ...
    </xsd:complexType>
    ...
The xsd:complexType element contains what is known as a compositor, which describes the composition of the type's content, also known as its content model. XML Schema defines three compositors that may be used in complex type definitions: xsd:sequence, xsd:choice, and xsd:all (see Table 3). Compositors contain particles, which include things like other compositors, element declarations, wildcards, and model groups. Attribute declarations are not considered particles because they cannot repeat.
Hence, attribute declarations are not placed within the compositor but rather after it, at the end of the complex type definition.
    Compositor      Description
    xsd:sequence    An ordered sequence of the contained particles.
    xsd:choice      A choice among the contained particles.
    xsd:all         All of the contained particles, in any order.
Table 3. Complex type compositors
Element declarations (xsd:element) are probably the most commonly used particles. The following complex type, AuthorType, defines an ordered sequence of two child elements plus an attribute (the child elements and the attribute are each assigned a type here):
    ...
    <xsd:complexType name="AuthorType">
      <xsd:sequence>
        <xsd:element name="name" type="..."/>
        <xsd:element name="phone" type="..."/>
      </xsd:sequence>
      <xsd:attribute name="id" type="..."/>
    </xsd:complexType>
Elements and attributes declared within the xsd:complexType element are considered local to the complex type. Local elements and attributes can only be used within the context in which they are defined. This raises an interesting question: do local elements and attributes need to be namespace-qualified in instance documents? Since local elements and attributes are always contained by an ancestor that is qualified by the target namespace (usually a global element), one can argue that it is not necessary to qualify them. This is similar to how many programming languages work: if you define a class in a namespace, only the class name is qualified by the namespace, not its local members. For this reason, local elements and attributes are unqualified by default in XML Schema.
Hence, a valid instance of the author element looks like this:
    <x:author xmlns:x="http://example.org/publishing"
        id="333-33-3333">
      <name>...</name>
      <phone>...</phone>
    </x:author>
However, XML Schema lets you state explicitly whether a given local element or attribute should be qualified or unqualified, using the form attribute on xsd:element and xsd:attribute or the elementFormDefault and attributeFormDefault attributes on xsd:schema, as shown here:
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.org/publishing"
        xmlns:tns="http://example.org/publishing"
        elementFormDefault="qualified"
        attributeFormDefault="qualified">
      ...
    </xsd:schema>
With this schema in place, the following instance is considered valid (and the instance shown above is considered invalid):
    <x:author xmlns:x="http://example.org/publishing"
        x:id="333-33-3333">
      <x:name>...</x:name>
      <x:phone>...</x:phone>
    </x:author>
For the most part, it does not matter which namespace style you use for local elements as long as your instances agree with the schema.
You can also refer to global element and attribute declarations from within a complex type using the ref attribute, as shown here:
    ...
    <xsd:attribute name="id" type="..."/>
    <xsd:element name="name" type="..."/>
    <xsd:complexType name="AuthorType">
      <xsd:sequence>
        <xsd:element ref="tns:name"/>
        <xsd:element name="phone" type="..."/>
      </xsd:sequence>
      <xsd:attribute ref="tns:id"/>
    </xsd:complexType>
    ...
Since id and name are global, they always need to be qualified in instance documents. Using ref simply states that the global declaration may also be used within AuthorType; it does not change the fact that it needs to be qualified. The phone element is still defined locally, which means it may or may not need to be qualified in an instance, depending on the form in use. So, assuming elementFormDefault="unqualified", a valid instance looks like this:
    <x:author xmlns:x="http://example.org/publishing"
        x:id="333-33-3333">
      <x:name>...</x:name>
      <phone>...</phone>
    </x:author>
Here is a slightly more complex example that uses nested complex types, other compositors, and repeating particles:
    <xsd:complexType name="AddressType">
      <xsd:all>
        ...
      </xsd:all>
    </xsd:complexType>
    <xsd:complexType name="PublicationsListType">
      <xsd:choice ...>
        ...
      </xsd:choice>
    </xsd:complexType>
    <xsd:complexType name="AuthorType">
      <xsd:sequence>
        <xsd:choice>
          ...
        </xsd:choice>
        <xsd:element name="address" type="tns:AddressType"/>
        <xsd:element name="phone" type="..."
            minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="recentPublications"
            type="tns:PublicationsListType"/>
      </xsd:sequence>
      <xsd:attribute name="id" type="..."/>
    </xsd:complexType>
    ...
In this example, AuthorType contains a sequence made up of another compositor, a choice, followed by three element declarations.
Some of the elements are of other user-defined complex types, which effectively defines nested structures within the type. The choice means that exactly one of its alternatives (the name element or the other element it lists) may appear at that position. Finally, the all compositor in AddressType indicates that the order of its elements is insignificant.
Also notice that the phone element declaration specifies occurrence constraints using the minOccurs and maxOccurs attributes. Occurrence constraints can be applied to any particle in a complex type. The default value of each is 1, which means the given particle must appear exactly once at the specified location. Specifying minOccurs="0" makes the particle optional, and specifying maxOccurs="unbounded" lets the particle repeat an unlimited number of times. You can also specify whatever limits you like, for example minOccurs="3" maxOccurs="77". Occurrence constraints used on a compositor apply to the entire group (notice that PublicationsListType applies an occurrence constraint to a choice). The following example shows a valid instance of the new AuthorType:
    <x:author xmlns:x="http://example.org/publishing"
        id="333-33-3333">
      ...
      <address>
        ...
      </address>
      ...
      <recentPublications>
        ...
      </recentPublications>
    </x:author>
By default, complex types have closed content models. This means that only the specified particles are allowed to appear in instances. However, XML Schema lets you define an open content model using what are known as wildcards. Using xsd:any within a complex type means that any element may appear at that location, which effectively makes it a placeholder for content you cannot anticipate ahead of time. You can also use xsd:anyAttribute to define a placeholder for attributes.
    <xsd:complexType name="AuthorType">
      <xsd:sequence>
        ...
        <xsd:any ... />
      </xsd:sequence>
      ...
      <xsd:anyAttribute ... />
    </xsd:complexType>
A valid instance of the author element defined above looks like this:
    <x:author xmlns:x="http://example.org/publishing"
        xmlns:aw="http://www.aw.com/legal/contracts"
        aw:auid="01-3424383">
      ...
      <aw:contract>
        ...
      </aw:contract>
      ...
    </x:author>
When using wildcards, you can also constrain the namespace that the content must come from. xsd:any and xsd:anyAttribute come with an optional namespace attribute that may contain any of the values shown in Table 4. This makes it possible to be very specific about what may replace the wildcard.
    Attribute value      Allowed elements
    ##any                Elements from any namespace
    ##other              Elements from any namespace other than the targetNamespace
    ##targetNamespace    Elements from the targetNamespace
    ##local              Unqualified elements (no namespace)
    list of strings      Elements from any of the listed namespaces
Table 4. Wildcard namespace attribute
When using wildcards, you can also specify how the schema processor should treat the wildcard content during validation. xsd:any and xsd:anyAttribute come with a processContents attribute that takes one of three values: lax, strict, or skip. This value tells the processor whether to perform schema validation on the content that replaces the wildcard. strict indicates that the content must be validated, lax indicates that the processor should validate the content when schema information is available, and skip indicates that the processor should not perform schema validation.
Let us look at an example that uses these attributes. The schema for SOAP 1.1 actually uses wildcards and these two attributes to define the soap:Header and soap:Body elements:
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/"
        targetNamespace="http://schemas.xmlsoap.org/soap/envelope/">
      ...
      <xs:element name="Header" type="tns:Header"/>
      <xs:complexType name="Header">
        <xs:sequence>
          <xs:any namespace="##any" minOccurs="0"
              maxOccurs="unbounded" processContents="lax"/>
        </xs:sequence>
        <xs:anyAttribute namespace="##other"
            processContents="lax"/>
      </xs:complexType>
      <xs:element name="Body" type="tns:Body"/>
      <xs:complexType name="Body">
        <xs:sequence>
          <xs:any namespace="##any" minOccurs="0"
              maxOccurs="unbounded" processContents="lax"/>
        </xs:sequence>
        <xs:anyAttribute namespace="##any"
            processContents="lax"/>
      </xs:complexType>
      ...
    </xs:schema>
According to this schema, soap:Header can contain zero or more elements from any namespace, as well as any number of attributes from namespaces other than the targetNamespace, and soap:Body can contain zero or more elements from any namespace, as well as any number of attributes from any namespace.
In both cases, validation is lax, which means it should only be performed when schema information is available at runtime. Since it is impossible to know ahead of time what will be placed in the soap:Header or soap:Body elements, wildcards provide a way to define a flexible, open framework for XML messaging.
Locating and Managing Schemas
One of the common questions at this point is how XML Schema processors locate the schema definitions needed for a given instance document. XML Schema processors key off the namespace of the instance document to locate the appropriate schema, but the XML Schema specification does not dictate exactly how processors should do this. Most processors let you load a schema cache with all of the schemas you are going to need. Then, at runtime, you simply point the processor at the schema cache so that it can efficiently find the schema needed for a particular instance.
XML Schema also defines a way to provide schema location hints in the instance document. This is done through the xsi:schemaLocation attribute, as shown here:
    <... xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://example.org/publishing pubs.xsd">
The xsi:schemaLocation attribute lets you provide a whitespace-delimited list of pairs, each made up of a namespace name and a URI location that indicates where to find a specific schema file. However, this is only a hint; processors may not actually look there if they have a more efficient retrieval mechanism.
Conclusion
XML Schema provides an expressive type system for XML that makes a number of powerful services possible. We have covered the basics of XML Schema definitions, including both simple and complex type definitions. Simple type definitions let you define custom value spaces for text-only elements and attributes.
Complex type definitions, on the other hand, let you arrange simple types into structures.
There is much more to XML Schema than we have been able to cover here. For example, complex type definitions support derivation by extension and by restriction, which lets you define complex type hierarchies that map well to OO class hierarchies. Once you have complex type hierarchies in place, you can use substitution techniques in instance documents. XML Schema also makes it possible to spread schema definitions across multiple files and namespaces, which are then included and/or imported to increase reuse and simplify maintenance. These more advanced topics will be covered in a future article on XML Schema design.
For more information on XML Schema, see the electronic version of Essential XML Quick Reference (available online as a free download); its XML Schema chapter contains concise descriptions and examples of each construct.
References
XML Schema Part 0: Primer
XML Schema Part 1: Structures
XML Schema Part 2: Datatypes
Essential XML Quick Reference