Translation: XML 1.1 Candidate Recommendation, Unicode Simplified Chinese version (http://xml1p1.w3china.org)
Original: XML 1.1 Candidate Recommendation (http://www.w3.org/tr/2002/CR-XML11-20021015)
Notes:
· This document is a translation of the XML 1.1 Candidate Recommendation released on October 15, 2002.
· The original English document is the only official version.
· Although the translator has made every effort to translate accurately, errors are inevitable. Corrections by letter are welcome.
· Any inappropriate content in the translator's notes represents only the translator's personal point of view.
· The copyright notice of the original reads: Copyright © 2002 W3C ® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
Translator: Collin HSU
Date: first released on October 28, 2003; last updated on October 28, 2003
Extensible Markup Language (XML) 1.1
W3C Candidate Recommendation, October 15, 2002
This version:
http://www.w3.org/tr/2002/CR-XML11-20021015
Latest version:
http://www.w3.org/tr/xml11/
Previous version:
http://www.w3.org/tr/2002/wd-xml11-20020425/
Editor:
John Cowan, Reuters
Abstract
This document is a product of the XML Core Working Group. It specifies XML 1.1, formerly known as XML Blueberry, based on the XML Blueberry Requirements. The document is expressed as a series of changes to the XML 1.0 Recommendation [XML1.0]. Section numbers in this document correspond to section numbers in the XML 1.0 Recommendation. Sections of the XML 1.0 Recommendation that do not appear in this document are unchanged.
Status of this Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This document is a W3C Candidate Recommendation of XML 1.1. This status of a Technical Report [Translator's note: a Technical Report is an official document published by the W3C] means that the document is considered stable and that the developer community is encouraged to implement it. For a description of the Candidate Recommendation status, see Section 5.3.2 of the Process Document. Once the XML Core Working Group (part of the XML Activity; see the summary) has confirmed that at least two interoperable implementations exist, it will ask the W3C Director to advance this specification to Proposed Recommendation. The two implementations must come from different organizations.
The current implementation report records the implementation feedback received so far.
Publication as a Candidate Recommendation does not imply endorsement by the W3C membership. This is a draft document that may be updated, replaced, or obsoleted by other documents. It should be cited only as "work in progress".
This document and its successors are written as changes to the XML 1.0 Recommendation, for ease of editing and review. The final XML 1.1 Recommendation may take the form of the XML 1.0 Recommendation with the necessary changes applied.
Information about intellectual property relevant to this document can be found on the Working Group's public IPR disclosure page.
Comments on this document are explicitly requested. The review period for this Candidate Recommendation ends at 23:59 on February 14, 2003. Please send comments to WWW-XML-BLUEBERRY-COMMENTS@w3.org; this is the preferred way to provide feedback. Public comments and replies are archived at http://lists.w3.org/archives/public/www-xml-blueberry-Comments/.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/tr.
Table of Contents
Introduction
2.2 Characters
2.3 Common Syntactic Constructs
2.8 Prolog and Document Type Declaration
2.11 End-of-Line Handling
2.13 Normalization Checking [New]
4.1 Character and Entity References
4.3.4 Version Information in Entities [New]
Appendix A References
Appendix B Character Classes [Replaced]
Introduction
The W3C's XML 1.0 Recommendation was first issued in 1998. Although many errata have been issued, culminating in XML 1.0 (Second Edition), XML 1.0 has (by intention) remained unchanged with respect to what is well-formed XML and what is not. This stability has been extremely useful for interoperability. However, the Unicode Standard, on which XML 1.0 relies for its character specifications, has not remained static: it has evolved from version 2.0 to version 3.1 and beyond. Characters that were not present in Unicode 2.0 may already be used in XML 1.0 character data. However, they are not allowed in XML names (such as element type names, attribute names, enumerated attribute values, PI targets, and so on). In addition, some characters that should have been permitted in XML names were not, because of oversights and inconsistencies in Unicode 2.0.
The overall philosophy of names has changed since XML 1.0. XML 1.0 gave a rigid definition of names, in which everything that was not permitted was forbidden. XML 1.1 names are designed so that everything that is not forbidden (for a specific reason) is permitted. Since Unicode will continue to grow past version 3.1, XML 1.1 avoids further changes to XML by allowing almost all characters, including those not yet assigned, in names.
In addition, XML 1.0 attempts to adapt to the line-end conventions of various modern operating systems, but it does not take into account the conventions used on IBM and IBM-compatible mainframes. As a result, XML documents on mainframes are not plain text files according to the local conventions. XML 1.0 documents on mainframes must either violate the local line-end conventions or go through otherwise unnecessary translation steps before parsing and after generation. Straightforward interoperability is particularly important when data is shared between mainframe and non-mainframe systems (as opposed to being copied from one machine to another). Therefore, XML 1.1 adds NEL (#x85) to the list of line-end characters. For completeness, XML 1.1 also supports the Unicode line separator character, #x2028.
Finally, there is considerable demand for a standard representation of arbitrary Unicode characters in XML documents. Therefore, XML 1.1 allows character references to the control characters #x1 through #x1F, most of which are forbidden in XML 1.0. For reasons of robustness, these characters still cannot be used directly in documents. To improve the robustness of character encoding detection, the additional control characters #x7F through #x9F, which were freely allowed in XML 1.0 documents, must now also appear only as character references (whitespace characters excepted). The resulting small sacrifice of backward compatibility is not considered significant. Because of potential problems with APIs, #x0 is still forbidden, both directly and as a character reference.
These changes are made by creating a new version of XML, rather than by issuing a set of errata to XML 1.0, because they affect the definition of a well-formed XML document. XML 1.0 processors must continue to reject documents that contain new characters in XML names, use the new line-end conventions, or contain references to control characters. XML 1.0 documents and XML 1.1 documents are distinguished by the version number information in the XML declaration.
2.2 Characters
Change production [2] to:
[2] Char ::= #x9 | #xA | #xD | [#x20-#x7E] | #x85 | [#xA0-#xD7FF]
           | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Change the comment on production [2] to:
Any Unicode character, excluding most ISO control characters, the surrogate blocks [Translator's note: "surrogate blocks" is a Unicode term; see "Unicode Surrogates" by Tim Bray], FFFE, and FFFF.
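As an illustration only, production [2] can be expressed as a table of code point ranges. The sketch below is not part of the specification; the names XML11_CHAR_RANGES and is_xml11_char are hypothetical, and the upper bounds of the last two ranges are taken to be #xFFFD and #x10FFFF, consistent with the comment that excludes FFFE and FFFF.

```python
# Code point ranges of production [2] (Char) in XML 1.1.
# Assumption: the last two ranges end at #xFFFD and #x10FFFF.
XML11_CHAR_RANGES = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0x7E), (0x85, 0x85), (0xA0, 0xD7FF),
    (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]


def is_xml11_char(ch: str) -> bool:
    """Return True if the single character ch may appear directly in a document."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML11_CHAR_RANGES)
```

With this table, is_xml11_char("\x85") is true (NEL is allowed directly), while is_xml11_char("\x7f") and is_xml11_char("\x01") are false, since those control characters may only appear as character references.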
2.3 Common Syntactic Constructs
Change production [4] and add the new production [4a]:
[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] |
                      [#xC0-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
                      [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
                      [#x3001-#xD7FF] | [#xF900-#xEFFFF]
[4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 |
                  [#x0300-#x036F] | [#x203F-#x2040]
Change production [5] to:
[5] Name ::= NameStartChar NameChar*
Insert the following three paragraphs after production [5]:
The first character of a Name must be a NameStartChar, and any other characters must be NameChars; this mechanism prevents names from beginning with Latin (ASCII) digits or with basic combining characters [Translator's note: "combining character" is a Unicode term]. Almost all characters are permitted in names, except those that are, or reasonably could be, used as delimiters. The intention is to be inclusive rather than exclusive, so that writing systems [Translator's note: a Unicode term] not yet encoded in Unicode can also be used in XML names. For suggestions on the creation of names, see Appendix B.
Document authors are encouraged to use meaningful words from natural languages as XML names, and to avoid symbolic characters or white space characters in names. Note that the colon (:), hyphen (-), underscore (_), and middle dot (·) are explicitly permitted.
The ASCII symbols and punctuation marks, along with a large group of Unicode symbol characters, are excluded from XML names because they are useful as delimiters in contexts where XML names are used outside XML documents. The character #x037E (GREEK QUESTION MARK) is excluded because it becomes a semicolon (;) when normalized, which could change the meaning of entity references.
Change production [7] to:
[7] Nmtoken ::= (NameChar)+
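The name productions can be sketched as range tables in the same way. This is a minimal illustration, not a conforming parser; all identifiers are hypothetical. It assumes that the last NameStartChar range is [#xF900-#xEFFFF], that [4] includes both [A-Z] and [a-z], and that an Nmtoken is one or more NameChars.

```python
# Ranges of production [4] (NameStartChar); assumptions are stated in the lead-in.
NAME_START_RANGES = [
    (0x3A, 0x3A),  # ":"
    (0x41, 0x5A),  # A-Z
    (0x5F, 0x5F),  # "_"
    (0x61, 0x7A),  # a-z
    (0xC0, 0x2FF), (0x370, 0x37D), (0x37F, 0x1FFF),
    (0x200C, 0x200D), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xEFFFF),
]
# Additional ranges that production [4a] (NameChar) allows after the first character.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),  # "-" and "."
    (0x30, 0x39),  # 0-9
    (0xB7, 0xB7),  # middle dot
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(cp: int, ranges) -> bool:
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_char(ch: str) -> bool:
    return _in_ranges(ord(ch), NAME_START_RANGES)


def is_name_char(ch: str) -> bool:
    return is_name_start_char(ch) or _in_ranges(ord(ch), NAME_EXTRA_RANGES)


def is_name(s: str) -> bool:
    """Production [5]: one NameStartChar followed by zero or more NameChars."""
    return bool(s) and is_name_start_char(s[0]) and all(is_name_char(c) for c in s[1:])


def is_nmtoken(s: str) -> bool:
    """Production [7]: one or more NameChars."""
    return bool(s) and all(is_name_char(c) for c in s)
```

For example, is_name("xml:lang") is true and is_name("1abc") is false, while is_nmtoken("1abc") is true because an Nmtoken may begin with a digit. A name containing #x037E is rejected by both checks.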
2.8 Prolog and Document Type Declaration
Change all "1.0" to "1.1".
Add the following paragraph:
An XML 1.1 processor should also accept XML 1.0 documents. If a document is a well-formed or valid XML 1.0 document and does not directly contain characters in the range [#x7F-#x9F] (other than as character escapes), changing its XML version number to "1.1" makes it a well-formed or valid XML 1.1 document, respectively.
2.11 End-of-Line Handling
Replace the second paragraph with the following text:
To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing. The normalization is done by translating each of the following to the single character #xA:
· the two-character sequence #xD #xA
· the two-character sequence #xD #x85
· the single character #x85
· the single character #x2028
· any character #xD that is not immediately followed by #xA or #x85.
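The five replacement rules listed above can be sketched with a single regular expression. This is an illustration under the assumption that the entity has already been decoded into a Python string; the function name normalize_line_ends is hypothetical. The two-character sequences are listed first in the alternation so that they are consumed as a unit before a lone #xD can match.

```python
import re

# Order matters: try the two-character sequences before the single characters.
_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends(text: str) -> str:
    """Translate every XML 1.1 line-end sequence to a single #xA."""
    return _LINE_END.sub("\n", text)
```

Each of the five forms collapses to one line feed, so a #xD #xA pair yields one #xA rather than two.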
2.13 Normalization Checking [New]
All XML parsed entities (including the document entity) should be fully normalized, according to the definition in [CharMod] supplemented by the following definitions of the relevant constructs for XML [Translator's note: CharMod defines three levels of normalization: Unicode-normalized, include-normalized, and fully normalized]:
· The replacement text of all parsed entities
· All text matching, in context, one of the following productions:
o CData
o CharData
o content
o Name
o Nmtoken
However, a document is still well-formed even if it is not fully normalized. XML processors should provide a user option to verify that the document being processed is in fully normalized form, and should report the result of the verification to the application. The option not to verify should be chosen only when the input text is certified [Translator's note: "certified text" is a term defined in CharMod], as provided in [CharMod].
The verification of full normalization must be carried out as if by first verifying that the entity is in include-normalized form (see [CharMod]), and then verifying that none of the relevant constructs listed above begins, after character references are expanded, with a composing character (see [CharMod]). Non-validating processors must ignore possible denormalizations that would be caused by the inclusion of external entities that they do not read.
Note: The composing characters [Translator's note: a term defined in CharMod] are all characters of non-zero combining class [Translator's note: Unicode assigns a number to each combining class to tell the classes apart; the numbers are identifiers only and carry no meaning], plus a small number of class-zero characters (those that appear as a non-initial character in certain canonical decompositions). Since composing characters are meant to follow base characters [Translator's note: a Unicode term], forbidding the relevant constructs (including content) from beginning with a composing character does not meaningfully reduce the expressive power of XML.
If, while verifying full normalization, an XML processor encounters characters whose normalization properties it cannot determine (that is, characters introduced in a version of [Unicode] released after the version considered when the processor was implemented), then the processor may, at the user's option, ignore any denormalizations that these characters might cause. Applications should not choose to ignore these denormalizations.
XML processors must not transform the input into fully normalized form. XML applications that create XML 1.1 output, whether from XML 1.0 or XML 1.1 input, should ensure that the output is fully normalized; internal processing forms need not be fully normalized.
The purpose of this section is to strongly encourage XML processors to ensure that the creators of XML documents have fully normalized them, so that XML applications can compare strings without having to worry about the different possible spellings of a string that Unicode allows.
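The following sketch approximates the check described in this section; it is not the [CharMod] algorithm. It assumes that Unicode normalization means Normalization Form C, and it treats only characters of non-zero combining class as composing characters, ignoring the small class-zero group mentioned in the note. It also uses whatever Unicode version the Python runtime ships with. The function names are hypothetical. In keeping with this section, the sketch only reports a result and never rewrites its input.

```python
import unicodedata


def begins_with_composing_char(text: str) -> bool:
    """Approximation: a non-zero combining class marks a composing character."""
    return bool(text) and unicodedata.combining(text[0]) != 0


def looks_fully_normalized(text: str) -> bool:
    """Report whether a construct is in NFC and does not start with a composing character."""
    in_nfc = unicodedata.normalize("NFC", text) == text
    return in_nfc and not begins_with_composing_char(text)
```

A processor following this section would call such a check on each relevant construct after expanding character references, and pass the result to the application.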
4.1 Character and Entity References
Replace the Well-formedness Constraint: Legal Character with the following text:
Characters referred to using character references must match the production for Char, or else be an ISO control character in the range [#x1-#x1F] or [#x7F-#x9F].
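As a hedged illustration of this constraint, the sketch below decides whether a single character reference is legal in XML 1.1. The function name is_legal_char_ref is hypothetical, and the Char ranges are written out with upper bounds #xFFFD and #x10FFFF, which is an assumption about the intended production [2].

```python
import re

# Matches decimal (&#65;) and hexadecimal (&#x41;) character references.
_CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def is_legal_char_ref(ref: str) -> bool:
    """Return True if ref is a character reference to a code point XML 1.1 allows."""
    m = _CHAR_REF.fullmatch(ref)
    if m is None:
        return False
    cp = int(m.group(1), 16) if m.group(1) is not None else int(m.group(2))
    # ISO control characters that may appear only as character references.
    if 0x1 <= cp <= 0x1F or 0x7F <= cp <= 0x9F:
        return True
    # Remaining ranges of production [2] (Char).
    return (0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)
```

Under these assumptions a reference to #x1 is accepted, while references to #x0, to a surrogate such as #xD800, or to #xFFFE are rejected.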
4.3.4 Version Information in Entities [New]
Each entity (including the document entity) can be separately declared as XML 1.0 or XML 1.1. The version declaration in the document entity determines the version of the XML document as a whole. An XML 1.1 document may invoke XML 1.0 external entities, so that duplicate versions of external entities (particularly DTD external subsets) need not be maintained. In such a case, however, the rules of XML 1.1 are applied to the entire document.
If an entity (including the document entity) is not labeled with a version number, it is treated as if it were labeled as version 1.0.
Appendix A References
Add the following normative references:
[XML1.0]
Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler (editors), Extensible Markup Language (XML) 1.0 (Second Edition), 6 October 2000. (See http://www.w3.org/tr/rec-xml.)
[CharMod]
Martin J. Dürst, François Yergeau, Richard Ishida, Misha Wolf, Asmus Freytag, Tex Texin, Character Model for the World Wide Web, W3C Working Draft, 30 April 2002. (See http://www.w3.org/tr/Charmod/.)
Appendix B Suggestions for XML Names (Non-Normative)
Change the title of Appendix B from "(Normative)" to "(Non-Normative)" and replace its content with the following text:
The following suggestions describe what is believed to be best practice for creating XML names used as element names, attribute names, other XML names, and the values of attributes of type ID.
The first two suggestions are taken directly from the rules in the Unicode Standard, version 3.0, and exclude all control characters, enclosing nonspacing marks, non-decimal numbers, private-use characters, punctuation characters (with the exceptions noted below), symbol characters, unassigned code points, and white space characters. The other suggestions are mostly derived from Appendix B of XML 1.0.
· The first character of any name should be a character of class Ll, Lu, Lo, Lm, Lt, or Nl [Translator's note: these classes are defined by the Unicode General Category], or else the character '_' (#x5F).
· Characters other than the first should be characters of class Ll, Lu, Lo, Lm, Lt, Mc, Mn, Nl, Nd, Pc, or Cf, or else one of the following characters: '-' #x2D, '.' #x2E, ':' #x3A, or '·' #xB7 (middle dot). Since characters of class Cf are not directly visible, they should be used with caution and only when necessary, to avoid creating names that are distinct to XML processors but look identical to human readers.
· Names should not contain characters that have a canonical decomposition [Translator's note: a Unicode term] (including those in the ranges [#xF900-#xFAFF] and [#x2F800-#x2FFFD], with twelve exceptions).
· Names should not contain characters that have a compatibility decomposition [Translator's note: a Unicode term] (that is, those whose fifth field in the Unicode Character Database carries a compatibility formatting tag, indicated by "<" as the first character of the fifth field). This does not apply to #x0E33 THAI CHARACTER SARA AM or #x0EB3 LAO VOWEL SIGN AM, which are in regular use in those scripts despite their compatibility decompositions.
· Names should not contain combining characters meant for use with symbols only (including those in the ranges [#x20D0-#x20EF] and [#x1D165-#x1D1AD]).
· Names should not contain the interlinear annotation characters ([#xFFF9-#xFFFB]).
· Names should not contain variation selector characters.
· Names that are nonsensical, unpronounceable, hard to read, or easily confused with other names should not be used.
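The first two suggestions in the list above can be checked mechanically from Unicode general categories. The sketch below is illustrative only: the function name follows_first_two_suggestions is hypothetical, it covers only those two suggestions, and Python's unicodedata module reflects the Unicode version bundled with the interpreter rather than Unicode 3.0.

```python
import unicodedata

# General categories suggested for the first character of a name.
FIRST_CATEGORIES = {"Ll", "Lu", "Lo", "Lm", "Lt", "Nl"}
# General categories suggested for the remaining characters.
OTHER_CATEGORIES = FIRST_CATEGORIES | {"Mc", "Mn", "Nd", "Pc", "Cf"}
# Punctuation explicitly allowed after the first character.
OTHER_EXTRA_CHARS = {"-", ".", ":", "\u00b7"}


def follows_first_two_suggestions(name: str) -> bool:
    """Check a candidate name against the first two suggestions only."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    if first != "_" and unicodedata.category(first) not in FIRST_CATEGORIES:
        return False
    return all(
        ch in OTHER_EXTRA_CHARS or unicodedata.category(ch) in OTHER_CATEGORIES
        for ch in rest
    )
```

A name such as "chapter-1" passes, while "1st" fails on its first character and "a$b" fails because the dollar sign is a symbol character. Passing this check says nothing about the remaining suggestions, which concern decompositions and readability.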