This article describes how to canonicalize (normalize) an XML document. To keep the focus on the canonicalization rules themselves, I have left out some related topics (XML digital signatures, asymmetric key systems, and message digests).

Consider the following two documents (File 1 and File 2).

File 1 and File 2: [listings not recoverable; each begins with an XML declaration and has a rooms document element, and the two differ only in the order of the attributes on the room element]

You will probably say that the two files are the same. Indeed, they express the same information and use the same document structure, so they are logically equivalent. You may also have noticed a small difference between them: some content appears in a different order (highlighted in blue in the original). In this example, the attributes of the room element are ordered differently in the two files, so the corresponding byte streams differ as well. There are many other reasons why logically equivalent XML documents can have different byte representations.

The purpose of a canonical form for XML is to make it possible to determine whether different XML documents are logically equivalent. The W3C has defined a set of canonicalization rules; applying them to two logically equivalent documents yields the same document. To decide whether two XML documents are logically equivalent, first canonicalize each one and then convert it to a byte stream. If the byte streams are identical, the two documents are logically equivalent.

The canonicalization rules define how a canonical XML document is formed. The rest of this article takes one document (File 3) as an example and canonicalizes it step by step.
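The comparison procedure described above can be sketched in Python. This is a minimal illustration, not the W3C reference procedure: it assumes Python 3.8 or later, whose standard library function xml.etree.ElementTree.canonicalize implements C14N 2.0 (which may differ in some details from the rules walked through in this article). The two rooms documents and the helper name logically_equal are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical documents: same information, different attribute order.
doc_a = '<rooms><room name="Blue" floor="2"/></rooms>'
doc_b = '<rooms><room floor="2" name="Blue"/></rooms>'


def logically_equal(xml_a: str, xml_b: str) -> bool:
    """Compare two XML strings by the UTF-8 bytes of their canonical forms."""
    bytes_a = canonicalize(xml_a).encode("utf-8")
    bytes_b = canonicalize(xml_b).encode("utf-8")
    return bytes_a == bytes_b


print(doc_a == doc_b)                 # the raw text differs
print(logically_equal(doc_a, doc_b))  # the canonical byte streams are compared
```

Comparing bytes rather than strings mirrors the article's point that the final check is made on byte streams.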
File 3: [listing not recoverable; it begins with an XML declaration and an internal DTD, followed by a product element whose parts/part children each contain a comments element that references &testHistory;]

1 Encoding
An encoding maps characters to bytes in a particular way. The same content encoded in different ways obviously produces different byte streams. Canonical XML is encoded in UTF-8; a document that uses any other encoding must first be converted to UTF-8.

2 Line breaks
Text files generally mark line breaks with #xA, #xD (hexadecimal), or a combination of the two. An XML document is an ordinary text file, so it uses the same characters. Canonical XML requires every line break to be represented by #xA.

3 White space
Canonicalization requires white-space characters within attribute values, such as tabs, to be converted to spaces (#x20). File 4 is the converted file.
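The three rules above (UTF-8 output, #xA line breaks, tabs in attribute values turned into spaces) can be observed with a small sketch. It assumes Python 3.8 or later and the standard library function xml.etree.ElementTree.canonicalize; the input string is a hypothetical fragment, not the article's File 3. Note that the line-break and attribute-value normalization here is performed by the XML parser while reading the input.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: CRLF line breaks and a tab inside an attribute value.
raw = '<part id="s\t1753">\r\n<comments>ok</comments>\r\n</part>'

canonical_text = canonicalize(raw)

# Canonical XML is serialized as UTF-8 bytes.
canonical_bytes = canonical_text.encode("utf-8")

print(repr(canonical_text))
```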
Note: in File 3 there is a tab character between "s" and "1753".

File 4: [listing not recoverable; same structure as File 3, with the tab replaced by a space]

4 Attribute values in double quotes
In a canonical XML document, attribute values must be enclosed in double quotes. In File 4 (highlighted in red in the original), the value of the name attribute is enclosed in single quotes, which must be changed to double quotes. File 5 is the file after this change.
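As a small illustration of the quoting rule, the sketch below canonicalizes a hypothetical element whose attribute is written with single quotes. It assumes Python 3.8 or later and xml.etree.ElementTree.canonicalize from the standard library.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical element with a single-quoted attribute value.
single_quoted = "<part name='Rotating Disc'/>"

double_quoted = canonicalize(single_quoted)
print(double_quoted)
```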
File 5: [listing not recoverable; same structure as File 4, with the name attribute value enclosed in double quotes]

5 Special characters
File 5 has a problem in an attribute value (highlighted in red in the original): the value of the name attribute itself contains double quotes. The canonicalization rules require special characters in attribute values (such as double quotes) to be replaced by the corresponding escape sequence (for example, &quot; in place of a double quote).
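The escaping rule can be checked with a short sketch, again assuming Python 3.8 or later and xml.etree.ElementTree.canonicalize. The attribute value is a hypothetical one containing double quotes and an ampersand.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical attribute value containing double quotes and an ampersand.
source = """<part name='Rotating Disc "Energy" Meter &amp; Case'/>"""

escaped = canonicalize(source)
print(escaped)
```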
File 6: [listing not recoverable; same structure as File 5, with the double quotes inside the name attribute value escaped]

6 Entity references
File 6 contains a DTD declaration that defines an entity, testHistory, which is referenced inside the comments elements. A canonical document contains no entity references; each reference must be replaced by the entity's content. File 7 is the resulting document.
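Entity replacement can be sketched as follows. The document is a hypothetical, cut-down stand-in for File 6 (only the entity name and its text come from the article), and the sketch assumes Python 3.8 or later, where the standard library parser expands entities declared in the internal DTD subset.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical, reduced version of a document with an internal entity.
with_entity = """<?xml version="1.0"?>
<!DOCTYPE part [
  <!ENTITY testHistory "Part has been tested according to the specified standards.">
]>
<part><comments>&testHistory;</comments></part>"""

expanded = canonicalize(with_entity)
print(expanded)
```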
File 7: [listing not recoverable; same structure as File 6, with each &testHistory; reference replaced by the text "Part has been tested according to the specified standards."]

7 Default attributes
The DTD in File 7 defines a default attribute, approved (highlighted in red in the original), for the part element. In a canonical document, default attributes must appear explicitly among the attributes of the element. File 8 is the file after this step.
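A minimal sketch of the default-attribute rule, assuming Python 3.8 or later and xml.etree.ElementTree.canonicalize. The DTD below is hypothetical: the attribute name approved comes from the article, but its default value "yes" and the id attribute are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical DTD: "approved" defaults to "yes" (the value is an assumption).
with_default = """<!DOCTYPE part [
  <!ATTLIST part approved CDATA "yes">
]>
<part id="p1"/>"""

explicit = canonicalize(with_default)
print(explicit)
```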
File 8: [listing not recoverable; same structure as File 7, with the approved attribute written explicitly on the part elements]

9 XML and DTD declarations
A canonical XML document contains neither an XML declaration nor a DTD declaration. File 9 is the file with both removed.
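The removal of both declarations can be seen in a short sketch, assuming Python 3.8 or later and xml.etree.ElementTree.canonicalize. The document is hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical document with an XML declaration and a DTD.
with_declarations = """<?xml version="1.0"?>
<!DOCTYPE product [
  <!ELEMENT product ANY>
]>
<product>meter</product>"""

without_declarations = canonicalize(with_declarations)
print(without_declarations)
```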
File 9: [listing not recoverable; File 8 without the XML declaration and the DTD]

10 Document element
A canonical XML document has no white space outside the document element: the document begins with "<", and no white space may precede it. File 10 is the file with the white space before the first "<" removed.
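The following sketch shows white space outside the document element being dropped. It assumes Python 3.8 or later and xml.etree.ElementTree.canonicalize; the padded input is hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input with white space before and after the document element.
padded = '\n\n   <product id="p1"/>\n\n'

trimmed = canonicalize(padded)
print(repr(trimmed))
```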
File 10: [listing not recoverable; File 9 with the leading white space removed]

11 White space in start and end tags
1) There is no white space between "<" and the element name; the same holds for "</".
2) If the element has attributes, there is exactly one space between the element name and the first attribute.
3) There is no white space around the equals sign between an attribute name and its value.
4) There is exactly one space between an attribute value and the next attribute.
5) There is no white space before ">".

12 Empty elements
In a canonical XML document, an empty element appears as a start tag followed by an end tag (<emptyElement></emptyElement>) rather than in the short form (<emptyElement/>). File 11 is the file after this conversion.
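The tag-spacing rules and the empty-element rule can be observed together in one sketch. It assumes Python 3.8 or later and xml.etree.ElementTree.canonicalize; the loosely spaced input is hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: extra spaces inside tags and a short-form empty element.
loose = '<part   id = "p1"    name= "disc"  ><supplier /></part >'

tight = canonicalize(loose)
print(tight)
```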
File 11: [listing not recoverable; the product document from File 10 with the empty sup:supplier elements written as start-tag/end-tag pairs]

13 Namespace declarations
Canonical XML retains all namespace declarations except superfluous ones. The namespace declaration on the second part element in File 11 is redundant: it does not affect the namespace context of any node in the document, so it is removed.
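A redundant redeclaration of the kind described above can be sketched as follows. The element names and the urn:example:supplier URI are hypothetical. The sketch assumes Python 3.8 or later and xml.etree.ElementTree.canonicalize, which implements C14N 2.0; its treatment of namespace declarations may differ in other respects from the rule stated in this article, so take it only as an illustration of dropping a repeated declaration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: the child repeats a declaration already in scope.
redundant = (
    '<s:parts xmlns:s="urn:example:supplier">'
    '<s:part xmlns:s="urn:example:supplier"/>'
    '</s:parts>'
)

deduplicated = canonicalize(redundant)
print(deduplicated)
```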
File 12: [listing not recoverable; File 11 with the redundant namespace declaration removed]

14 Attribute ordering
Canonical XML requires the attributes of an element to be arranged in ascending order. Within an element, namespace declarations come first, followed by the attributes (name and value). File 13 is the file after sorting.

File 13
<product xmlns="http://www.myfictiouscompany.com/product" xmlns:sup="http://www.myfictitiouscompany.com/supplier" classification="Measuringinstruments/Electrical/Energy/" id="P 184.435" name="Rotating Disc &quot;Energy&quot; Meter"> <