Validation
Validation compares the types of elements in an XML document against a Document Type Definition (DTD) or XML Schema. For example, the DTD may say that all "Customer" elements must contain a child "Name" element. Take a look at the DTD for Hamlet.xml (Hamletdtd.htm) and the XML Schema for Hamlet.xml (Hamletschema.htm).
Validation is another huge area for performance analysis, but I only have time for a brief mention today. Validation is expensive for several reasons. First, it involves loading a separate file (the DTD or XML Schema) and compiling it. Second, it requires state machinery for performing the validation itself. Third, when the schema also includes information about data types, those data types also have to be validated. For example, if an XML element or attribute is typed as an integer, its text has to be parsed to see whether it is a valid integer.
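To make the third cost concrete, here is a minimal sketch of the kind of check a datatype-aware validator has to run on every typed value. The helper name isValidInteger is hypothetical and is not part of MSXML; it only assumes that an integer is an optional sign followed by decimal digits, with surrounding white space ignored.

```javascript
// Hypothetical helper (not an MSXML API): returns true when the text is a
// lexically valid integer, that is, an optional sign followed by digits.
function isValidInteger(text) {
    // Ignore leading and trailing white space before checking the digits.
    var trimmed = text.replace(/^\s+|\s+$/g, "");
    return /^[+-]?[0-9]+$/.test(trimmed);
}

// Every typed element or attribute in the document pays for a check like this.
var samples = ["42", "-7", " 19 ", "4.2", "abc", ""];
for (var i = 0; i < samples.length; i++) {
    console.log(JSON.stringify(samples[i]) + " -> " + isValidInteger(samples[i]));
}
```

The check itself is cheap, but it runs once per typed value, which is why it shows up in load times for large documents.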
The following table shows the difference between loading with no validation, with DTD validation, and with XML Schema validation.
Sample          Load (milliseconds)   DTD (milliseconds)   Schema (milliseconds)   Schema plus datatypes (milliseconds)
Ado.xml                         662                2,230                   2,167                                  3,064
Hamlet.xml                      106                  215                     220                                    N/A
Ot.xml                        1,069                2,168                   2,193                                    N/A
Northwind.xml                    64                  123                     127                                    N/A
The bottom line is to expect validation to double or triple the time it takes to load your documents. New to the MSXML January 2000 Web Release is a SchemaCollection object, which allows you to load the XML Schema once and then share it across your documents for validation. This will be discussed in a future article.
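The idea behind loading a schema once and sharing it can be sketched without MSXML. The code below is not the SchemaCollection API. The names compileSchema, SchemaCache, and validate are hypothetical, and the "schema" is reduced to a list of required child names (echoing the Customer and Name example earlier) purely for illustration.

```javascript
// Hypothetical stand-in for the expensive "load and compile a schema" step.
var compileCount = 0;
function compileSchema(requiredChildren) {
    compileCount++;
    return { requiredChildren: requiredChildren };
}

// Hypothetical cache: compiles each named schema at most once.
function SchemaCache() {
    this.compiled = {};
}
SchemaCache.prototype.get = function (name, requiredChildren) {
    if (!this.compiled.hasOwnProperty(name)) {
        this.compiled[name] = compileSchema(requiredChildren);
    }
    return this.compiled[name];
};

// Checks that a (simplified) element has every required child.
function validate(element, schema) {
    for (var i = 0; i < schema.requiredChildren.length; i++) {
        if (element.children.indexOf(schema.requiredChildren[i]) < 0) {
            return false;
        }
    }
    return true;
}

var cache = new SchemaCache();
var documents = [
    { name: "Customer", children: ["Name", "Address"] },
    { name: "Customer", children: ["Address"] },
    { name: "Customer", children: ["Name"] }
];
var results = [];
for (var d = 0; d < documents.length; d++) {
    // The same compiled schema is reused for every document.
    var schema = cache.get("customer", ["Name"]);
    results.push(validate(documents[d], schema));
}
console.log(results, "compiled " + compileCount + " time(s)");
```

Three documents are validated, but the compile step runs only once. That is the saving a shared schema object is meant to provide.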
XSL
XSL can be a big performance win over using DOM code for generating "transformed" reports from an XML document. For example, suppose you wanted to print out all the speeches by Hamlet in the sample Hamlet.xml. You might use selectNodes to find all the speeches by Hamlet, then use another selectNodes call to iterate through the lines of those speeches, as follows:
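function Method1(doc)
{
    // Element names and speaker values assume the uppercase tags used in Hamlet.xml.
    var speeches = doc.selectNodes("/PLAY/ACT/SCENE/SPEECH[SPEAKER='HAMLET']");
    var s = speeches.nextNode();
    var out = "";
    while (s)
    {
        // Collect the text of every line in this speech.
        var lines = s.selectNodes("LINE");
        var line = lines.nextNode();
        while (line)
        {
            out += line.text;
            line = lines.nextNode();
        }
        // Separator between speeches. The literal is truncated in the source; "<hr>" is an assumption.
        out += "<hr>";
        s = speeches.nextNode();
    }
    return out;
}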
This works, but it takes about 1,500 milliseconds. A better way to tackle this problem is to use XSL. The following XSL style sheet (or template) does exactly the same thing:
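<!-- Only the three closing tags survive in the source. The opening tags and inner content are reconstructed to mirror Method1. -->
<xsl:template xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    <xsl:for-each select="PLAY/ACT/SCENE/SPEECH[SPEAKER='HAMLET']">
        <xsl:for-each select="LINE">
            <xsl:value-of/>
        </xsl:for-each>
        <hr/>
    </xsl:for-each>
</xsl:template>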
You can then write the following simpler script code that uses this template:
function Method2(doc)
{
    // Load the style sheet synchronously so it is ready before the transform.
    var xsl = new ActiveXObject("Microsoft.XMLDOM");
    xsl.async = false;
    xsl.load("Hamlet.xsl");
    return doc.transformNode(xsl);
}
This takes only 203 milliseconds, which is more than seven times faster. This is a rather compelling reason to use XSL. In addition, it is easier to update the XSL template than it is to rewrite your code every time you want to get a different report.
The problem is that XSL is very powerful. You have a lot of rope with which to hang yourself, so to speak. XSL has a rich expression language that can be used to walk all over the document in any order. It is highly recursive, and the MSXML parser includes script support for added extensibility. Using all these features with reckless abandon will result in slow XSL style sheets. The following sections describe a few specific traps to watch out for.
Scripting
It is convenient to call script from within an XSL style sheet, and it is a great extensibility mechanism. But as always, there is a catch. Script code is slow. For purposes of illustration, imagine that we wrote the following style sheet instead of the one shown previously:
</xsl:for-each>
...
This style sheet takes 516 milliseconds, more than twice as long as the 203 milliseconds of the version without script. So use script code in XSL sparingly.
The Dreaded "//" Operator
Watch out for the "//" operator. This little operator walks the entire subtree looking for matches. Developers use it more than they should just because they are too lazy to type in the full path. (I catch myself using it all the time, too.) For example, try switching the select statement in the previous example to one that starts with "//" instead of the full path.
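To see why walking the entire subtree costs more, here is a minimal sketch that counts how many nodes each strategy examines. It does not use MSXML. The tiny tree, the node helper, and the two search functions are hypothetical stand-ins, and a real XSL processor may optimize differently.

```javascript
// Tiny in-memory tree standing in for a parsed document (hypothetical data).
function node(name, children) {
    return { name: name, children: children || [] };
}

function act() {
    return node("ACT", [
        node("TITLE"),
        node("SCENE", [
            node("TITLE"),
            node("SPEECH", [node("SPEAKER"), node("LINE")])
        ])
    ]);
}

var doc = node("PLAY", [node("TITLE"), act(), act()]);

var visited = 0;

// Like "//name": examine every node below the root.
function findDescendants(root, name, found) {
    for (var i = 0; i < root.children.length; i++) {
        var child = root.children[i];
        visited++;
        if (child.name === name) {
            found.push(child);
        }
        findDescendants(child, name, found);
    }
    return found;
}

// Like "a/b/c": at each step, examine only the children of the current matches.
function findByPath(root, names) {
    var current = [root];
    for (var step = 0; step < names.length; step++) {
        var next = [];
        for (var i = 0; i < current.length; i++) {
            var kids = current[i].children;
            for (var j = 0; j < kids.length; j++) {
                visited++;
                if (kids[j].name === names[step]) {
                    next.push(kids[j]);
                }
            }
        }
        current = next;
    }
    return current;
}

visited = 0;
var slow = findDescendants(doc, "SPEECH", []);
var slowVisits = visited;

visited = 0;
var fast = findByPath(doc, ["ACT", "SCENE", "SPEECH"]);
var fastVisits = visited;

console.log("descendant search: " + slow.length + " matches, " + slowVisits + " nodes examined");
console.log("full path search:  " + fast.length + " matches, " + fastVisits + " nodes examined");
```

Both searches find the same speeches, but the descendant search also descends into every speech. In a real play, with many lines per speech, that extra work grows with the size of the document.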
If there is anything you can do to prune the search tree, do it. For example, suppose you want to find all the speeches by Bernardo in Hamlet.xml, and all of them happen to be in Act I. If you already know this, you can skip searching the remaining acts. The following is the new select statement:
    select="/PLAY/ACT[TITLE='ACT I']/SCENE/SPEECH[SPEAKER='BERNARDO']"
This chops the time down from 141 milliseconds to 125 milliseconds, a healthy 11 percent improvement.
Cross-Threading Models
Before, the transformNode and transformNodeToObject methods required that the threading model of the style sheet and that of the document being transformed be the same. In the MSXML January 2000 Web Release, you can use free-threaded style sheets on rental documents and vice versa. This means you can get the performance benefit of using rental documents at the same time as the performance win of sharing free-threaded style sheets across threads.
Conclusion