Processing XML Data Easily in the .NET Framework


In the .NET Framework, the XmlTextReader and XmlTextWriter classes provide read and write operations for XML data. This article describes the architecture of XML readers and how they relate to XMLDOM and SAX parsers. It also shows how to use readers to parse and validate XML documents, how to create well-formed XML documents, and how to read and write large XML documents encoded as Base64 and BinHex. Finally, it explains how to build a stream-based read/write parser packaged in a single class.

About three years ago I attended a software conference whose theme was "No XML, no programming future." XML has indeed advanced step by step since then, and it is now embedded in the .NET Framework. In this article I explain the role and the internal features of the .NET Framework APIs for processing XML documents, and then I demonstrate some common tasks.

From MSXML to .NET XML

Before the .NET Framework appeared, you were used to writing XML-driven Windows applications with MSXML, a COM-based class library. Unlike the classes in the .NET Framework, parts of the MSXML library are more than an API: they are embedded deep in the operating system. MSXML can certainly communicate with your application, but it cannot truly integrate with the surrounding environment.

The MSXML library can be imported into Win32 code and can be used from the CLR, but only as an external server component. Applications based on the .NET Framework, by contrast, can combine the XML classes with the other namespaces of the framework, and the resulting code is easy to read.

As a standalone component, the MSXML parser offers some advanced features, such as asynchronous parsing. The XML classes in the .NET Framework do not provide this feature directly, but you can easily build the same functionality with them and then add more features on top.
The XML classes in the .NET Framework provide basic parsing, query, and transformation functionality for XML data. You will find classes that support XPath queries and XSLT transformations, as well as classes that read and write XML documents. The .NET Framework also includes other classes that work with XML, such as those for object serialization (the XmlSerializer and SoapFormatter classes), application configuration (the AppSettingsReader class), and data storage (the DataSet class). In this article I discuss only the classes that implement basic XML I/O operations.

XML Parsing Models

Because XML is a markup language, a tool is needed to parse and interpret the information stored in a document according to a given syntax. This tool is the XML parser, a component that reads the marked-up text and returns platform-specific objects. All XML parsers, whatever platform they run on, fall into one of two categories: tree-based or event-based processors. These two categories are usually implemented by XMLDOM (the Microsoft XML Document Object Model) and SAX (Simple API for XML). An XMLDOM parser is a generic tree-based API: it renders the XML document as an in-memory tree. A SAX parser is an event-based API: it processes each element as it appears in the XML data stream.

Typically, a DOM can be loaded and populated by a SAX stream, so the two kinds of processing are not mutually exclusive. Overall, though, the SAX parser is the opposite of the XMLDOM parser, and their parsing models differ greatly. XMLDOM is well defined in its set of functionality, and you cannot extend it. When it handles a large document, it needs a lot of memory to hold the whole structure.

A SAX parser lets the client application handle parsing events through instances of platform-specific handler objects. The SAX parser controls the whole process and pushes data to the handler, which in turn accepts or rejects it. The advantage of this model is that it needs very little memory.

The .NET Framework fully supports the XMLDOM model, but it does not support SAX. Why? Because the .NET Framework supports two different parsing models: XMLDOM parsers and XML readers. It plainly does not provide a SAX parser, but that does not mean it lacks SAX-like functionality. Everything SAX does can be implemented easily, and more efficiently, with an XML reader. Unlike a SAX parser, a .NET Framework reader works entirely under the control of the client application. The application can therefore pull out only the data it really needs and skip over the rest of the XML stream. With SAX, the parser passes all the information to the application, whether it is useful or not.

Readers are based on .NET Framework streams and work much like a database cursor. Interestingly, the classes that implement this cursor-like parsing model also provide the underlying support for the XMLDOM parser in the .NET Framework. The two abstract classes XmlReader and XmlWriter are the foundation of all the XML classes in the .NET Framework, including the XMLDOM classes, the ADO.NET classes, and the configuration classes. So you have two ways to process XML data in the .NET Framework.
You can use the XmlReader and XmlWriter classes to process XML data directly, or you can use the XMLDOM model. For more about reading documents in the .NET Framework, see the Cutting Edge column in the August 2002 issue of MSDN Magazine.

The XmlReader Class

An XML reader supports a programming interface that lets you connect to an XML document and pull out the data you want. If you look more closely at readers, you will find that they work much like a desktop application fetching data from a database. The database server returns a cursor object that contains the whole result set and a reference to the start of the target data set. Likewise, the client of an XML reader receives a reference to a reader instance. That instance abstracts the underlying data stream and presents the extracted data as an XML tree. Reader classes provide a read-only, forward-only cursor, and you use the methods of the reader class to scroll through the data in the result set.

Seen through a reader, an XML document is not a tagged text file but a serialized collection of nodes. This is a special cursor model in the .NET Framework; you will not find any other comparable API in the framework.

Readers differ from XMLDOM parsers in several ways. An XML reader is forward-only, has no notion of parent, child, ancestor, or sibling nodes, and is read-only. In the .NET Framework, reading and writing XML documents are two completely separate functions, handled by the XmlReader and XmlWriter classes. To edit an XML document, you can use the XMLDOM parser, or you can design a class of your own that combines the two functions. Let's start by analyzing the programming interface of the reader.

XmlReader is an abstract class that you can inherit from to extend its features.

User programs are generally based on one of the following three classes: XmlTextReader, XmlValidatingReader, or XmlNodeReader. All of these classes share the properties listed in Figure 1 and the methods listed in Figure 2. Note that the value of some properties depends on the actual reader class, and a derived class may differ from the base class. The description of each property in Figure 1 therefore refers to the base class. For example, the CanResolveEntity property returns true only in the XmlValidatingReader class; it is set to false in the other reader classes. Similarly, the actual return value of some methods in Figure 2 may vary from class to class. For example, the methods that involve attributes have nothing to return if the current node is not an element node.

The XmlTextReader class provides fast, forward-only, read-only access to XML data streams. The reader first verifies that the XML document is well formed and throws an exception if it is not. XmlTextReader also checks that a DTD, if present, is well formed, but it does not validate the document against the DTD. XmlTextReader loads XML data from a file name, a URL, or a file stream, and then processes it quickly. If you need to validate the document, use the XmlValidatingReader class.

You can create an instance of the XmlTextReader class in several ways: from a file on disk, from a URL, from a stream, or from a text reader that contains XML data:

    XmlTextReader reader = new XmlTextReader(file);

Note that all the public constructors of the XmlTextReader class require you to specify the data source, which can be a stream, a file, or something else. The default constructor of XmlTextReader is protected, so it cannot be used directly.
As with all reader classes in the .NET Framework (such as the SqlDataReader class), once the reader object is connected and open, you use the Read method to access the data. At the start, Read is the only way to move the pointer to the first element; after that, you can call Read again or use other methods (such as Skip, MoveToContent, and ReadInnerXml) to move the pointer to the next node. To process the whole XML document, loop on the return value of the Read method: it returns a Boolean value that is true while there are more nodes to read and false once the end of the document has been reached.
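The navigation methods mentioned above can be tried on a small in-memory document. The following is a minimal sketch, not code from the original article: the class name, the method name, and the sample XML are assumptions made for illustration. It positions the reader with MoveToContent, steps to a child with Read, and jumps over a subtree with Skip.

```csharp
using System.IO;
using System.Xml;

static class ReaderNavigationSketch
{
    // Returns the names of the elements the reader stops on when
    // MoveToContent and Skip are used to step over unwanted nodes.
    public static string[] VisitSelectedNodes()
    {
        // Hypothetical document: a declaration, a comment, and two child elements.
        string xml =
            "<?xml version='1.0'?><!-- header --><root>" +
            "<big><x/><y/></big><small>1</small></root>";

        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // Jump over the XML declaration and the comment to the root element.
            reader.MoveToContent();
            string first = reader.Name;

            // Move to the first child element.
            reader.Read();
            string second = reader.Name;

            // Skip the whole subtree of that child; the reader lands on its sibling.
            reader.Skip();
            string third = reader.Name;

            return new string[] { first, second, third };
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Calling ReaderNavigationSketch.VisitSelectedNodes() returns the three element names in the order the reader visited them, which makes it easy to see which nodes each method stepped over.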

Figure 3 Outputting an XML Document Node Layout

    string GetXmlFileNodeLayout(string file)
    {
        // Create an XmlTextReader that points to the target XML document
        XmlTextReader reader = new XmlTextReader(file);

        // Loop through the nodes and collect the output in a StringWriter
        StringWriter writer = new StringWriter();
        string tabPrefix = "";
        while (reader.Read())
        {
            // Write the start tag if the node is an element
            if (reader.NodeType == XmlNodeType.Element)
            {
                // Indent by one tab per nesting level (reader.Depth)
                // and write the element name between < and >
                tabPrefix = new string('\t', reader.Depth);
                writer.WriteLine("{0}<{1}>", tabPrefix, reader.Name);
            }
            else
            {
                // Write the end tag if the node is the end of an element
                if (reader.NodeType == XmlNodeType.EndElement)
                {
                    tabPrefix = new string('\t', reader.Depth);
                    writer.WriteLine("{0}</{1}>", tabPrefix, reader.Name);
                }
            }
        }

        // Get the text to output
        string buf = writer.ToString();
        writer.Close();

        // Close the stream
        reader.Close();
        return buf;
    }
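To try a routine like the one in Figure 3 without a file on disk, the same loop can be run over an in-memory string. This is a sketch under stated assumptions: the class and method names are invented for illustration, and the sample XML (element and attribute names included) is hypothetical rather than the fragment used in the original article.

```csharp
using System.IO;
using System.Xml;

static class NodeLayoutSketch
{
    // Same idea as Figure 3, but reading from any TextReader
    // so it can be tried on an in-memory string.
    public static string GetNodeLayout(TextReader input)
    {
        XmlTextReader reader = new XmlTextReader(input);
        StringWriter writer = new StringWriter();
        while (reader.Read())
        {
            // One tab per nesting level of the current node.
            string tabPrefix = new string('\t', reader.Depth);
            if (reader.NodeType == XmlNodeType.Element)
                writer.WriteLine("{0}<{1}>", tabPrefix, reader.Name);
            else if (reader.NodeType == XmlNodeType.EndElement)
                writer.WriteLine("{0}</{1}>", tabPrefix, reader.Name);
        }
        reader.Close();
        return writer.ToString();
    }

    // Hypothetical sample input; element and attribute names are assumptions.
    public static string Demo()
    {
        string xml = "<mags><mag name='A'>MSDN Magazine</mag>" +
                     "<mag name='B'>MSDN Voices</mag></mags>";
        return GetNodeLayout(new StringReader(xml));
    }
}
```

The string returned by NodeLayoutSketch.Demo() contains only start and end tags, one per line; the attributes and the text inside the elements are dropped because the loop looks at Element and EndElement nodes only.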

Figure 3 shows a simple function that outputs the element nodes of a given XML document. The function opens the XML document and then processes all its content in a loop. Each time you call the Read method, the reader's pointer moves to the next node. In most cases Read is all you need to process element nodes, but sometimes, when moving from one node to the next, you need to step over nodes of other types. Note that Read never moves onto attribute nodes. The reader's MoveToContent method lets the pointer jump from the current position to the first content node, skipping nodes of type ProcessingInstruction, DocumentType, Comment, Whitespace, and SignificantWhitespace.

The type of each node is one of the XmlNodeType enumeration values. The code in Figure 3 uses only two of them: Element and EndElement. The output reproduces the structure of the original document but discards the attributes and the text content of the XML elements, leaving only the element names. For an XML fragment whose elements contain the text MSDN Magazine and MSDN Voices, the program outputs only the nested element tags, without that text.

The indentation of child nodes is set according to the reader's Depth property, which returns an integer representing the nesting level of the current node. All the text is collected in a StringWriter object (a very handy stream-based wrapper around the StringBuilder class).

As mentioned earlier, the reader does not visit attribute nodes automatically through the Read method. To access the attributes of the current element, you use a simple loop controlled by the return value of the MoveToNextAttribute method.
The following code visits all the attributes of the current node and concatenates their names and values into a string:

    if (reader.HasAttributes)
        while (reader.MoveToNextAttribute())
            buf += reader.Name + "=\"" + reader.Value + "\",";
    reader.MoveToElement();

When you have finished processing the attributes, call the MoveToElement method to return the pointer to the element node that owns them. Strictly speaking, MoveToElement does not really move the pointer, because the pointer never leaves the element node while the attributes are processed. MoveToElement simply resets some internal members so that they expose the values of the element again. For example, the Name property returns the name of an attribute while you are visiting attributes, and returns the name of the owning element after you call MoveToElement. If you do not need to do anything else with the node, you do not have to call MoveToElement.

When attribute values are parsed, they are in most cases simple text strings. This does not mean, however, that attribute values in real applications are always plain character data.
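The attribute loop described above can be wrapped into a small self-contained routine. This is an illustrative sketch: the class name, the method name, and the output format are assumptions, not part of the original article.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class AttributeLoopSketch
{
    // Lists the attributes of the first element in the given XML text,
    // then reports the element name after MoveToElement.
    public static string ListAttributes(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        StringBuilder buf = new StringBuilder();

        reader.MoveToContent();               // position on the first element
        if (reader.HasAttributes)
        {
            while (reader.MoveToNextAttribute())
                buf.Append(reader.Name + "=\"" + reader.Value + "\" ");
            reader.MoveToElement();           // Name refers to the element again
        }
        buf.Append("on <" + reader.Name + ">");

        reader.Close();
        return buf.ToString();
    }
}
```

For example, AttributeLoopSketch.ListAttributes("<book id='1' lang='en'/>") returns the text id="1" lang="en" on <book>. The trailing element name shows that, after MoveToElement, the Name property no longer refers to the last attribute visited.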

Sometimes attribute values hold typed data, such as a date or a Boolean, and you have to convert them to native types with the XmlConvert or System.Convert classes. Both classes perform data type conversions, but XmlConvert works according to the data types defined in XSD and ignores the current locale.

Suppose an XML element has a birthday attribute whose value is meant to be February 8, 2001, written in a locale-specific form. If you use the System.Convert class to convert the string into a .NET Framework DateTime, you can then use it as a date. If you use the XmlConvert class on the same string, you get a parse error, because XmlConvert does not recognize the date in that string: in XML, a date must be in the form YYYY-MM-DD. The XmlConvert class translates between CLR types and XSD types, and its conversions are locale independent.

In some cases, an attribute value is made up of plain text and entities. Of all the reader classes, only XmlValidatingReader can resolve entities. XmlTextReader cannot resolve entities; when they appear in an attribute value, it can only return the text. In this situation you must parse the content of the attribute with the ReadAttributeValue method instead of simply reading its value. ReadAttributeValue parses the attribute value and separates its components (for example, plain text from entity references). You can use the return value of ReadAttributeValue as a loop condition to iterate over all the components of the attribute value. Because the XmlTextReader class cannot resolve entities, you can write your own code to resolve them.
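Returning briefly to typed attribute values, the difference between XSD-style and locale-style dates can be checked with a few lines of code. This is a minimal sketch: the class and method names are invented, and it uses the XmlConvert.ToDateTime overload that takes an XmlDateTimeSerializationMode, which was added to the framework after the version the article describes.

```csharp
using System;
using System.Xml;

static class TypedAttributeSketch
{
    // Parses an XSD-style date (YYYY-MM-DD) regardless of the current culture.
    public static DateTime ParseXsdDate(string value)
    {
        return XmlConvert.ToDateTime(value, XmlDateTimeSerializationMode.Unspecified);
    }

    // Returns true when XmlConvert rejects the text as an XSD date or time.
    public static bool IsRejected(string value)
    {
        try
        {
            ParseXsdDate(value);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }
}
```

With this sketch, TypedAttributeSketch.ParseXsdDate("2001-02-08") yields February 8, 2001, while TypedAttributeSketch.IsRejected("2-8-2001") is true; the second string is a hypothetical locale-style value chosen for illustration. The same locale independence applies in the other direction: XmlConvert.ToString(true) produces the lowercase XSD literal "true".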
The following code snippet shows how to call a custom resolver:

    while (reader.ReadAttributeValue())
    {
        if (reader.NodeType == XmlNodeType.EntityReference)
            // Resolve the "reader.Name" reference and add
            // the result to a buffer
            buf += YourResolverCode(reader.Name);
        else
            // Just append the value to the buffer
            buf += reader.Value;
    }

When the whole attribute value has been parsed, ReadAttributeValue returns false and the loop ends. The final attribute value is the content of the buf variable.

Handling XML Text

When XML markup text is not handled correctly, the cause of the error can usually be found quickly. For example, character conversion errors can easily creep in when text is passed along in an XML data stream. Not every character that is valid on a given platform is a valid XML character. Only the characters allowed by the XML specification (www.w3.org/TR/2000/REC-xml-20001006.html) can safely be used in element and attribute names. The XmlConvert class provides methods that convert non-XML names into valid XML names.
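The name conversion that XmlConvert offers can be sketched in a couple of calls. The wrapper class and method names below are assumptions made for illustration; EncodeName and DecodeName are the actual XmlConvert methods.

```csharp
using System.Xml;

static class NameEncodingSketch
{
    // Turns an arbitrary name (for example a database column name)
    // into a valid XML name by escaping the invalid characters.
    public static string ToXmlName(string name)
    {
        return XmlConvert.EncodeName(name);
    }

    // Reverses the escaping applied by EncodeName.
    public static string FromXmlName(string name)
    {
        return XmlConvert.DecodeName(name);
    }
}
```

For instance, NameEncodingSketch.ToXmlName("Invoice Details") returns Invoice_x0020_Details, and FromXmlName restores the original text. Only complete four-digit escape sequences are decoded, so a name such as Invoice_x20_Details passes through DecodeName unchanged.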

The EncodeName and DecodeName methods adjust tag names that contain invalid XML characters so that they conform to the schema. Some applications, including SQL Server and Microsoft Office, allow and support Unicode names in their documents, but some of the characters in those names are not valid in XML names. A typical case is a database column name that contains spaces. Although SQL Server allows such long names, they may not be valid names in an XML stream. The space is replaced by its hexadecimal code, giving Invoice_x0020_Details. The following code shows how to obtain that string programmatically:

    XmlConvert.EncodeName("Invoice Details");

The reverse method is DecodeName, which converts the XML text back to its original form. Note that it converts only complete hexadecimal codes: _x0020_ is treated as a space, but _x20_ is not:

    XmlConvert.DecodeName("Invoice_x0020_Details");

Whitespace in an XML document can be significant or insignificant. It is significant when it appears within the content of an element or within the scope of an xml:space declaration.

In XML, whitespace means not only spaces (blanks) but also carriage returns, line feeds, and tabs. You handle whitespace through the WhitespaceHandling property of the XmlTextReader class. The property accepts and returns a value of the WhitespaceHandling enumeration, which has three values. The default is All, which means that both significant and insignificant whitespace are returned as nodes (SignificantWhitespace and Whitespace nodes, respectively). Another value is None, which means that no whitespace nodes are returned at all. Finally, the Significant value discards insignificant whitespace and returns only nodes of type SignificantWhitespace. Note that WhitespaceHandling is one of the few reader properties that can be changed at any time and that take effect on the next read operation.
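The effect of the WhitespaceHandling setting is easy to observe by counting the whitespace nodes a reader reports. The following sketch uses invented class and method names and a hypothetical indented document.

```csharp
using System.IO;
using System.Xml;

static class WhitespaceSketch
{
    // Counts the Whitespace and SignificantWhitespace nodes
    // the reader reports under the given setting.
    public static int CountWhitespaceNodes(string xml, WhitespaceHandling handling)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = handling;

        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Whitespace ||
                reader.NodeType == XmlNodeType.SignificantWhitespace)
                count++;
        }
        reader.Close();
        return count;
    }
}
```

Calling the method on an indented document such as "<a>\n  <b>x</b>\n</a>" with WhitespaceHandling.All reports the line breaks and indentation between the tags as nodes, whereas WhitespaceHandling.None reports none of them, so a loop like the one in Figure 3 has fewer nodes to step over.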
The Normalization and XmlResolver properties are "sensitive" in the same way.

Strings and Fragments

Programmers who have moved from MSXML find many differences between the COM and .NET Framework XML APIs. The .NET Framework classes do not themselves provide a method for parsing XML data stored in a string. Unlike the MSXML parser object, the XmlTextReader class does not provide a LoadXML method that creates a reader from a well-formed string. There is no LoadXML-like method because you can get the same result with a special text reader, the StringReader class.

One of the XmlTextReader constructors accepts a TextReader-derived object and creates the XML reader from the content of that text reader. A text reader is a stream that produces characters as its output. The StringReader class inherits from TextReader and uses an in-memory string as its input stream.

The following code fragment shows how to initialize an XML reader with a well-formed XML string as input:

    string xmlText = "...";
    StringReader strReader = new StringReader(xmlText);
    XmlTextReader reader = new XmlTextReader(strReader);

In the same way, by using the StringWriter class in place of a TextWriter, you can create an XML document in an in-memory string.

A special kind of XML string is the XML fragment. A fragment is made of XML text, but it lacks a root node, so it is not a well-formed XML document and cannot be used as one. A fragment is a piece of an original document, so it may be missing the root node. For example, XML text made of sibling elements that hold the values Dino and Esposito, with no enclosing element, is a valid XML fragment but not a valid XML document.

The .NET Framework XML API lets programmers use XML fragments together with a parser context. The parser context consists of information such as the character encoding, the DTD, the namespaces, the language, and the whitespace handling:

    public XmlTextReader(
        string xmlFragment,
        XmlNodeType fragType,
        XmlParserContext context
    );

The xmlFragment parameter contains the XML string to parse. The fragType parameter indicates the type of the fragment, that is, the type of the fragment's root node. Only Element, Attribute, and Document nodes can be used as the root node of a fragment. The parser context is represented by the XmlParserContext class.

The XmlValidatingReader Class

The XmlValidatingReader class is an implementation of the XmlReader class that provides several kinds of XML validation: DTD, XML-Data Reduced (XDR) schemas, and XSD. DTD and XSD are W3C recommendations; XDR is Microsoft's own format for XML schemas. You can use the XmlValidatingReader class to validate both XML documents and XML fragments. XmlValidatingReader works on top of an XML reader, typically an instance of the XmlTextReader class.
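As a brief aside on the fragment constructor shown earlier in this section, the following sketch reads a rootless fragment through an XmlParserContext. The class name, the method name, and the element names used in the usage note are assumptions made for illustration; the context is built from an empty name table and namespace manager.

```csharp
using System.Collections.Generic;
using System.Xml;

static class FragmentSketch
{
    // Reads a rootless XML fragment and returns "name=text" for each element.
    public static List<string> ReadFragmentElements(string xmlFragment)
    {
        // Minimal parser context: no base URI, no language, default whitespace scope.
        NameTable nt = new NameTable();
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(nt);
        XmlParserContext context = new XmlParserContext(null, nsmgr, null, XmlSpace.None);

        // XmlNodeType.Element tells the reader to expect element content
        // rather than a complete document with a single root.
        XmlTextReader reader = new XmlTextReader(xmlFragment, XmlNodeType.Element, context);

        List<string> result = new List<string>();
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
                result.Add(reader.Name + "=" + reader.ReadString());
        }
        reader.Close();
        return result;
    }
}
```

For a hypothetical fragment such as "<firstname>Dino</firstname><lastname>Esposito</lastname>", the method returns two entries, one per sibling element, even though the text has no root node.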
XmlTextReader reads the nodes of the document, while XmlValidatingReader validates each piece of XML according to the requested validation type. The XmlValidatingReader class implements only a very small subset of the functionality an XML reader needs. It always works on top of an existing XML reader and mirrors that reader's methods and properties. If you look closely at the constructors of the class, you will see that it plainly relies on an existing text reader. A validating XML reader cannot be initialized directly from a file or a URL.

The constructors of the class are listed here:

    public XmlValidatingReader(XmlReader);
    public XmlValidatingReader(Stream, XmlNodeType, XmlParserContext);
    public XmlValidatingReader(string, XmlNodeType, XmlParserContext);

A validating XML reader can parse any XML fragment supplied as a string or a stream, as well as an XML document supplied by a reader.

The XmlValidatingReader class has very few methods of its own compared with the other reader classes. Besides Read, it has the Skip and ReadTypedValue methods. Skip jumps over all the children of the current node and also validates the skipped content. Note that you cannot skip over badly formed XML text. The ReadTypedValue method returns the CLR type that corresponds to the specified XML Schema (XSD) type. If the method finds a CLR type that matches the XSD type, it returns the value as that CLR type. Otherwise, the value of the node is returned as a string.

A validating XML reader is just what its name suggests: a node-based reader that checks whether the structure of the current node conforms to the current schema. Validation is incremental; there is no method that returns a Boolean value indicating whether the document is valid. You normally read the input XML document with the Read method, and you read it the same way with a validating reader. At each step, the structure of the node currently visited is checked against the specified schema, and an exception is thrown if it does not conform. Figure 4 shows a console application that takes a file name from the command line and outputs the validation results.

Figure 4 Console App

    using System;
    using System.Xml;
    using System.Xml.Schema;

    class MyXmlValidApp
    {
        public MyXmlValidApp(String fileName)
        {
            try
            {
                Validate(fileName);
            }
            catch (Exception e)
            {
                Console.WriteLine("Error:\t{0}", e.Message);
                Console.WriteLine("Exception raised: {0}", e.GetType().ToString());
            }
        }

        private void Validate(String fileName)
        {
            XmlTextReader xtr = new XmlTextReader(fileName);
            XmlValidatingReader vreader = new XmlValidatingReader(xtr);
            vreader.ValidationType = ValidationType.Auto;
            vreader.ValidationEventHandler +=
                new ValidationEventHandler(this.ValidationEventHandle);

            // Reading the document to the end validates it node by node
            vreader.Read();
            vreader.MoveToContent();
            while (vreader.Read()) {}

            xtr.Close();
            vreader.Close();
        }

        public void ValidationEventHandle(Object sender, ValidationEventArgs args)
        {
            Console.Write("Validation error: " + args.Message + "\r\n");
        }

        public static void Main(String[] args)
        {
            MyXmlValidApp o = new MyXmlValidApp(args[0]);
            return;
        }
    }

The ValidationType property sets the type of validation, which can be DTD, XSD, XDR, or None. If you do not specify a validation type (that is, you use the ValidationType.Auto option), the reader automatically applies the validation most appropriate for the document. Any error that occurs during validation fires the ValidationEventHandler event. If no handler is provided for the ValidationEventHandler event, an XML exception is thrown instead. Defining a ValidationEventHandler handler is a way to catch any error in an XML source file before it raises an XML exception.

Note that the reader both checks that the document is well formed and checks that it conforms to the schema. If a validating reader finds a badly formed XML document, it only throws an XmlException; it does not fire any other event.

Validation takes place as the user moves the pointer with the Read method. Once a node has been parsed and read, it is passed to an internal validator object. The validator works according to the node type and the requested validation type. It verifies that the node has all the attributes and child nodes that the validation rules require.

The validator object internally calls two different kinds of objects: the DTD parser and the schema builder. The DTD parser checks the content of the current node and its subtree for conformance with the DTD. The schema builder builds a SOM (Schema Object Model) based on the XDR or XSD schema. The schema builder class is actually a base class for the specific XDR and XSD schema builders. What matters is that XDR and XSD schemas are handled in much the same way, with no difference in performance at run time.
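The event-driven validation pattern of Figure 4 can also be tried entirely in memory. Note that this sketch swaps in a different technique from the article's: it uses XmlReader.Create with XmlReaderSettings, the approach that later versions of the framework recommend in place of XmlValidatingReader (which they mark as obsolete). The class name, the method name, and the XSD below are hypothetical and exist only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class SchemaValidationSketch
{
    // Hypothetical XSD: an <array> made of <element value="..."/> items
    // whose value attribute must be an integer.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='array'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='element' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:attribute name='value' type='xs:int' use='required'/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Reads the document to the end and returns the validation messages collected.
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args) { errors.Add(args.Message); };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation is incremental: each Read checks the node just visited.
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

With this sketch, a conforming document such as "<array><element value='1'/></array>" yields an empty list, while a document whose value attribute is not an integer yields at least one message. As in Figure 4, there is no single call that answers "is this document valid?"; the answer is the absence of validation events after the reader reaches the end.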
If a node has child nodes, the information about the children is collected with another temporary reader so that the schema information for the node can be fully verified. See Figure 5.

Note that although a constructor of the XmlValidatingReader class accepts an XmlReader as its underlying reader, that reader can only be an instance of the XmlTextReader class or of a class derived from it. This means you cannot use any other class derived from XmlReader (such as a custom XML reader). Internally, the XmlValidatingReader class assumes that the underlying reader is an XmlTextReader object and explicitly casts the incoming reader to XmlTextReader. If you use an XmlNodeReader or a custom reader, the program compiles without errors but throws an exception at run time.

The Node Reader

An XML reader provides an incremental, node-by-node way to process the content of a document. So far we have assumed that the source is a disk-based stream or a string stream, but nothing guarantees that in practice the source will not be an XMLDOM object instead. In that case we need a special class with its own reading methods. For this situation the .NET Framework provides the XmlNodeReader class. Just as XmlTextReader visits all the nodes of a given XML stream, XmlNodeReader visits all the nodes of an XMLDOM subtree. The XMLDOM class (the XmlDocument class in the .NET Framework) supports XPath-based methods such as SelectNodes and SelectSingleNode. These methods place the matching nodes in memory.
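Reading an XMLDOM subtree with a node reader can be sketched as follows. The class name, the method name, and the sample document in the usage note are assumptions made for illustration; the subtree is selected with SelectSingleNode and then walked node by node.

```csharp
using System.Collections.Generic;
using System.Xml;

static class NodeReaderSketch
{
    // Loads the XML into an XmlDocument, selects a subtree with XPath,
    // and returns the element names found while reading that subtree.
    public static List<string> ElementNamesUnder(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode subtree = doc.SelectSingleNode(xpath);

        List<string> names = new List<string>();
        XmlNodeReader nodeReader = new XmlNodeReader(subtree);
        while (nodeReader.Read())
        {
            if (nodeReader.NodeType == XmlNodeType.Element)
                names.Add(nodeReader.Name);
        }
        nodeReader.Close();
        return names;
    }
}
```

For a hypothetical document "<root><items><item/><item/></items><other/></root>" and the path "/root/items", the reader visits the items element and its two item children and stops at the end of the subtree, so the sibling element other is never reported.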

If you need to process all the nodes of a subtree, a node reader, which processes the nodes incrementally, is the more efficient choice:

    // xmldomNode is the XML DOM node
    XmlNodeReader nodeReader = new XmlNodeReader(xmldomNode);
    while (nodeReader.Read())
    {
        // Do something here
    }

When you want to reference custom data in a configuration file (such as a web.config file), load the data into an XMLDOM tree and then use the XmlNodeReader class together with the XMLDOM classes to process it. This is efficient too.

The XmlTextWriter Class

Creating an XML document programmatically is clearly not difficult. For years, developers have created XML documents by concatenating strings in a buffer and then writing the buffer out to a file. This approach works only as long as you can guarantee that no subtle errors slip into the strings. The .NET Framework provides a better way to create XML documents: XML writers.

An XML writer outputs XML data to a stream or a file in a forward-only manner. More importantly, an XML writer guarantees that all the XML data it produces conforms to the W3C XML 1.0 recommendation; you do not even have to worry about closing tags, because the writer does it for you. XmlWriter is the abstract base class of all XML writers. The .NET Framework provides only one writer class, XmlTextWriter.

Let's look at the difference between XML writers and the old way of writing. The following code saves an array of strings:

    StringBuilder sb = new StringBuilder("");
    sb.Append("<array>");
    foreach (string s in theArray)
    {
        sb.Append("<element value=\"" + s + "\"/>");
    }
    sb.Append("</array>");

The code loops through the elements of the array, writes the markup text, and accumulates it in a string. It is up to the code to guarantee that the output is well formed, to take care of indentation and new lines, and to support namespaces. This approach may work without errors as long as the document to create has a simple structure.
However, when you need to support processing instructions, namespaces, indentation, formatting, and entities, the amount of code grows quickly, and so does the likelihood of errors. An XML writer has a write method for each possible type of XML node, which makes the process of creating an XML document more logical and less dependent on the intricacies of the markup language. Figure 6 shows how to serialize an array of strings with an XmlTextWriter. The code is compact, and with an XML writer it is easier to read and better structured.

Figure 6 Serializing a String Array

    void CreateXmlFileUsingWriters(String[] theArray, string filename)
    {
        // Open the XML writer (with the default character set)
        XmlTextWriter xmlw = new XmlTextWriter(filename, null);
        xmlw.Formatting = Formatting.Indented;

        xmlw.WriteStartDocument();
        xmlw.WriteStartElement("array");
        foreach (string s in theArray)
        {
            xmlw.WriteStartElement("element");
            xmlw.WriteAttributeString("value", s);
            xmlw.WriteEndElement();
        }
        xmlw.WriteEndDocument();

        // Close the writer
        xmlw.Close();
    }

An XML writer is not a magician, however: it cannot fix errors in its input. An XML writer does not check whether element and attribute names are valid, nor does it ensure that every Unicode character used fits the current encoding. As mentioned above, to avoid errors in the output you must keep non-XML characters out of it, but the writer offers no method for this.

In addition, when an attribute node is created, the writer does not verify whether the name of the attribute duplicates a name that already exists on the element node. Finally, XmlWriter is not a validating writer: it does not guarantee that the output conforms to a schema or a DTD. A validating writer class is not yet available in the .NET Framework. However, I wrote a validating writer component for my book Applied XML Programming for Microsoft .NET (Microsoft Press, 2002). You can download the source code from http://www.microsoft.com/mspress/books/6235.asp.

Figure 7 lists the possible states of an XML writer. The values come from the WriteState enumeration. When you create a writer, its initial state is Start, which means you are still configuring the object and the writer has not actually started. The next state is Prolog, which is set when you call the WriteStartDocument method to begin writing. After that, the state transitions depend on the document you write and its content.
ProLog State has been reserved until you add a non-element node, such as comment elements, processing instructions, and document types. When the first node is written after the root node is written, the state changes to ELEMENT. When you call the WritersTARTRIBUTE method, the status is converted to Attribute instead of converting the WRITEATRIBUTESTRING method write property to this state. If so, the status should be Element. When you write a closed tag (>), the status will be converted to Content. When you write a document, call the WriteEndDocument method, the status will return to start until you start writing another document or turn the Writer.
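The state transitions described above can be observed directly. The following is a minimal sketch, not code from the original article: the class name WriteStateDemo, the StringWriter target, and the element and attribute names are assumptions chosen for illustration. It records the value of the WriteState property after each call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class WriteStateDemo
{
    // Returns the writer's state after each step of writing a tiny document.
    // The StringWriter target and the node names are illustrative assumptions.
    public static string[] Trace()
    {
        List<string> log = new List<string>();
        XmlTextWriter xmlw = new XmlTextWriter(new StringWriter());

        log.Add(xmlw.WriteState.ToString());      // nothing written yet

        xmlw.WriteStartDocument();
        log.Add(xmlw.WriteState.ToString());      // after the XML declaration

        xmlw.WriteStartElement("array");
        log.Add(xmlw.WriteState.ToString());      // start tag still open

        xmlw.WriteStartAttribute("value", null);
        log.Add(xmlw.WriteState.ToString());      // inside an attribute

        xmlw.WriteString("1");
        xmlw.WriteEndAttribute();
        log.Add(xmlw.WriteState.ToString());      // back in the start tag

        xmlw.WriteString("text");
        log.Add(xmlw.WriteState.ToString());      // start tag closed, writing content

        xmlw.WriteEndDocument();
        log.Add(xmlw.WriteState.ToString());      // document finished

        xmlw.Close();
        log.Add(xmlw.WriteState.ToString());      // writer closed

        return log.ToArray();
    }

    public static void Main()
    {
        foreach (string state in Trace())
            Console.WriteLine(state);
    }
}
```

Printing the trace is a quick way to check where a writer stands before calling a method that is only legal in a particular state, such as WriteAttributeString.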

Figure 7 States for XML Writer

    State      Description
    Attribute  The writer enters this state when an attribute is being written
    Closed     The Close method has been called and the writer is no longer available for writing operations
    Content    The writer enters this state when the content of a node is being written
    Element    The writer enters this state when an element start tag is being written
    Prolog     The writer is writing the prolog of a well-formed XML 1.0 document
    Start      The writer is in an initial state, waiting for a write call to be issued

The writer stores the output text in an internal buffer. Normally the buffer is flushed, and the XML text actually written, when the writer is closed. You can empty the buffer at any time by calling the Flush method, which writes the current content to the underlying stream (exposed through the BaseStream property) and releases the memory. The writer remains open and can continue to operate. Note that although part of the document has been written, the document cannot be processed until the writer has been closed.

There are two ways to write attribute nodes. The first is to create a new attribute node with the WriteStartAttribute method, which updates the writer's state, then set the attribute value with the WriteString method, and finally end the node with the WriteEndAttribute method. Alternatively, you can create the attribute node with the WriteAttributeString method. WriteAttributeString works when the writer's state is Element and creates the attribute in a single call. Similarly, the WriteStartElement method writes the opening of a node's start tag (<), after which you can set the node's attributes and text content. The closing tag that WriteEndElement writes for an empty element is the short form "/>". If you want a full closing tag, use the WriteFullEndElement method instead. Text that contains markup-sensitive characters, such as the less-than sign (<), needs care: it must not reach the output unescaped.
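To make the two attribute styles and the two kinds of closing tag concrete, here is a small sketch. It is not taken from the original article: the class name AttributeDemo, the "city" element, and the "name" attribute are assumptions for illustration only.

```csharp
using System;
using System.IO;
using System.Xml;

public static class AttributeDemo
{
    // Writes a single empty element carrying one attribute.
    // stepByStep selects WriteStartAttribute/WriteString/WriteEndAttribute
    // over the one-call WriteAttributeString; fullEndTag selects
    // WriteFullEndElement over WriteEndElement.
    public static string Write(bool stepByStep, bool fullEndTag)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter xmlw = new XmlTextWriter(sw);

        xmlw.WriteStartElement("city");
        if (stepByStep)
        {
            xmlw.WriteStartAttribute("name", null);
            xmlw.WriteString("Rome");
            xmlw.WriteEndAttribute();
        }
        else
        {
            xmlw.WriteAttributeString("name", "Rome");
        }

        if (fullEndTag)
            xmlw.WriteFullEndElement();
        else
            xmlw.WriteEndElement();

        // Push any buffered text to the underlying writer; xmlw stays open
        xmlw.Flush();
        string xml = sw.ToString();

        xmlw.Close();
        return xml;
    }

    public static void Main()
    {
        Console.WriteLine(Write(true, false));
        Console.WriteLine(Write(false, false));
        Console.WriteLine(Write(false, true));
    }
}
```

Both attribute styles should produce the same markup; the step-by-step form is mainly useful when the attribute value is assembled from several write calls.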
Strings written to the stream with the WriteRaw method are not parsed, so you can use it to write special strings into the XML document. Of the following two lines of code, the first outputs "&lt;" and the second outputs "<":

    writer.WriteString("<");
    writer.WriteRaw("<");

Reading and Writing Streams

Interestingly, the reader and writer classes provide methods for reading and writing data streams encoded as Base64 and BinHex. The WriteBase64 and WriteBinHex methods differ subtly from the other write methods: they are stream-based, and they work on a byte array rather than on a string.
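Going back to the difference between WriteString and WriteRaw, the following sketch shows both calls inside an element. The class name EscapeDemo and the element name "t" are assumptions made for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EscapeDemo
{
    // Writes "<" as the content of a <t> element, either escaped
    // (WriteString) or verbatim (WriteRaw).
    public static string Write(bool raw)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter xmlw = new XmlTextWriter(sw);

        xmlw.WriteStartElement("t");
        if (raw)
            xmlw.WriteRaw("<");      // copied as is; the result is not well-formed XML
        else
            xmlw.WriteString("<");   // escaped to an entity reference
        xmlw.WriteEndElement();
        xmlw.Close();

        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write(false));
        Console.WriteLine(Write(true));
    }
}
```

Because WriteRaw bypasses escaping, it is best reserved for strings that are already valid markup.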

The following code first converts a string into a byte array and then writes it as a Base64-encoded stream. The GetBytes method of the Encoding class does the conversion:

    writer.WriteBase64(Encoding.Unicode.GetBytes(buf), 0, buf.Length * 2);

The code in Figure 8 demonstrates how to persist a string array as a Base64-encoded XML stream. Figure 9 shows the resulting output.

Figure 8 Persisting a String Array as Base64

    using System;
    using System.Text;
    using System.IO;
    using System.Xml;

    class MyBase64Array
    {
        public static void Main(string[] args)
        {
            string outputFileName = "test64.xml";
            if (args.Length > 0)
                outputFileName = args[0];   // file name

            // Convert the array to XML
            string[] theArray = {"Rome", "New York", "Sydney", "Stockholm", "Paris"};
            CreateOutput(theArray, outputFileName);
            return;
        }

        private static void CreateOutput(string[] theArray, string filename)
        {
            // Open the XML writer
            XmlTextWriter xmlw = new XmlTextWriter(filename, null);

            // Indent child elements according to the Indentation and IndentChar
            // settings; this option indents element content only
            xmlw.Formatting = Formatting.Indented;

            // Write the XML declaration with version "1.0"
            xmlw.WriteStartDocument();

            // Write a comment containing the specified text
            xmlw.WriteComment("Array to Base64 XML");

            // Start writing the array node
            xmlw.WriteStartElement("array");

            // Write an attribute with the specified prefix, local name,
            // namespace URI, and value
            xmlw.WriteAttributeString("xmlns", "x", null, "dinoe:msdn-mag");

            // Loop through the items of the array
            foreach (string s in theArray)
            {
                // Write the start tag and associate it with the given
                // namespace and prefix
                xmlw.WriteStartElement("x", "element", null);

                // Convert s to a byte[] array, encode it as Base64, and write
                // the resulting text. The number of bytes written is twice the
                // length of the string, because each character takes 2 bytes
                xmlw.WriteBase64(Encoding.Unicode.GetBytes(s), 0, s.Length * 2);

                // Close the child node
                xmlw.WriteEndElement();
            }

            // Close the root node and end the document
            xmlw.WriteEndDocument();

            // Close the writer
            xmlw.Close();

            // Read back what was written
            XmlTextReader reader = new XmlTextReader(filename);
            while (reader.Read())
            {
                // Process only the start tags of nodes named "element"
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "element")
                {
                    byte[] bytes = new byte[1000];
                    int n = reader.ReadBase64(bytes, 0, 1000);

                    // Decode only the n bytes actually read
                    string buf = Encoding.Unicode.GetString(bytes, 0, n);
                    Console.WriteLine(buf);
                }
            }
            reader.Close();
        }
    }

Figure 9 String Array in Internet Explorer

The reader classes have dedicated methods for interpreting Base64 and BinHex encoded streams. The following code fragment demonstrates how to use the ReadBase64 method of the XmlTextReader class to parse a document created with Base64 encoding.

    XmlTextReader reader = new XmlTextReader(filename);
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "element")
        {
            byte[] bytes = new byte[1000];
            int n = reader.ReadBase64(bytes, 0, 1000);

            // Decode only the n bytes actually read
            string buf = Encoding.Unicode.GetString(bytes, 0, n);
            Console.WriteLine(buf);
        }
    }
    reader.Close();

The conversion from bytes back to a string is performed by the GetString method of the Encoding class. Although I have shown only Base64-based code, you can read node content encoded as BinHex simply by replacing the method name (use the ReadBinHex method). The same technique can also be used to read any binary data represented as a byte array, images in particular.

Designing an XmlReadWriter Class

As mentioned earlier, XML readers and writers work independently: readers only read, and writers only write. Suppose your application needs to manage lengthy XML documents whose data is not fixed and may change. Readers provide a good way to read the contents of such a document. Writers, on the other hand, are a very useful tool for creating XML document fragments, but if you want both to read and to write, you have to use the XML DOM.
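Picking up the earlier remark that BinHex works the same way once the method names are swapped, here is a minimal round-trip sketch. It is an illustration under stated assumptions rather than code from the article: the class name BinHexDemo, the in-memory StringWriter and StringReader, and the element name "element" are chosen for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BinHexDemo
{
    // Writes the UTF-16 bytes of s as BinHex text inside an <element> node.
    public static string Encode(string s)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter xmlw = new XmlTextWriter(sw);
        byte[] bytes = Encoding.Unicode.GetBytes(s);

        xmlw.WriteStartElement("element");
        xmlw.WriteBinHex(bytes, 0, bytes.Length);
        xmlw.WriteEndElement();
        xmlw.Close();

        return sw.ToString();
    }

    // Reads the BinHex content of the <element> node back into a string.
    // The 1000-byte buffer mirrors the article's code; longer content
    // would need repeated ReadBinHex calls.
    public static string Decode(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        string result = null;

        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "element")
            {
                byte[] buffer = new byte[1000];
                int n = reader.ReadBinHex(buffer, 0, buffer.Length);
                result = Encoding.Unicode.GetString(buffer, 0, n);
            }
        }
        reader.Close();

        return result;
    }

    public static void Main()
    {
        string xml = Encode("Rome");
        Console.WriteLine(xml);
        Console.WriteLine(Decode(xml));
    }
}
```

BinHex text is roughly twice the size of the raw bytes, whereas Base64 adds about a third, so Base64 is usually the more compact choice for large payloads.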

If the actual XML document is very large, though, this becomes a problem: should the whole document be loaded into memory just to be read and modified? Let's look at how to build a hybrid stream-based parser for processing large XML documents. As in an ordinary read-only operation, you use a normal XML reader to visit the nodes sequentially. The difference is that, while reading, you can use an XML writer to change attribute values and node contents. You use the reader to read each node of the source file, while the writer creates a copy of it in the background. In that copy you can add new nodes, ignore or edit others, and edit attribute values. When you have finished your changes, you replace the old document with the new one.

A simple and effective way to do this is to copy node objects from the read-only stream to the write stream, using two methods of the XmlTextWriter class: WriteAttributes and WriteNode. The WriteAttributes method reads all the attributes of the node currently selected in the reader and copies them, as a single string, to the current output stream. Similarly, the WriteNode method handles every other type of node, except attribute nodes, in much the same way. The code fragment in Figure 10 shows how to use this approach to create a copy of a source XML document that contains only selected nodes. The XML tree is visited starting from the root, but only every other node, attribute nodes aside, is written to the output. You can wrap a reader and a writer in a new class and design a new interface for reading and writing streams and for accessing attributes and nodes.
Figure 10 Using the WriteNode Method

    XmlTextReader reader = new XmlTextReader(inputFile);
    // null selects the default encoding
    XmlTextWriter writer = new XmlTextWriter(outputFile, null);

    // Configure reader and writer
    writer.Formatting = Formatting.Indented;
    reader.MoveToContent();

    // Write the root
    writer.WriteStartElement(reader.LocalName);

    // Read and output every other node
    // (WriteNode also advances the reader past the node it copies)
    int i = 0;
    while (reader.Read())
    {
        if (i % 2 != 0)
            writer.WriteNode(reader, false);
        i++;
    }

    // Close the root
    writer.WriteEndElement();

    // Close reader and writer
    writer.Close();
    reader.Close();

My XmlTextReadWriter class does not inherit from the XmlReader or XmlWriter class. Instead, it coordinates two other classes, one that operates on a read-only stream and one that operates on a write-only stream. The methods of XmlTextReadWriter read data through the reader object and write it to the writer object. To accommodate different needs, the internal reader and writer objects are exposed through the read-only Reader and Writer properties. Figure 11 lists some of the methods of the class.

Figure 11 XmlTextReadWriter Class Methods

    Method              Description
    AddAttributeChange  Caches all the information needed to perform a change on a node
                        attribute. All the changes cached through this method are processed
                        during a successive call to WriteAttributes.
    Read                Simple wrapper around the internal reader's Read method.
    WriteAttributes     Specialized version of the writer's WriteAttributes method; writes out
                        all the attributes for the given node, taking into account all the
                        changes cached through the AddAttributeChange method.
    WriteEndDocument    Terminates the current document in the writer and .
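The read-and-copy idea behind such a class can be tried with a plain XmlTextReader and XmlTextWriter. The sketch below is not the author's XmlTextReadWriter; the class name CopyDemo and its ChangeAttribute method are hypothetical, and it assumes an input without namespaces and handles only element, text, and end-element nodes (anything else, including whitespace, is dropped).

```csharp
using System;
using System.IO;
using System.Xml;

public static class CopyDemo
{
    // Copies xml node by node, replacing oldVal with newVal in the
    // attribute attribName of every element named nodeName.
    public static string ChangeAttribute(string xml, string nodeName,
        string attribName, string oldVal, string newVal)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        StringWriter sw = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(sw);

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    // Capture this before the reader moves onto attributes
                    bool isEmpty = reader.IsEmptyElement;
                    bool match = (reader.Name == nodeName);
                    writer.WriteStartElement(reader.Name);

                    if (match)
                    {
                        // Copy attributes one by one, patching the target
                        while (reader.MoveToNextAttribute())
                        {
                            string value = reader.Value;
                            if (reader.Name == attribName && value == oldVal)
                                value = newVal;
                            writer.WriteAttributeString(reader.Name, value);
                        }
                        reader.MoveToElement();
                    }
                    else
                    {
                        // Copy all attributes unchanged
                        writer.WriteAttributes(reader, false);
                    }

                    if (isEmpty)
                        writer.WriteEndElement();
                    break;

                case XmlNodeType.Text:
                    writer.WriteString(reader.Value);
                    break;

                case XmlNodeType.EndElement:
                    writer.WriteFullEndElement();
                    break;
            }
        }

        reader.Close();
        writer.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        string xml = "<array><element value=\"old\">a</element><element value=\"x\" /></array>";
        Console.WriteLine(ChangeAttribute(xml, "element", "value", "old", "new"));
    }
}
```

Because only one node is held at a time, memory use stays flat regardless of document size, which is the point of preferring this over loading the document into the XML DOM.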

This new class has a Read method that is a simple wrapper around the reader's Read method. In addition, it provides the WriteStartDocument and WriteEndDocument methods, which initialize and finalize the internal reader and writer objects and handle all I/O operations. While looping through the nodes, you can modify them directly. For performance reasons, attribute changes must be declared beforehand with the AddAttributeChange method. All changes to a node's attributes are stored in a temporary table, which is processed and then cleared when the WriteAttributes method is called. The code in Figure 12 demonstrates changing attribute values while reading with the XmlTextReadWriter class. The C# and VB source code for the XmlTextReadWriter class is available for download with this issue of MSDN Magazine (links provided in this article).

Figure 12 Changing Attribute Values

    private void ApplyChanges(string nodeName, string attribName,
        string oldVal, string newVal)
    {
        XmlTextReadWriter rw = new XmlTextReadWriter(InputFileName.Text,
            OutputFileName.Text);
        rw.WriteStartDocument(true, CommentText.Text);

        // Modify the root node manually
        rw.Writer.WriteStartElement(rw.Reader.LocalName);

        // Declare the attribute changes
        // (changes for more nodes can be declared here)
        rw.AddAttributeChange(nodeName, attribName, oldVal, newVal);

        // Loop through the document
        while (rw.Read())
        {
            switch (rw.NodeType)
            {
                case XmlNodeType.Element:
                    rw.Writer.WriteStartElement(rw.Reader.LocalName);
                    if (nodeName == rw.Reader.LocalName)
                        // Modify the attributes
                        rw.WriteAttributes(nodeName);
                    else
                        // Deep copy
                        rw.Writer.WriteAttributes(rw.Reader, false);

                    if (rw.Reader.IsEmptyElement)
                        rw.Writer.WriteEndElement();
                    break;
            }
        }

        // Close the root tag
        rw.Writer.WriteEndElement();

        // Close the document and any internal resources
        rw.WriteEndDocument();
    }

The XmlTextReadWriter class not only reads an XML document but also writes one.
You can use it to read the content of an XML document and, if needed, to perform some basic update operations. Basic update operations here means changing the value of an existing attribute or node, or adding a new attribute or node. For more complex operations, it is best to use the XML DOM parser.

Summary

Readers and writers are the foundation for processing XML data in the .NET Framework.
