Libxml Tutorial
Ishallwin Translation 2005.03.03
John Fleck
Copyright © 2002, 2003 John Fleck
Revision History
Revision 1 June 4, 2002
Initial draft
Revision 2 June 12, 2002
Retrieving attribute value added
Revision 3 Aug. 31, 2002
Freeing memory fix
Revision 4 Nov. 10, 2002
Encoding discussion added
Revision 5 Dec. 15, 2002
More memory freeing changes
Revision 6 Jan. 26, 2003
Add index
Revision 7 April 25, 2003
Add compilation appendix
Revision 8 July 24, 2003
Add XPath example
Revision 9 Feb. 14, 2004
Fix bug in XPath example
Table of Contents
Introduction
Data Types
Parsing the File
Retrieving Element Content
Using XPath to Retrieve Element Content
Writing Element Content
Writing Attribute
Retrieving Attributes
Encoding Conversion
A. Compilation
B. Sample Document
C. Code for Keyword Example
D. Code for XPath Example
E. Code for Add Keyword Example
F. Code for Add Attribute Example
G. Code for Retrieving Attribute Value Example
H. Code for Encoding Conversion Example
I. Acknowledgements
Abstract
Libxml is a freely licensed C language library for handling XML, portable across a large number of platforms. This tutorial provides examples of its basic functions.
Introduction
Libxml is a C language library implementing functions for reading, creating and manipulating XML data. This tutorial provides example code and explanations of its basic functionality.
Libxml and more details about its use are available on the project home page, which includes complete API documentation. This tutorial is not meant to substitute for that complete documentation, but to illustrate the functions needed to use the library to perform basic operations.
The tutorial is based on a simple XML application I use for articles I write. The format includes metadata and the body of the article.
The example code in this tutorial demonstrates how to:
• Parse the document.
• Extract the text within a specified element.
• Add an element and its content.
• Add an attribute.
• Extract the value of an attribute.
Full code for the examples is included in the appendices.
Data Types
Libxml declares a number of data types we will encounter repeatedly, hiding the messy stuff so you do not have to deal with it unless you have some specific need.
xmlChar: A basic replacement for char, a byte in a UTF-8 encoded string. If your data uses another encoding, it must be converted to UTF-8 for use with libxml's functions. More information on encoding is available on the libxml encoding support web page.
xmlDoc: A structure containing the tree created by a parsed document. xmlDocPtr is a pointer to the structure.
xmlNodePtr and xmlNode: A container for a single node. xmlNodePtr is a pointer to the structure, and is used in traversing the document tree.
Parsing the File
Parsing the file requires only the name of the file and a single function call, plus error checking. Full code: Appendix C, Code for Keyword Example.
    xmlDocPtr doc;                                  /* 1 */
    xmlNodePtr cur;                                 /* 2 */
    doc = xmlParseFile(docname);                    /* 3 */
    if (doc == NULL) {                              /* 4 */
        fprintf(stderr, "Document not parsed successfully.\n");
        return;
    }
    cur = xmlDocGetRootElement(doc);                /* 5 */
    if (cur == NULL) {                              /* 6 */
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {   /* 7 */
        fprintf(stderr, "document of the wrong type, root node != story\n");
        xmlFreeDoc(doc);
        return;
    }
1 Declare the pointer that will point to your parsed document.
2 Declare a node pointer (you will need it to move between individual nodes).
3 Parse the file named by docname.
4 Check that the document was parsed successfully. If it was not, libxml will at this point register an error and stop.
Note
A common source of error at this point is improper handling of encoding. The XML standard requires documents stored in an encoding other than UTF-8 or UTF-16 to contain an explicit declaration of their encoding. If the declaration is there, libxml will automatically perform the necessary conversion to UTF-8 for you. More information on XML's encoding requirements is contained in the XML standard.
5 Retrieve the document's root element.
6 Check to make sure the document actually contains something.
7 In our case, we need to make sure the document is of the right type. "story" is the root element of the documents used in this tutorial.
Retrieving Element Content
Retrieving the content of an element involves traversing the document tree until you find what you are looking for. In this case, we are looking for "keyword" elements contained within the "storyinfo" element, which is a child of "story". Finding the node we are interested in involves tediously walking the tree. We assume you already have an xmlDocPtr called doc and an xmlNodePtr called cur.
    cur = cur->xmlChildrenNode;                     /* 1 */
    while (cur != NULL) {                           /* 2 */
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "storyinfo"))) {
            parseStory(doc, cur);
        }
        cur = cur->next;
    }
1 Get the first child node of cur. At this point, cur points at the document root, which is the element "story".
2 This loop iterates through the children of "story", looking for one called "storyinfo". That is the element that contains the "keyword" elements we are looking for. It uses the libxml string comparison function, xmlStrcmp. If there is a match, it calls the function parseStory.
void
parseStory(xmlDocPtr doc, xmlNodePtr cur) {
    xmlChar *key;
    cur = cur->xmlChildrenNode;                     /* 1 */
    while (cur != NULL) {                           /* 2 */
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "keyword"))) {
            key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);   /* 3 */
            printf("keyword: %s\n", key);
            xmlFree(key);
        }
        cur = cur->next;
    }
    return;
}
1 Again we get the first child node.
2 Like the loop above, we then iterate through the nodes, looking for one that matches the element we are interested in, in this case "keyword".
3 When we find the "keyword" element, we need to print its contents. Remember that in XML, the text contained within an element is a child node of that element, so we turn to cur->xmlChildrenNode. To retrieve the text, we use the function xmlNodeListGetString, which also takes the doc pointer as an argument. In this case, we just print it out.
Note
Because xmlNodeListGetString allocates memory for the string it returns, you must use xmlFree to free it.
Using XPath to Retrieve Element Content
In addition to walking the document tree to find an element, libxml2 includes support for using XPath expressions to retrieve sets of nodes that match specified criteria. Full documentation of the XPath API is part of the libxml2 API documentation. XPath allows searching through a document for nodes that match specified criteria. In the example below we search through a document for the contents of all "keyword" elements.
Note
A full discussion of XPath is beyond the scope of this document. For details on its use, see the XPath specification.
Full code for this example is in Appendix D, Code for XPath Example.
Using XPath requires setting up an xmlXPathContext and then supplying the XPath expression and the context to the xmlXPathEvalExpression function. The function returns an xmlXPathObjectPtr, which includes the set of nodes satisfying the XPath expression.
xmlXPathObjectPtr
getnodeset(xmlDocPtr doc, xmlChar *xpath) {
    xmlXPathContextPtr context;                     /* 1 */
    xmlXPathObjectPtr result;
    context = xmlXPathNewContext(doc);              /* 2 */
    if (context == NULL) {
        return NULL;
    }
    result = xmlXPathEvalExpression(xpath, context);   /* 3 */
    xmlXPathFreeContext(context);
    if (result == NULL) {
        return NULL;
    }
    if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {  /* 4 */
        xmlXPathFreeObject(result);
        printf("No result\n");
        return NULL;
    }
    return result;
}
1 First we declare our variables.
2 Initialize the context variable.
3 Apply the XPath expression.
4 Check the result, freeing it if the node set is empty.
The xmlXPathObjectPtr returned by the function contains a set of nodes and other information needed to iterate through the set and act on the results. For this example, our function returns the xmlXPathObjectPtr, and we use it to print the contents of the keyword nodes in our document. The node set object includes the number of elements in the set (nodeNr) and an array of nodes (nodeTab):
    for (i = 0; i < nodeset->nodeNr; i++) {         /* 1 */
        keyword = xmlNodeListGetString(doc,
            nodeset->nodeTab[i]->xmlChildrenNode, 1);   /* 2 */
        printf("keyword: %s\n", keyword);
        xmlFree(keyword);
    }
1 The value of nodeset->nodeNr holds the number of elements in the node set. Here we use it to iterate through the array.
2 Here we print the contents of each of the nodes returned.
Note
Note that we are printing the child node of the node that is returned, because the contents of the keyword element are a child text node.
Writing Element Content
Writing element content uses many of the same steps we used above: parsing the document and walking the tree. We parse the document, then traverse the tree to find the place we want to insert our element. For this example, we again find the "storyinfo" element and this time insert a keyword. Then we write the file to disk. Full code: Appendix E, Code for Add Keyword Example.
The main difference in this example is in parseStory:
void
parseStory(xmlDocPtr doc, xmlNodePtr cur, char *keyword) {
    xmlNewTextChild(cur, NULL, (const xmlChar *) "keyword", (const xmlChar *) keyword);   /* 1 */
    return;
}
1 The xmlNewTextChild function adds a new child element at the current node pointer's location in the tree, specified by cur.
Once the node has been added, we would like to write the document to file. If you want the element to have a namespace, you can add it here as well. In our case, the namespace is NULL.
    xmlSaveFormatFile(docname, doc, 1);
The first parameter is the name of the file to be written. You will notice it is the same as the file we just read. In this case, we just write over the old file. The second parameter is a pointer to the xmlDoc structure. Setting the third parameter equal to 1 ensures indenting on output.
Writing Attribute
Writing an attribute is similar to writing text to a new element. In this case, we will add a reference URI to our document. Full code: Appendix F, Code for Add Attribute Example.
A reference is a child of the story element, so finding the place to put our new element and attribute is simple. As soon as we have done the error checking in parseDoc, we are in the right spot to add our element. But before we do that, we need to make a declaration using a data type we have not seen yet:
    xmlAttrPtr newattr;
We also need an extra xmlNodePtr:
    xmlNodePtr newnode;
The rest of parseDoc is the same as before until we check whether the root element is story. If it is, then we know we are at the right spot to add our element:
    newnode = xmlNewTextChild(cur, NULL, (const xmlChar *) "reference", NULL);   /* 1 */
    newattr = xmlNewProp(newnode, (const xmlChar *) "uri", (const xmlChar *) uri);   /* 2 */
1 First we add a new node at the location of the current node pointer, cur, using the xmlNewTextChild function.
2 Then we attach the "uri" attribute to the new node with xmlNewProp.
Once the node is added, we write the file to disk just as we did in the previous example, in which we added an element with text content.
Retrieving Attributes
Retrieving the value of an attribute is similar to the previous example in which we retrieved a node's text contents. In this case we will extract the value of the URI we added in the previous section. Full code: Appendix G, Code for Retrieving Attribute Value Example.
The initial steps for this example are similar to the previous ones: parse the document, find the element you are interested in, then enter a function to carry out the specific task required. In this case, we call getReference:
void
getReference(xmlDocPtr doc, xmlNodePtr cur) {
    xmlChar *uri;
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "reference"))) {
            uri = xmlGetProp(cur, (const xmlChar *) "uri");   /* 1 */
            printf("uri: %s\n", uri);
            xmlFree(uri);
        }
        cur = cur->next;
    }
    return;
}
1 The key function is xmlGetProp, which returns an xmlChar string containing the attribute's value. In this case, we just print it out.
Note
If you are using a DTD that declares a fixed or default value for the attribute, this function will retrieve it.
Encoding Conversion
Data encoding compatibility problems are one of the most common difficulties encountered by programmers new to XML in general and libxml in particular. Thinking through the design of your application in light of this issue will help you avoid difficulties later. Internally, libxml stores and manipulates data in the UTF-8 format.
Data used by your program in other formats, such as the commonly used ISO-8859-1 encoding, must be converted to UTF-8 before being passed to libxml functions. If you want your program's output in an encoding other than UTF-8, you must also convert it.
Libxml uses iconv, if it is available, to convert data. Without iconv, only UTF-8, UTF-16 and ISO-8859-1 can be used as external formats. With iconv, any format can be used provided iconv is able to convert it to and from UTF-8. Currently iconv supports about 150 different character formats, with the ability to convert from any one to any other. The actual number of supported formats varies between implementations, but every iconv implementation supports the commonly used ones.
Warning
A common mistake is to use different formats for the internal data in different parts of one's code. The most common case is an application that assumes ISO-8859-1 to be the internal data format, combined with libxml, which assumes UTF-8 to be the internal data format. The result is an application that treats internal data differently depending on which code section is executing. One part of the code or the other will then misinterpret the data.
This example constructs a simple document, then adds content provided on the command line to the document's root element and outputs the result to stdout in the proper encoding. For this example, we use ISO-8859-1 encoding. The string given on the command line is converted from ISO-8859-1 to UTF-8. Full code: Appendix H, Code for Encoding Conversion Example.
The conversion, encapsulated in the example code in the convert function, uses libxml's xmlFindCharEncodingHandler function:
    xmlCharEncodingHandlerPtr handler;              /* 1 */
    size = (int) strlen(in) + 1;                    /* 2 */
    out_size = size * 2 - 1;
    out = malloc((size_t) out_size);
    ...
    handler = xmlFindCharEncodingHandler(encoding); /* 3 */
    ...
    handler->input(out, &out_size, in, &temp);      /* 4 */
    ...
    xmlSaveFormatFileEnc("-", doc, encoding, 1);    /* 5 */
1 handler is declared as a pointer to an xmlCharEncodingHandler.
2 The conversion function needs to be given the sizes of the input and output strings, which are calculated here for the strings in and out.
3 xmlFindCharEncodingHandler takes the data's initial encoding as its argument and searches libxml's built-in set of conversion handlers, returning a pointer to the handler, or NULL if none is found.
4 The conversion function identified by handler requires as its arguments pointers to the input and output strings, along with the length of each. The lengths must be determined separately by the application.
5 To output in a specified encoding rather than UTF-8, we use xmlSaveFormatFileEnc, specifying the encoding.
A. Compilation
Libxml includes a script, xml2-config, that can be used to generate flags for compiling and linking programs written with the library. For pre-processor and compiler flags, use xml2-config --cflags. For library linking flags, use xml2-config --libs. Other options are available using xml2-config --help.
B. Sample Document
<?xml version="1.0"?>
<story>
  <storyinfo>
    <keyword>...</keyword>
  </storyinfo>
  <body>
    ...
  </body>
</story>
C. Code for Keyword Example
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>
/* Print the text of every keyword child of the storyinfo node cur. */
void
parseStory(xmlDocPtr doc, xmlNodePtr cur) {
    xmlChar *key;
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "keyword"))) {
            key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
            printf("keyword: %s\n", key);
            xmlFree(key);
        }
        cur = cur->next;
    }
    return;
}
/* Parse docname, check that the root is story, and process each storyinfo. */
static void
parseDoc(char *docname) {
    xmlDocPtr doc;
    xmlNodePtr cur;
    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return;
    }
    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {
        fprintf(stderr, "document of the wrong type, root node != story\n");
        xmlFreeDoc(doc);
        return;
    }
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "storyinfo"))) {
            parseStory(doc, cur);
        }
        cur = cur->next;
    }
    xmlFreeDoc(doc);
    return;
}
int
main(int argc, char **argv) {
    char *docname;
    if (argc <= 1) {
        printf("Usage: %s docname\n", argv[0]);
        return (1);
    }
    docname = argv[1];
    parseDoc(docname);
    return (0);
}
D. Code for XPath Example
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/xpath.h>
/* Parse docname; returns NULL after reporting the error if parsing fails. */
xmlDocPtr
getdoc(char *docname) {
    xmlDocPtr doc;
    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return NULL;
    }
    return doc;
}
/* Evaluate xpath against doc; returns NULL on error or empty node set. */
xmlXPathObjectPtr
getnodeset(xmlDocPtr doc, xmlChar *xpath) {
    xmlXPathContextPtr context;
    xmlXPathObjectPtr result;
    context = xmlXPathNewContext(doc);
    if (context == NULL) {
        return NULL;
    }
    result = xmlXPathEvalExpression(xpath, context);
    xmlXPathFreeContext(context);
    if (result == NULL) {
        return NULL;
    }
    if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {
        xmlXPathFreeObject(result);
        printf("No result\n");
        return NULL;
    }
    return result;
}
int
main(int argc, char **argv) {
    char *docname;
    xmlDocPtr doc;
    xmlChar *xpath = (xmlChar *) "//keyword";
    xmlNodeSetPtr nodeset;
    xmlXPathObjectPtr result;
    int i;
    xmlChar *keyword;
    if (argc <= 1) {
        printf("Usage: %s docname\n", argv[0]);
        return (1);
    }
    docname = argv[1];
    doc = getdoc(docname);
    if (doc == NULL) {
        return (1);
    }
    result = getnodeset(doc, xpath);
    if (result) {
        nodeset = result->nodesetval;
        for (i = 0; i < nodeset->nodeNr; i++) {
            keyword = xmlNodeListGetString(doc, nodeset->nodeTab[i]->xmlChildrenNode, 1);
            printf("keyword: %s\n", keyword);
            xmlFree(keyword);
        }
        xmlXPathFreeObject(result);
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return (0);
}
E. Code for Add Keyword Example
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>
/* Add a keyword element with the given text as a child of cur. */
void
parseStory(xmlDocPtr doc, xmlNodePtr cur, char *keyword) {
    xmlNewTextChild(cur, NULL, (const xmlChar *) "keyword", (const xmlChar *) keyword);
    return;
}
/* Parse docname and add keyword to each storyinfo; returns the document. */
xmlDocPtr
parseDoc(char *docname, char *keyword) {
    xmlDocPtr doc;
    xmlNodePtr cur;
    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return (NULL);
    }
    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return (NULL);
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {
        fprintf(stderr, "document of the wrong type, root node != story\n");
        xmlFreeDoc(doc);
        return (NULL);
    }
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "storyinfo"))) {
            parseStory(doc, cur, keyword);
        }
        cur = cur->next;
    }
    return (doc);
}
int
main(int argc, char **argv) {
    char *docname;
    char *keyword;
    xmlDocPtr doc;
    if (argc <= 2) {
        printf("Usage: %s docname, keyword\n", argv[0]);
        return (1);
    }
    docname = argv[1];
    keyword = argv[2];
    doc = parseDoc(docname, keyword);
    if (doc != NULL) {
        xmlSaveFormatFile(docname, doc, 0);
        xmlFreeDoc(doc);
    }
    return (0);
}
F. Code for Add Attribute Example
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>
/* Parse docname and add a reference element with a uri attribute to the root. */
xmlDocPtr
parseDoc(char *docname, char *uri) {
    xmlDocPtr doc;
    xmlNodePtr cur;
    xmlNodePtr newnode;
    xmlAttrPtr newattr;
    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return (NULL);
    }
    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return (NULL);
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {
        fprintf(stderr, "document of the wrong type, root node != story\n");
        xmlFreeDoc(doc);
        return (NULL);
    }
    newnode = xmlNewTextChild(cur, NULL, (const xmlChar *) "reference", NULL);
    newattr = xmlNewProp(newnode, (const xmlChar *) "uri", (const xmlChar *) uri);
    return (doc);
}
int
main(int argc, char **argv) {
    char *docname;
    char *uri;
    xmlDocPtr doc;
    if (argc <= 2) {
        printf("Usage: %s docname, uri\n", argv[0]);
        return (1);
    }
    docname = argv[1];
    uri = argv[2];
    doc = parseDoc(docname, uri);
    if (doc != NULL) {
        xmlSaveFormatFile(docname, doc, 1);
        xmlFreeDoc(doc);
    }
    return (0);
}
G. Code for Retrieving Attribute Value Example
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>
/* Print the uri attribute of every reference child of cur. */
void
getReference(xmlDocPtr doc, xmlNodePtr cur) {
    xmlChar *uri;
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "reference"))) {
            uri = xmlGetProp(cur, (const xmlChar *) "uri");
            printf("uri: %s\n", uri);
            xmlFree(uri);
        }
        cur = cur->next;
    }
    return;
}
/* Parse docname, check that the root is story, and print its references. */
void
parseDoc(char *docname) {
    xmlDocPtr doc;
    xmlNodePtr cur;
    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return;
    }
    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }
    if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {
        fprintf(stderr, "document of the wrong type, root node != story\n");
        xmlFreeDoc(doc);
        return;
    }
    getReference(doc, cur);
    xmlFreeDoc(doc);
    return;
}
int
main(int argc, char **argv) {
    char *docname;
    if (argc <= 1) {
        printf("Usage: %s docname\n", argv[0]);
        return (1);
    }
    docname = argv[1];
    parseDoc(docname);
    return (0);
}
H. Code for Encoding Conversion Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
/* Convert in from the named encoding to UTF-8; returns NULL on failure. */
unsigned char *
convert(unsigned char *in, char *encoding)
{
    unsigned char *out;
    int ret, size, out_size, temp;
    xmlCharEncodingHandlerPtr handler;
    size = (int) strlen((const char *) in) + 1;
    out_size = size * 2 - 1;
    out = malloc((size_t) out_size);
    if (out) {
        handler = xmlFindCharEncodingHandler(encoding);
        if (!handler) {
            free(out);
            out = NULL;
        }
    }
    if (out) {
        temp = size - 1;
        ret = handler->input(out, &out_size, in, &temp);
        /* fail on a negative return or if not all input was consumed */
        if ((ret < 0) || (temp - size + 1)) {
            if (ret < 0) {
                printf("conversion wasn't successful.\n");
            } else {
                printf("conversion wasn't successful. converted: %i octets.\n", temp);
            }
            free(out);
            out = NULL;
        } else {
            out = realloc(out, out_size + 1);
            out[out_size] = 0; /* null terminating out */
        }
    } else {
        printf("no mem\n");
    }
    return (out);
}
int
main(int argc, char **argv) {
    unsigned char *content, *out;
    xmlDocPtr doc;
    xmlNodePtr rootnode;
    char *encoding = "ISO-8859-1";
    if (argc <= 1) {
        printf("Usage: %s content\n", argv[0]);
        return (1);
    }
    content = (unsigned char *) argv[1];
    out = convert(content, encoding);
    doc = xmlNewDoc((const xmlChar *) "1.0");
    rootnode = xmlNewDocNode(doc, NULL, (const xmlChar *) "root", out);
    xmlDocSetRootElement(doc, rootnode);
    xmlSaveFormatFileEnc("-", doc, encoding, 1);
    xmlFreeDoc(doc);
    free(out);
    return (0);
}
I. Acknowledgements
A number of people have generously offered feedback, code and suggested improvements to this tutorial. In no particular order: Daniel Veillard, Marcus Labib Iskander, Christopher R. Harris, Igor Zlatkovic, Niraj Tolia, David Turover
Index
A
attribute
    retrieving value
    writing
C
compiler flags
E
element
    retrieving content
    writing content
encoding
F
file
    parsing
    saving
X
xmlChar
xmlDoc
xmlNodePtr