The XPath language consists mainly of two parts: a general expression language and path search expressions.
1. The expression language is built from operators such as +, -, *, /, or, and, not, together with values, strings, functions, and so on, and it follows a formal grammar.
2. A path search expression identifies XML nodes in an XML document. For example, /books/book selects all book nodes under the books node.
For details, refer to the description of XPath at www.w3c.org.
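To try out the /books/book example, here is a small sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The sample document and its title values are made up for illustration. ElementTree evaluates paths relative to an element, so the sketch takes the books root element and uses ./book as the equivalent of /books/book.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML_TEXT = """
<books>
  <book title="XML Basics"/>
  <book title="XPath Notes"/>
  <magazine title="Weekly"/>
</books>
"""

# fromstring() returns the root element, here <books>.
root = ET.fromstring(XML_TEXT)

# "./book" selects the direct <book> children of <books>,
# which corresponds to the absolute path /books/book.
book_nodes = root.findall("./book")

# Collect the title attribute of each matched node.
titles = [node.get("title") for node in book_nodes]
print(titles)
```

The magazine element is not selected because the path names book explicitly.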
A complete XPath parser rests on two things: parsing the expression language and performing path search. Since only the path search part has been implemented so far, this article covers only the implementation of path search.
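As a first, concrete look at path search, the following minimal sketch shows how a hand-written, state-machine-style scanner might split a path expression into tokens. It is written in Python for brevity and is not the implementation discussed in this article. The function name `tokenize`, the token kinds SEP, NAME, ATTR, OP and STRING, and the handling of "@", "=" and quoted strings are all assumptions made for illustration.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds are hypothetical names


def _is_name_start(ch: str) -> bool:
    # str.isalpha() is Unicode-aware, so non-ASCII names are accepted.
    return ch.isalpha() or ch == "_"


def _is_name_char(ch: str) -> bool:
    return ch.isalnum() or ch in "_-."


def tokenize(expr: str) -> List[Token]:
    """Split a simple XPath path expression into (kind, text) tokens."""
    tokens: List[Token] = []
    i, n = 0, len(expr)
    while i < n:
        ch = expr[i]
        if ch.isspace():
            i += 1  # whitespace between tokens is skipped
        elif expr.startswith("//", i):
            tokens.append(("SEP", "//"))  # check "//" before "/"
            i += 2
        elif ch in "/[]:":
            tokens.append(("SEP", ch))
            i += 1
        elif ch == "=":
            tokens.append(("OP", ch))
            i += 1
        elif ch in ('"', "'"):
            end = expr.find(ch, i + 1)  # look for the matching quote
            if end == -1:
                raise ValueError(f"unterminated string at offset {i}")
            tokens.append(("STRING", expr[i + 1:end]))
            i = end + 1
        elif ch == "@" or _is_name_start(ch):
            kind = "ATTR" if ch == "@" else "NAME"
            start = i + 1 if ch == "@" else i
            j = start
            while j < n and _is_name_char(expr[j]):
                j += 1
            if j == start:
                raise ValueError(f"missing attribute name at offset {i}")
            tokens.append((kind, expr[start:j]))
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens


for token in tokenize('/books/book[@title="123*"]'):
    print(token)
```

A parser built on top of this would read the tokens one at a time and check that separators, names, and predicates appear in a valid order.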
An XPath path expression has three main kinds of components:
1. Separators: "//", "/", "[", "]", ":"
2. Node names: for example, in the path /books/book, books and book are node names.
3. Attribute names: for example, in /books/book[@title="123*"], title is an attribute name.

XPath path expressions also follow a formal grammar, and there are three ways to implement their lexical and syntax analysis: a state machine, regular expression matching, and lex/yacc. A state machine is very convenient to write, but the resulting code is not easy to extend. Regular expression matching is easy to implement and easy to extend, but its weakness is performance. lex/yacc is easy to implement and extend, but it does not support Unicode well, and since XPath is an integral part of XML, Unicode support is a basic requirement. Because the XPath syntax is already relatively stable, I chose a state machine for the lexical and syntax analysis.

The simple architecture is as follows:
XPathToken: extracts one token at a time from the input XPath expression. A token may be a separator, a node name, or an attribute name.
XPathParser: reads each token using XPathToken, assembles the tokens into a meaningful XPath syntax structure, and checks the syntax at the same time.
XPathDocument: searches an XML document for the corresponding nodes, based on the parsed XPath syntax.

XPathDocument is the difficult part of XPath path search, because it has to traverse the XML nodes according to the XPath expression. For this reason, the XML parser should ideally implement the full functionality of XML DOM Level 2. As an example, a simple XML document structure is as follows: