Building a Mini Search Engine with PHP & XML (1)
I. Getting to know XML

XML may be unfamiliar to many readers. I do not intend to explain systematically what XML is; I only introduce the concepts this article uses. If you have used XML before, even as a beginner, you can skip this chapter.

To talk about XML, let me start with a piece of familiar HTML code:

    <html>
    <head>
    <title>Page title</title>
    </head>
    <body>
    <p><center><font>text</font></center></p>
    </body>
    </html>

Structurally, the code above complies with the rules of XML. It has the following characteristics:

1. Every reference to the same element uses consistent case; for example, <center>...</Center> is not allowed.
2. Every attribute value (such as href="????") is enclosed in quotation marks.
3. Every element is closed, either with a matching end tag or, for an empty element, in the form <img ... />. Pay attention to the closing "/>": leaving out the slash makes the code invalid.
4. All elements are properly nested, just like loops in a program, and all elements are nested inside the root element. Here, all of the content is nested inside <html>...</html>.
5. Element names (body, a, p, img and so on) should begin with a letter. In practice it is best to use an English word, and please pay attention to case.

So XML is not too troublesome after all. You can think of it as a well-structured tree that contains data.

Now let us look at the XML used in our program:

    <links>
      <sub>Network Mania
        <web url="..." memo="...">Building a Mini Search Engine</web>
        <sub>Name2
          <sub>Programming languages
            <web url="..." memo="...">Name3</web>
            <web url="..." memo="...">php</web>
          </sub>
        </sub>
      </sub>
    </links>

Its structure is quite simple. The root element is links, a sub represents a category, and a web holds the information for one website. The web element carries attributes: url is the link to the website and memo is a note. The text between <sub> and </sub> is the name of the category, and the text between <web> and </web> is the name of the website. Note that the file meets the requirements listed above.

Add <?xml version="1.0" encoding="GB2312"?> as the first line, save the file as xyz.xml, and open it in IE5 or later. The tree structure is clear at a glance.

So why does our mini search engine use XML? The first reason is that I cannot use MySQL on Ozen (to my great regret). Second, a small search engine holds very little data, and putting it in a database is not necessarily more efficient. Most importantly, XML is very easy to maintain: it saves effort, and there is no need to write a dedicated maintenance program. For example, to add a category or a page we only edit the text file and add a <web>???</web> or a <sub>????</sub>. To move a category into another one, we select that sub and use Ctrl-X, Ctrl-V.

In fact I have only scratched the surface of what XML can do. I will bring you a more in-depth article later.

II. How PHP parses XML

Note: this chapter is quoted from the NetEase Virtual Community (I am too lazy to type it out myself).

There are two basic types of XML parser:

Tree-based parser: converts the XML document into a tree structure. This kind of parser analyzes the entire document and provides an API for accessing each element of the resulting tree. Its common standard is DOM (Document Object Model). If you have used JavaScript, you may have worked with it through XMLDOM.

Event-based parser: treats the XML document as a series of events. When a particular event occurs, the parser calls a function supplied by the developer. An event-based parser has a data-centric view of the XML document; that is, it focuses on the data in the document, not on its structure. These parsers process the document from start to finish and report events such as the start of an element, the end of an element, and the start of character data to the application through callback functions.

Here is a "Hello-World" XML sample:

    <greeting>Hello World</greeting>

An event-based parser reports it as three events:

Start element: greeting
CDATA item, with the value: Hello World
End element: greeting

Unlike a tree-based parser, an event-based parser does not build a structure that describes the document.
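The three events reported for the Hello-World sample can be observed with a short sketch. It assumes PHP 7 or later with the standard xml extension enabled; the helper name collect_events() and the wording of the event labels are made up for illustration and are not part of the original article.

```php
<?php
// Hypothetical helper (not from the article): parse an XML string with
// PHP's event-based parser and return the events in the order reported.
function collect_events(string $xml): array
{
    $events = [];
    $parser = xml_parser_create();

    // Called at the start and at the end of every element.
    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$events) {
            $events[] = "start element: $name";
        },
        function ($parser, string $name) use (&$events) {
            $events[] = "end element: $name";
        }
    );

    // Called for character data between tags (possibly in several pieces).
    xml_set_character_data_handler(
        $parser,
        function ($parser, string $data) use (&$events) {
            $events[] = "CDATA: $data";
        }
    );

    // The third argument marks this chunk as the final piece of the document.
    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $events;
}

foreach (collect_events('<greeting>Hello World</greeting>') as $event) {
    echo $event, "\n";
}
```

The handlers see the events one at a time and in document order; no tree is kept anywhere. Note that PHP's parser has case folding enabled by default, so the element name arrives in the handlers as GREETING.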
While handling the CDATA item, an event-based parser does not let you obtain information about the parent element greeting. In exchange it offers lower-level access, which makes better use of resources and is faster. There is no need to load the entire document into memory; in fact, the document can even be larger than the available memory.

The function that creates an XML parser instance is xml_parser_create(). The instance it returns is passed to all the other functions. The idea is very similar to the connection handle used by the MySQL functions in PHP.

Event-based parsers usually require you to register callback functions before parsing the document; they are called when specific events occur. Expat, the parser behind PHP's XML functions, is no exception. It defines the following seven possible events:

    Object                       XML parsing function                       Description
    element                      xml_set_element_handler()                  start and end of an element
    character data               xml_set_character_data_handler()           start of character data
    external entity              xml_set_external_entity_ref_handler()      an external entity occurs
    unparsed external entity     xml_set_unparsed_entity_decl_handler()     an unparsed external entity occurs
    processing instruction       xml_set_processing_instruction_handler()   a processing instruction occurs
    notation declaration         xml_set_notation_decl_handler()            a notation declaration occurs
    default                      xml_set_default_handler()                  any other event with no handler assigned

Every callback function must take the parser instance as its first parameter (followed by other parameters). See the PHP manual for more detail.

The following example is the XML Element Structure example taken from the PHP manual. It is the basic structure of our search engine, but I will not comment on it here, because we will go through it in the next chapter.