New XML Features in PHP5

zhaozj2021-02-16  86

Author: Christian Stocker. Translation: Ice_Berg16 (Dream Scarecrow).

Intended audience

This article is intended for PHP developers of all levels who are interested in the XML features of PHP5. We assume that the reader already knows the basics of XML. If you have already used XML in PHP, you will still find this article useful.

Introduction

In today's Internet world, XML is no longer a buzzword; it has been widely accepted and standardized. PHP5 therefore pays much more attention to XML support than PHP4 did. In PHP4 you were faced with non-standard behavior almost everywhere, API breaks, memory leaks, and incomplete features. Although some of these shortcomings were improved in PHP 4.3, the developers decided to abandon the original code and rewrite all of it for PHP5.

This article will introduce all the exciting new features of XML in PHP5.

XML in PHP4

Early versions of PHP already supported XML, but only through a SAX-based interface, which made it easy to parse any XML document. XML support improved when the DOMXML extension was added in PHP4, and XSLT was added later as a complement. Over the course of PHP4, other features such as HTML support, XSLT, and DTD validation were also added to the DOMXML extension. Unfortunately, because the XSLT and DOMXML extensions always remained experimental and their APIs changed more than once, they were never installed by default. In addition, the DOMXML extension did not follow the DOM standard defined by the W3C and used its own naming scheme. Although this was improved in PHP 4.3, and many memory leaks and other problems were fixed, the extension never reached a stable state, and some deeper problems were almost impossible to fix. Only the SAX extension was installed by default; the other extensions were never widely deployed.

For all these reasons, the PHP XML developers decided to rewrite all of the code for PHP5 and to follow the commonly used standards.

XML in PHP5

All XML support has been completely rewritten in PHP5. All XML extensions are now based on the libxml2 library from the GNOME project. This allows the different extensions to interoperate, and the core developers only need to work with one underlying library. For example, an improvement to the complex memory management has to be made only once to benefit all XML-related extensions.

In addition to the well-known SAX parser inherited from PHP4, PHP5 also supports DOM according to the W3C standard and XSLT based on the libxslt engine. PHP's own SimpleXML extension and a standards-compliant SOAP extension have been added as well. Because XML is becoming more and more important, the PHP developers decided to enable more XML support by default. This means that you can now use SAX, DOM, and SimpleXML out of the box, and that these extensions will be installed on more servers. Support for XSLT and SOAP, however, still has to be explicitly configured when PHP is compiled.

Stream support

All XML extensions now support PHP streams, even when you do not access a stream directly from PHP. For example, in PHP5 you can access a stream from a file or from a directive. Basically, you can use a PHP stream anywhere you can use a normal file.

Streams were introduced in PHP 4.3 and have been further improved in PHP5. They generalize file access, network access, and other operations so that these share a common set of functions. You can even implement your own streams in PHP code, which makes data access very simple. For more details, please refer to the PHP documentation.

SAX

SAX stands for Simple API for XML. It is a callback-based interface for parsing XML documents. SAX has been supported since PHP3 and has not changed much since then. In PHP5 the API is unchanged, so your existing code should still run. The only difference is that it is no longer based on the expat library but on libxml2. This change caused some problems with namespace support, which have been resolved in libxml2 2.6 but not in earlier versions of libxml2. If you use xml_parser_create_ns(), you are therefore strongly advised to install libxml2 2.6 on your system.
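The callback style described above can be sketched roughly as follows. This is only an illustration: the inline document and its titles are placeholders, and the closures assume PHP 5.3 or later (older code would pass function names instead).

```php
<?php
// Placeholder document; the article's own articles.xml is assumed to have
// the same articles/item/title structure.
$xml = '<articles>'
     . '<item><title>First article</title></item>'
     . '<item><title>Second article</title></item>'
     . '</articles>';

$titles = array();
$inTitle = false;

$parser = xml_parser_create();
// Keep element names as written instead of folding them to upper case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every start tag.
    function ($parser, $name, $attrs) use (&$inTitle, &$titles) {
        if ($name === 'title') {
            $inTitle = true;
            $titles[] = '';
        }
    },
    // Called for every end tag.
    function ($parser, $name) use (&$inTitle) {
        if ($name === 'title') {
            $inTitle = false;
        }
    }
);

// Character data may arrive in several chunks, so it is appended.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$inTitle, &$titles) {
        if ($inTitle) {
            $titles[count($titles) - 1] .= $data;
        }
    }
);

if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)) . "\n");
}

print implode("\n", $titles) . "\n";
```

The parser never builds a tree; the script itself has to remember that it is inside a title element, which is the main trade-off of the SAX approach.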

DOM

The DOM (Document Object Model) is a set of standards developed by the W3C. In PHP4 the DOMXML extension was used for this. Its biggest problem was that it did not follow the standard's method names, and for a long time it also suffered from memory leaks (fixed in PHP 4.3).

The new DOM extension is based on the W3C standard, including its method and property names. If you are familiar with the DOM in another language, for example JavaScript, it will be very easy to write similar code in PHP. You do not have to consult the documentation every time, because the methods and parameters are the same.

Because of the switch to the W3C standard, DOMXML-based code will no longer run; the API in PHP is quite different. But if your code already used method names similar to the W3C standard, porting is not very difficult. You only need to change the load and save functions and remove the underscores from the function names (the DOM standard capitalizes the first letter of each inner word instead). Other adjustments will of course be necessary, but the main logic can remain unchanged.

Reading the DOM

I will not explain all the features of the DOM extension in this article; that is not necessary. You may want to bookmark the documentation at http://www.w3.org/dom, which is basically the same as the DOM part of PHP5.

Most examples in this article use the same XML file, a very simplified version of the RSS feed on Zend.com. Paste the following text into a text file and save it as articles.xml.

> http://www.zend.com/ze/week/week172.php

> http://www.zend.com/zed/tut/tut-hatwar3.php
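A minimal stand-in for articles.xml could be generated like this. It is a sketch under assumptions: the element names articles, item, and title are taken from the queries used later in the article, while the link element name and the title texts are placeholders.

```php
<?php
// Hypothetical stand-in for articles.xml. Only the articles/item/title
// structure is taken from the article; "link" and the titles are assumed.
$xml = <<<XML
<?xml version="1.0"?>
<articles>
  <item>
    <title>First article</title>
    <link>http://www.zend.com/ze/week/week172.php</link>
  </item>
  <item>
    <title>Second article</title>
    <link>http://www.zend.com/zed/tut/tut-hatwar3.php</link>
  </item>
</articles>
XML;

// file_put_contents() returns the number of bytes written.
$bytes = file_put_contents('articles.xml', $xml);
print "Wrote $bytes bytes to articles.xml\n";
```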

To load this example into a DOM object, first create a DOMDocument object, then load the XML file.

$dom = new DOMDocument();
$dom->load("articles.xml");

As mentioned above, you can also load an XML document through a PHP stream. In that case you write:

$dom->load("file:///articles.xml");

(or any other type of stream)

If you want to output the XML document to the browser or as plain markup, use:

print $dom->saveXML();

If you want to save it to a file, use:

print $dom->save("newfile.xml");

(Note that save() returns the number of bytes written, so this prints the file size to stdout.)
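The load and save calls can be tried end to end with a sketch like the following. Temporary files stand in for articles.xml and newfile.xml, the document content is a placeholder, and the file:// URL is built on the assumption of a Unix-like absolute path.

```php
<?php
$xml = '<articles><item><title>First article</title></item></articles>';

// Temporary files stand in for articles.xml and newfile.xml.
$source = tempnam(sys_get_temp_dir(), 'src');
$target = tempnam(sys_get_temp_dir(), 'dst');
file_put_contents($source, $xml);

$dom = new DOMDocument();
// Load through a stream URL instead of a bare path.
$loaded = $dom->load('file://' . $source);

$serialized = $dom->saveXML();   // the whole document as a string
$bytes = $dom->save($target);    // bytes written, or false on failure

print $serialized;
print "save() wrote $bytes bytes\n";

// Read the saved file back, then clean up the temporary files.
$written = file_get_contents($target);
unlink($source);
unlink($target);
```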

Of course, this example does not do much, so let us do something more useful and get all the title elements. There are several ways to do this; the easiest is getElementsByTagName($tagname):

$titles = $dom->getElementsByTagName("title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}

The textContent property comes from DOM Level 3; it gives quick and convenient access to all the text nodes of an element. The older W3C way of reading the text is:

$node->firstChild->data;

(In that case you have to make sure that firstChild is the text node you need; otherwise you have to traverse all the child nodes to find it.)

Another thing to note is that getElementsByTagName() returns a DOMNodeList object, not an array as in PHP4. As you can see in this example, though, you can easily iterate over it with foreach. You can also access a node directly with $titles->item(0), which returns the first title element.
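The difference between textContent and firstChild->data, and direct access through item(), can be seen in a small sketch. The inline document is a placeholder; the second title deliberately contains a nested element.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<articles>'
    . '<item><title>First article</title></item>'
    . '<item><title>Second <b>bold</b> article</title></item>'
    . '</articles>'
);

$titles = $dom->getElementsByTagName('title');

$count = $titles->length;                 // number of nodes in the list
$first = $titles->item(0)->textContent;   // direct access by index

$second = $titles->item(1);
// firstChild is only the first text node, up to the nested <b> element.
$firstChildOnly = $second->firstChild->data;
// textContent concatenates the text of all descendants.
$allText = $second->textContent;

print "$count titles\n";
print "$first\n";
print "[$firstChildOnly] versus [$allText]\n";
```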

Another way to get all the title elements is to traverse the tree from the root node. As you can see, this approach is more complex, but it is more flexible if you need more than just the title elements.

foreach ($dom->documentElement->childNodes as $articles) {
    // If the node is an element (nodeType == 1) and its name is "item", loop further
    if ($articles->nodeType == 1 && $articles->nodeName == "item") {
        foreach ($articles->childNodes as $item) {
            // If the node is an element and its name is "title", print it
            if ($item->nodeType == 1 && $item->nodeName == "title") {
                print $item->textContent . "\n";
            }
        }
    }
}

XPath

XPath is something like SQL for XML: with XPath you can query an XML document for specific nodes that match a pattern. To get all the title nodes with XPath, you just need to do this:

$xp = new DOMXPath($dom);
$titles = $xp->query("/articles/item/title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}

This does much the same as getElementsByTagName(), but XPath is much more powerful. For example, if there were a title element that is a child of the articles element (rather than of an item element), getElementsByTagName() would return it as well. With the /articles/item/title expression we only get the title elements at the specified depth and position. This is just a simple example; a few more possibilities:

/articles/item[position() = 1]/title returns all title elements of the first item element.

/articles/item/title[@id = '23'] returns all title elements that have an id attribute with the value 23.

/articles//title returns all title elements below the articles element (translator's note: // stands for arbitrary depth).

You can also query for nodes that have particular sibling elements or particular text content, use namespaces, and so on. If you have to query XML documents a lot, learning XPath properly will save you a lot of time: it is easy to use, executes quickly, and needs less code than the standard DOM.
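To see how these expressions differ, here is a small sketch that runs them against a placeholder document. The titles, the id values, and the helper function texts() are assumptions made for the illustration; the document has one title directly under articles so that the difference from getElementsByTagName() shows up.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<articles>'
    . '<title>Feed title</title>'
    . '<item><title id="23">First article</title></item>'
    . '<item><title id="42">Second article</title></item>'
    . '</articles>'
);
$xp = new DOMXPath($dom);

// Hypothetical helper: run a query and collect the text of each match.
function texts(DOMXPath $xp, $expression)
{
    $result = array();
    foreach ($xp->query($expression) as $node) {
        $result[] = $node->textContent;
    }
    return $result;
}

$byTagName = $dom->getElementsByTagName('title')->length;        // every title
$itemTitles = texts($xp, '/articles/item/title');                // only below item
$firstItem = texts($xp, '/articles/item[position() = 1]/title'); // first item only
$byId = texts($xp, "/articles/item/title[@id = '23']");          // attribute filter
$anyDepth = texts($xp, '/articles//title');                      // any depth

print "getElementsByTagName: $byTagName titles\n";
print implode(', ', $itemTitles) . "\n";
print implode(', ', $anyDepth) . "\n";
```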

Writing data to the DOM

The Document Object Model is not only for reading and querying; you can also manipulate and write documents with it. (The DOM standard is a bit verbose because its authors wanted to support every imaginable environment, but it works very well.) Take a look at this example, which adds a new element to our articles.xml file.

$item = $dom->createElement("item");
$title = $dom->createElement("title");
$titletext = $dom->createTextNode("XML in PHP5");
$title->appendChild($titletext);
$item->appendChild($title);
$dom->documentElement->appendChild($item);
print $dom->saveXML();

First we create all the nodes we need: an item element, a title element, and a text node containing the item's title. Then we link the nodes together: we append the text node to the title element, append the title element to the item element, and finally insert the item element into the articles root element. Our XML document now lists a new article.

Extending classes

The example above could also have been done with the DOMXML extension in PHP4 (only the API is somewhat different). Being able to extend the DOM classes, however, is a new feature of PHP5, and it makes it possible to write much more readable code. Here is the whole example rewritten as an extension of the DOMDocument class:

class Articles extends DOMDocument {
    function __construct() {
        // Must be called!
        parent::__construct();
    }

    function addArticle($title) {
        $item = $this->createElement("item");
        $titlespace = $this->createElement("title");
        $titletext = $this->createTextNode($title);
        $titlespace->appendChild($titletext);
        $item->appendChild($titlespace);
        $this->documentElement->appendChild($item);
    }
}

$dom = new Articles();
$dom->load("articles.xml");
$dom->addArticle("XML in PHP5");
print $dom->save("newfile.xml");

HTML

One often overlooked feature of PHP5 is libxml2's support for HTML. With the DOM extension you can load not only well-formed XML documents but also not-well-formed HTML documents, treat them as standard DOMDocument objects, and use all the available methods and features on them, such as XPath and SimpleXML.

This HTML capability is very useful when you need to access content from a site you do not control. With the help of XPath, XSLT, or SimpleXML you save a lot of code compared with using regular expressions to compare strings or a SAX parser. It is especially useful when the HTML document is not well structured (which is often the case!).

The following code fetches and parses the home page of php.net and returns the content of the first title element.

$dom = new DOMDocument();
$dom->loadHTMLFile("http://www.php.net/");
$title = $dom->getElementsByTagName("title");
print $title->item(0)->textContent;

Note that your output may contain an error when the specified element is not found. If your website still outputs HTML 4 code from PHP, there is good news: the DOM extension can not only load HTML documents but also save them in HTML 4 format. After you have built your DOM document, use $dom->saveHTML() to save it. Note that if you simply want to make your HTML output conform to the W3C standard, you are better off using the Tidy extension. The HTML support in libxml2 does not take every possible case into account and cannot handle input in uncommon formats.
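The same idea can be tried without network access by parsing an HTML string. This is a sketch with a made-up, deliberately sloppy page; libxml_use_internal_errors() is assumed to be available (PHP 5.1 or later) and is used here only to keep parser warnings out of the output.

```php
<?php
// Made-up HTML with unclosed <p> and <b> elements.
$html = '<html><head><title>Example page</title></head>'
      . '<body><p>Unclosed paragraph<br><b>bold text</body></html>';

// Collect parser complaints internally instead of printing PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Guard against a missing element before dereferencing item(0).
$titles = $dom->getElementsByTagName('title');
$title = $titles->length > 0 ? $titles->item(0)->textContent : null;

// Serialize the repaired tree as HTML again.
$out = $dom->saveHTML();

print $title . "\n";
print $out;
```

Checking length before calling item(0) avoids the error mentioned above when the element is not found.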

Validation

Validating XML documents is becoming more and more important. For example, if you receive an XML document from an external source, you need to check that it conforms to a known format before you process it. Fortunately you do not have to write your own validation code in PHP, because you can use one of the three most widely used standards (DTD, XML Schema, or RelaxNG) to do the job.

DTD is a standard that dates from the SGML era. It lacks support for newer XML features (such as namespaces), and it is difficult to parse and transform because it is not written in XML. XML Schema is a standard developed by the W3C; it is widely used and covers everything needed to validate XML documents.

RelaxNG is an answer to the complex XML Schema standard and was created by an independent group. Because it is easier to implement than XML Schema, more and more programs are starting to support RelaxNG.

If you have no existing schema documents and no very complex XML documents, use RelaxNG. It is relatively simple to write and read, and more and more tools support it. There is even a tool called Trang that can automatically create a RelaxNG schema from sample XML documents. In addition, only RelaxNG (and the aging DTDs) are fully supported by libxml2, although libxml2 is about to support XML Schema fully as well.

The syntax for validating XML documents is quite simple:

$dom->validate(); // validates against the DTD declared in the document's DOCTYPE, for example articles.dtd

$dom->relaxNGValidate('articles.rng');

$dom->schemaValidate('articles.xsd');

Currently, all of these only return true or false, and errors are output as PHP warnings. This is obviously not ideal if you want to give users friendly feedback, and it will be improved in versions after PHP 5.0. How exactly is still being discussed, but error reporting will definitely be handled better.
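As a rough sketch of RelaxNG validation, the following checks two placeholder documents against a small inline schema. The schema, the helper function validate_articles(), and the sample documents are assumptions made for this illustration; collecting messages with libxml_use_internal_errors() and libxml_get_errors() relies on functions added in PHP 5.1, after the version this article describes.

```php
<?php
// Assumed schema: an articles root with one or more item elements,
// each holding a single title.
$schema = <<<RNG
<element name="articles" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="item">
      <element name="title"><text/></element>
    </element>
  </oneOrMore>
</element>
RNG;

// Hypothetical helper: returns array(bool $ok, array $messages).
function validate_articles($xml, $schema)
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $ok = $dom->relaxNGValidateSource($schema);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();

    return array($ok, $messages);
}

// Collect errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

list($validOk, $validMessages) = validate_articles(
    '<articles><item><title>First article</title></item></articles>',
    $schema
);
list($invalidOk, $invalidMessages) = validate_articles(
    '<articles><item><headline>Oops</headline></item></articles>',
    $schema
);

print 'valid document: ' . ($validOk ? 'ok' : 'rejected') . "\n";
print 'invalid document: ' . ($invalidOk ? 'ok' : 'rejected') . "\n";
foreach ($invalidMessages as $message) {
    print "  $message\n";
}
```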

SimpleXML

SimpleXML is the latest addition to PHP's XML family. The goal of the SimpleXML extension is to provide a simpler way to access XML documents using standard object properties and iterators. The extension does not have many methods, but it is quite powerful nonetheless. Getting all the title nodes from our document takes even less code than before.

$sxe = simplexml_load_file("articles.xml");
foreach ($sxe->item as $item) {
    print $item->title . "\n";
}

What does this do? First, articles.xml is loaded into a SimpleXML object. Then all the item elements in $sxe are fetched, and finally $item->title returns the content of the title element. That's it. You can also query attributes using associative-array syntax: $item->title['id'].

As you can see, there is quite some magic going on behind the scenes, and there are several different ways to get the result we want. For example, $item->title[0] returns the same result as in the example. On the other hand, foreach ($sxe->item->title as $item) only returns the first title rather than all the title elements in the document, which is what I would have expected coming from XPath.
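The access styles mentioned above can be compared side by side in a short sketch. The inline document, its titles, and the id values are placeholders; the string casts are there because SimpleXML returns objects rather than plain strings.

```php
<?php
$sxe = simplexml_load_string(
    '<articles>'
    . '<item><title id="23">First article</title></item>'
    . '<item><title id="42">Second article</title></item>'
    . '</articles>'
);

// Iterate over all item elements and read each title.
$titles = array();
foreach ($sxe->item as $item) {
    $titles[] = (string) $item->title;
}

// Attributes use array syntax with a string key.
$firstId = (string) $sxe->item[0]->title['id'];

// A numeric index selects one element among siblings with the same name.
$viaIndex = (string) $sxe->item[1]->title[0];

// $sxe->item->title only looks inside the first item element.
$onlyFirstItem = array();
foreach ($sxe->item->title as $title) {
    $onlyFirstItem[] = (string) $title;
}

print implode(', ', $titles) . "\n";
print "id of first title: $firstId\n";
```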

SimpleXML is actually the first extension to use the new features of Zend Engine 2, so it also serves as a test bed for them. Be aware that bugs and unexpected errors are not uncommon during this development phase.

Besides traversing all the nodes as in the example above, SimpleXML also has an XPath interface, which provides an easier way to reach specific nodes.

foreach ($sxe->xpath('/articles/item/title') as $item) {
    print $item . "\n";
}

Admittedly, this code is no shorter than the previous example, but with more complex or more deeply nested XML documents you will find that using XPath with SimpleXML saves you a lot of typing.

Writing data to SimpleXML documents

You can not only parse and read documents with SimpleXML but also change them, at least to some extent:

$sxe->item->title = "XML in PHP5";   // New content for the title element.
$sxe->item->title['id'] = 34;        // New attribute for the title element.
$xmlstring = $sxe->asXML();          // Returns the SimpleXML object as a serialized XML string.
print $xmlstring;
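A self-contained variant of these assignments, combined with the XPath interface to check the result, might look like this. The starting document is a placeholder.

```php
<?php
$sxe = simplexml_load_string(
    '<articles><item><title>First article</title></item></articles>'
);

// Replace the text of the first title and add an id attribute to it.
$sxe->item->title = 'XML in PHP5';
$sxe->item->title['id'] = 34;

// xpath() returns an array of SimpleXMLElement objects.
$found = $sxe->xpath('/articles/item/title[@id="34"]');
$text = (string) $found[0];

// Serialize the modified document.
$xmlstring = $sxe->asXML();

print $text . "\n";
print $xmlstring;
```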

Interoperability

Since SimpleXML is also based on libxml2, you can easily convert SimpleXML objects to DOM objects and back with almost no loss of speed (the document is not copied internally). Thanks to this mechanism you get the best of both worlds and can use whichever tool suits the job at hand. It works like this:

$sxe = simplexml_import_dom($dom);

$dom = dom_import_simplexml($sxe);
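A short sketch of the round trip, using a placeholder document, shows that both views work on the same tree. Note that dom_import_simplexml() hands back a DOMElement (here the root element) rather than a DOMDocument; the document itself is reachable through ownerDocument.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<articles><item><title>First article</title></item></articles>');

// DOM to SimpleXML: no copy is made, both objects share one tree.
$sxe = simplexml_import_dom($dom);
$sxe->item->title = 'Changed via SimpleXML';

// The change is visible through the original DOM object.
$viaDom = $dom->getElementsByTagName('title')->item(0)->textContent;

// SimpleXML back to DOM: the result is a DOMElement.
$element = dom_import_simplexml($sxe);
$rootName = $element->nodeName;
$backToDocument = $element->ownerDocument;

print $viaDom . "\n";
print $rootName . "\n";
```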

XSLT

XSLT is a language for transforming XML documents into other XML documents. XSLT itself is written in XML and belongs to the family of functional languages, which take a different approach from procedural and object-oriented languages (like PHP). PHP4 had two XSLT processors: Sablotron (in the widely used XSLT extension) and libxslt (in the DOMXML extension). Their APIs were not compatible with each other, and they were used differently. PHP5 only supports the libxslt processor, which was chosen because it is based on libxml2 and therefore fits the PHP5 XML concept better.

In theory it would also be possible to bind Sablotron to PHP5, but unfortunately nobody has done that. So if you are using Sablotron, you have to switch to the libxslt processor in PHP5. With the exception of JavaScript support, libxslt is on par with Sablotron, and even Sablotron's own scheme handlers can be re-implemented with PHP's powerful streams. In addition, libxslt is one of the fastest XSLT processors, so you get a speed upgrade for free (execution is about twice as fast as with Sablotron).

As with the other extensions discussed in this article, you can exchange XML documents between the XSL extension and the DOM extension, and vice versa. In fact you have to, because the ext/xsl extension has no interface of its own for loading and saving XML documents; only the DOM extension can be used for that. To get started with XSLT transformations you do not need to learn much. There is no W3C standard for this API; it was "borrowed" from Mozilla.

First you need an XSLT stylesheet. Paste the following text into a new file and save it as articles.xsl.

Then call it with a PHP script:

/* Load the XML and XSL documents into DOMDocument objects */
$xsl = new DOMDocument();
$xsl->load("articles.xsl");
$inputdom = new DOMDocument();
$inputdom->load("articles.xml");

/* Create an XSLT processor and import the stylesheet */
$proc = new XSLTProcessor();
$xsl = $proc->importStylesheet($xsl);
$proc->setParameter("", "titles", "Titles");

/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();

The example above first uses the DOM method load() to load the XSLT stylesheet articles.xsl, then creates a new XSLTProcessor object, into which the stylesheet object is imported for later use. Parameters can be set with setParameter(namespaceURI, name, value). Finally the XSLTProcessor object starts the transformation with transformToDoc($inputdom) and returns a new DOMDocument object.

The advantage of this API is that you can transform many XML documents with the same stylesheet: you load it once and then use it repeatedly, because transformToDoc() can be applied to different XML documents.

Besides transformToDoc() there are two more transformation methods: transformToXML($dom) returns a string, and transformToURI($dom, $uri) saves the transformed document to a file or a PHP stream. Note that if you want to use XSLT output settings (attributes of xsl:output such as indent="yes"), you cannot use transformToDoc(), because a DOMDocument object cannot store this information. These settings only take effect when you write the transformation result directly to a string or a file.
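The whole flow can be sketched with an inline stylesheet. The stylesheet below is a stand-in written for this illustration, not the article's articles.xsl; the list and entry element names and the sample documents are assumptions. The guard at the top is there because the xsl extension has to be enabled explicitly.

```php
<?php
if (!class_exists('XSLTProcessor')) {
    exit("The xsl extension is not enabled.\n");
}

// Stand-in stylesheet (nowdoc, so PHP leaves {$titles} alone).
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="titles"/>
  <xsl:template match="/articles">
    <list heading="{$titles}">
      <xsl:for-each select="item/title">
        <entry><xsl:value-of select="."/></entry>
      </xsl:for-each>
    </list>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($stylesheet);

// Import the stylesheet once; the processor can then be reused.
$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$proc->setParameter('', 'titles', 'Titles');

$first = new DOMDocument();
$first->loadXML('<articles><item><title>First article</title></item></articles>');
$second = new DOMDocument();
$second->loadXML(
    '<articles><item><title>A</title></item><item><title>B</title></item></articles>'
);

// Same processor, two inputs, two result types.
$asString = $proc->transformToXML($first);   // string
$asDoc = $proc->transformToDoc($second);     // DOMDocument
$entries = $asDoc->getElementsByTagName('entry')->length;

print $asString;
print "second document produced $entries entries\n";
```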

Calling a PHP function

The last new feature of the XSLT extension is the ability to call any PHP function from inside an XSLT stylesheet. XML purists will certainly not like this feature (such stylesheets become more complicated and easily mix logic and design), but in some places it is very useful. XSLT is very limited when it comes to functions; even outputting a date in different languages is cumbersome to implement. With this feature, such tasks become as easy as in plain PHP. Here is the code that adds a function to XSLT:

function dateLang() {
    return strftime("%a");
}

$xsl = new DOMDocument();
$xsl->load("datetime.xsl");
$inputdom = new DOMDocument();
$inputdom->load("today.xml");

$proc = new XSLTProcessor();
$proc->registerPHPFunctions();

// Import the stylesheet document held in $xsl
$xsl = $proc->importStylesheet($xsl);

/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();

Here is the XSLT stylesheet datetime.xsl, which calls this function. Below it is an XML document to be transformed with the stylesheet, today.xml (articles.xml would give the same result).

The stylesheet above, together with the PHP script and the XML file, outputs the day of the week in the language of the current system locale. You can add more arguments to php:function(); the additional arguments are passed on to the PHP function. There is also a function php:functionString(), which automatically converts all input arguments to strings, so you do not need to convert them in PHP.

Note that you need to call $proc->registerPHPFunctions() before the transformation; otherwise the PHP function calls will not be executed, for security reasons (do you always trust your XSLT stylesheets?). A real access control system has not been implemented yet; perhaps this will come in a future version of PHP5.
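A deterministic sketch of the mechanism, using the built-in strtoupper() instead of a date function, could look like this. The stylesheet and the sample document are stand-ins written for this illustration, not the article's datetime.xsl and today.xml. Passing a function name to registerPHPFunctions() to restrict what the stylesheet may call is supported in current PHP versions; it is not the behavior described above for the early PHP5 API.

```php
<?php
if (!class_exists('XSLTProcessor')) {
    exit("The xsl extension is not enabled.\n");
}

// Stand-in stylesheet: the php prefix is bound to http://php.net/xsl.
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl"
    exclude-result-prefixes="php">
  <xsl:output method="text"/>
  <xsl:template match="/articles">
    <xsl:for-each select="item/title">
      <xsl:value-of select="php:function('strtoupper', string(.))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($stylesheet);

$inputdom = new DOMDocument();
$inputdom->loadXML(
    '<articles>'
    . '<item><title>First article</title></item>'
    . '<item><title>Second article</title></item>'
    . '</articles>'
);

$proc = new XSLTProcessor();
// Must be called before the transformation; only strtoupper is allowed.
$proc->registerPHPFunctions('strtoupper');
$proc->importStylesheet($xsl);

// method="text" only takes effect when transforming to a string or file.
$result = $proc->transformToXML($inputdom);
print $result;
```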

Summary

PHP's XML support has taken a big step forward. It is standards compliant, powerful, interoperable, installed by default, and officially adopted. The newly added SimpleXML extension provides a simple and quick way to access XML documents and saves you a lot of code, especially when you have structured documents or can use the power of XPath.

Thanks to libxml2, the underlying library used by the PHP5 XML extensions, validating XML documents with DTD, RelaxNG, or XML Schema is supported.

XSL support has also been revamped and now uses the libxslt library, which is a great improvement over the old Sablotron library. In addition, calling PHP functions from within XSLT stylesheets lets you write more powerful XSLT code.

If you have used XML in PHP4 or in other languages, you will like the XML features of PHP5. XML has changed a lot in PHP5: it is now standards compliant and on par with other tools and languages.

Links

PHP 4 related

DOMXML extension: http://www.php.net/domxml/

Sablotron extension: http://www.php.net/xslt/

libxslt: http://www.php.net/manual/en/function.domxml-xslt-stylesheet.php

PHP 5 related

SimpleXML: http://www.php.net/simplexml/

Streams: http://www.php.net/manual/en/ref.stream.php

Standards

DOM: http://www.w3.org/dom

XSLT: http://www.w3.org/tr/xslt

XPath: http://www.w3.org/tr/xpath

XML Schema: http://www.w3.org/xml/schema

RelaxNG: http://relaxng.org/

XInclude: http://www.w3.org/tr/xinclude/

Tools

libxml2, the underlying library: http://xmlsoft.org/

Trang, a Schema/RelaxNG/etc. converter: http://www.thaiopensource.com/relaxng/trang.html
