Markup languages have continually sought better ways to represent information. Today's document processing systems record not only the original content of a document but also a great deal of formatting, printing, and processing information. The core of formatting work is to mark up the document. "Markup" originally meant adding marks to a text (marking it up), and most text processing systems are built on this principle.
Markup falls into several categories:
1, Formatting tags: used mainly to describe the appearance of the document, in the same way that most text processing systems let you set the font, italics, and color of text. In HTML, <b> is used for bold.
2, Structural tags: used to define the structure of the document. HTML, for example, can mark a piece of text as a first-level heading or as a paragraph.
3, Semantic tags: used to describe the meaning of the document's content.
The problem is this: a document produced by today's software should still be usable years from now, with software able to understand its meaning and work with it easily. Understanding a document means understanding its marks, so a set of rules must be agreed on in advance. Every markup language comes with a set of markup rules that tell us how to interpret its content and structure. In short, to share documents between different document processing systems, we need: 1, a universal document format that these systems support, whether by writing our document processing systems to support the format or by implementing them on top of it; 2, a detailed set of rules that defines the document format. This is how GML (Generalized Markup Language) came about, which later developed into the general-purpose markup language SGML (Standard Generalized Markup Language).
HTML's shortcomings
HTML is a child of SGML, and so is XML, which makes XML and HTML siblings. That is how it goes in programming: a new technology is introduced by criticizing the old technology it replaces. XML cannot completely replace HTML, but we still have to list HTML's shortcomings.
First, HTML does not express the real meaning of the content; it can only use predefined tags.
Second, HTML's structure is too flat: it supports only simple paragraph or fragment structures and cannot express hierarchical data, whereas XML can.
Third, HTML treats the document as a whole: we cannot filter out just the part of the data we want.
Fourth, HTML has no truly international standard: every vendor defines its own extensions in the hope that they will become the industry standard, nobody accepts anyone else's, and so everyone ends up with their own. HTML parsers implement different rules, so the same content requires different code for different browsers.
Fifth, HTML cannot really support data interchange: it provides no programming interface for parsing the data it carries, which limits data exchange with applications, databases, and operating systems.
Sixth, HTML is not reusable.
The emergence of XML
XML is the Extensible Markup Language. "Extensible" means that we can define our own tag sets: XML is a metalanguage that can be used to describe other languages, such as CML (Chemical Markup Language). XML reaches into distributed applications and database applications. With XML, business logic and display can be separated. The logical structure of the data is expressed with XML tags, which preserves some database-like features, including data queries, and makes the corresponding program development straightforward. A program then converts the logical XML into HTML in the appropriate display style. The biggest advantage is that an intermediate layer can gather data from different databases and existing files.
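The separation of data and display described above can be sketched in a few lines. This is only a minimal illustration, not a method taken from any particular product: it uses Python's standard xml.etree.ElementTree module, and the tag names (books, book, title, price), the sample data, and the function books_to_html are all hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape
from typing import Optional

# Hypothetical data document. The tag names are our own invention,
# which is what XML's extensibility allows.
BOOKS_XML = """\
<books>
  <book id="1"><title>XML Basics</title><price>30</price></book>
  <book id="2"><title>HTML &amp; CSS</title><price>25</price></book>
</books>
"""


def books_to_html(xml_text: str, max_price: Optional[float] = None) -> str:
    """Convert the logical XML data into an HTML list for display.

    If max_price is given, only books at or below that price are shown,
    which illustrates filtering out just the part of the data we want.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        title = book.findtext("title", default="")
        price = float(book.findtext("price", default="0"))
        if max_price is not None and price > max_price:
            continue
        # The display markup lives here, not in the data document.
        rows.append(f"<li><b>{escape(title)}</b>: {price:g}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(books_to_html(BOOKS_XML))
    print(books_to_html(BOOKS_XML, max_price=26))
```

The XML document carries only the data and its structure. Changing how the books are displayed means changing books_to_html, not the data.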
XML's advantages
The HTML shortcomings listed above correspond to the most basic advantages of XML. There are other advantages that I do not yet fully understand, so I will analyze them after further study.
An indispensable parser
XML's designers saw that much of the HTML on the web did not conform to the specification, so this time they decided to enforce the rules strictly: an XML document must follow certain rules, otherwise the system reports an error. Because the syntax is simple and the formatting requirements are strict, XML parsers are simple to develop. Available parsers include Apache Xerces, IBM XML4J, OpenXML, Oracle XML Parser, and MSXML in IE (even if you use IE6, installing the XML parser update is recommended, otherwise some effects may not be displayed).
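The strictness described above is easy to observe with any conforming parser. The sketch below does not use one of the parsers listed. As a stand-in, it assumes Python's standard xml.etree.ElementTree module, and the helper is_well_formed and the sample documents are hypothetical names made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Every tag is closed and properly nested.
GOOD = "<note><to>Reader</to><body>Hello</body></note>"
# <to> is never closed; an HTML browser would usually tolerate this.
BAD_UNCLOSED = "<note><to>Reader<body>Hello</body></note>"
# <b> and <i> overlap instead of nesting.
BAD_OVERLAP = "<note><b><i>Hello</b></i></note>"
# The attribute value is not quoted.
BAD_ATTRIBUTE = "<note lang=en>Hello</note>"

if __name__ == "__main__":
    samples = [
        ("good", GOOD),
        ("unclosed tag", BAD_UNCLOSED),
        ("overlapping tags", BAD_OVERLAP),
        ("unquoted attribute", BAD_ATTRIBUTE),
    ]
    for name, text in samples:
        print(name, "->", "accepted" if is_well_formed(text) else "rejected")
```

Only the first document is accepted. The other three contain the kind of sloppiness that browsers commonly forgive in HTML, and an XML parser rejects each of them instead of guessing what the author meant.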
Development tools: XMLSpy. There are some tutorials here for everyone to learn from: http://www.qcdn.net/school/list.asp?unid=2307