1.0 Introduction

This paper explores the relationship between XML and databases, and lists software that can use a database to process XML documents. Although it is not intended as a detailed look at that software, it does aim to describe the main issues involved in using databases with XML documents. It is somewhat biased toward relational databases, because that is where my experience lies.

2.0 Is XML a Database?

Before we start discussing XML and databases, we need to answer a question that is on many people's minds: "Is XML a database?" In the strict sense, if "XML" refers to an XML document, the answer is "no". Although an XML document contains data, without additional software to process that data it is no more a database than any other text file.

If "XML" refers to an XML document together with all the related XML tools and technologies, the answer is "yes". The reason is that XML provides many of the things a database needs: storage (the XML document), schemas (DTDs, XML schema languages), query languages (XQL, XML-QL, Quilt, etc.), programming interfaces (SAX, DOM), and so on. However, XML still lacks many of the things found in a real database: efficient storage, indexes, security, transactions, data integrity, multi-user access, triggers, queries across multiple documents, and so on.

Therefore, XML can serve as a database in environments with modest amounts of data, few users, and low performance requirements. In most production environments, which have many users, strict data integrity requirements, and high performance requirements, XML is not adequate. And considering that databases such as dBASE and Access are both cheap and easy to use, there is little reason to use XML as a database even in the first case.

3.0 Why Use a Database?

When considering XML and databases, the first question to ask is: why do you need a database? Do you have existing data that you want to export? Do you need a place to store your Web pages? Do you want to use the database in an e-commerce application in which XML is the data transfer format? The answers to these questions will directly affect your choice of database and middleware (if any).

For example, suppose you use XML for data transfer in an e-commerce application.
In such a case, your data most likely has a highly regular structure, and the entities and encodings used in the XML documents are not important to you. After all, you care only about the data, not about how it is physically stored in a document. If your application is relatively simple, a relational database and data transfer middleware will meet your needs; if the application is large and complex, you need a development environment with full XML support.

On the other hand, suppose you have a Web site built from a number of scattered XML files. You need not only to manage this Web site but also to let users query its content. In this case your files will be quite irregular, and entity usage will matter to you, because the structure of these files is the foundation of the Web site. Here you need a "native XML" database that supports versioning, tracks entities, and supports a query language such as XQL.

4.0 Data versus Documents

The author believes that the most important factor in choosing a database is probably whether you are using it to store data or to store documents. If you want to store data, you need a database designed primarily for data storage (for example, a relational or object-oriented database) and software to convert between the database and XML documents. If, on the other hand, you want to store documents, you need a content management system, which is designed specifically for storing documents.

Although you can store documents in a relational or object-oriented database, you will often find that you are duplicating the work of a content management system. Similarly, although a content management system is typically built on top of an object-oriented or relational database, it is awkward to use a content management system as a general-purpose database.

Whether you need to store data or documents often depends on your XML documents.
The reason is that XML documents fall into two broad categories: data-centric and document-centric.

4.1 Data-Centric Documents

Data-centric documents are characterized by a fairly regular structure, fine-grained data (that is, the smallest independent unit of data is a PCDATA-only element or an attribute), and little or no mixed content. The order in which sibling elements and PCDATA occur is generally not significant. Typical examples are XML documents that contain sales orders, flight schedules, restaurant menus, and so on.

Data-centric documents are usually meant for machine consumption, and the fact that XML is used at all is often incidental: it is simply a means of transporting data. For example, the following sales order document is data-centric:

   ABC Industries
   123 Main St.
   Chicago
   IL
   60609
   981215
   Turkey wrench: Stainless steel, one-piece construction, lifetime guarantee
   9.95
   10
   Stuffing separator: Aluminum, one-year guarantee
   13.27
   5

In the XML world, many prose-rich documents are in fact data-centric. Take a page on the Amazon.com Web site that displays information about a book. Although the page is largely text, the structure of that text is highly regular, much of it is the same on every book description page, and each page-specific piece of text is limited in size. The page can therefore be built from a simple, data-centric XML document, which contains the book information retrieved from the database, and an XSL stylesheet. In general, any Web site that currently builds HTML pages dynamically by filling a template with database data can probably be replaced by such data-centric XML documents and one or more XSL stylesheets.
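
Because a data-centric document is regular and fine-grained, a program can pull each value out by name and ignore everything else about the document. The following is only a minimal sketch of that idea in Python: the element names (SalesOrder, Customer, CustName, Line, and so on) are assumptions made for illustration, and only the values are taken from the sales order in section 4.1.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical markup: every element name here is an assumption for illustration.
ORDER_XML = """
<SalesOrder>
  <Customer>
    <CustName>ABC Industries</CustName>
    <Street>123 Main St.</Street>
    <City>Chicago</City>
    <State>IL</State>
    <PostCode>60609</PostCode>
  </Customer>
  <OrderDate>981215</OrderDate>
  <Line>
    <Description>Turkey wrench: Stainless steel, one-piece construction, lifetime guarantee</Description>
    <Price>9.95</Price>
    <Quantity>10</Quantity>
  </Line>
  <Line>
    <Description>Stuffing separator: Aluminum, one-year guarantee</Description>
    <Price>13.27</Price>
    <Quantity>5</Quantity>
  </Line>
</SalesOrder>
"""


def order_lines(xml_text):
    """Flatten the order into one record per order line."""
    root = ET.fromstring(xml_text)
    customer = root.findtext("Customer/CustName")
    order_date = root.findtext("OrderDate")
    rows = []
    for line in root.findall("Line"):
        rows.append({
            "customer": customer,
            "order_date": order_date,
            "description": line.findtext("Description"),
            "price": Decimal(line.findtext("Price")),
            "quantity": int(line.findtext("Quantity")),
        })
    return rows


def order_total(rows):
    """Sum price * quantity over all order lines."""
    return sum(row["price"] * row["quantity"] for row in rows)


rows = order_lines(ORDER_XML)
total = order_total(rows)
```

Each record holds only PCDATA values, so it maps directly onto a database row. Nothing in the sketch depends on sibling order or on how the text was physically written in the document.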
As another example of a prose-rich document that is really data-centric, the following lease:

   ABC Industries agrees to lease the property at 123 Main St., Chicago, IL from XYZ Properties for a term of not less than 18 months at a cost of USD 1000 per month.

can be built from the XML document below and a simple stylesheet:

   ABC Industries
   123 Main St., Chicago, IL
   XYZ Properties
   18
   1000

4.2 Document-Centric Documents

Document-centric documents are characterized by an irregular structure, larger-grained data (that is, the smallest independent unit of data is an element with mixed content or the entire XML document), and a lot of mixed content. The order in which sibling elements and PCDATA occur is very important. Typical examples are books, email, advertisements, and most XHTML documents. Document-centric documents are meant for human consumption.
For example, the following product description is document-centric:

   Turkey Wrench
   Full Fabrication Labs, Inc.
   Like a monkey wrench, but not as big.

   The turkey wrench, which comes in both right- and left-handed versions (skyhook optional), is made of the finest stainless steel. The Readi-grip rubberized handle quickly adapts to your hands, even in the greasiest situations. Adjustment is possible through a variety of custom dials.

   You can:
   Order your own turkey wrench
   Read more about wrenches
   Download the catalog

   The turkey wrench costs just $19.99 and, if you order now, comes with a hand-crafted shrimp hammer as a bonus gift.

4.3 Data, Documents, and Databases

In practice, the distinction between data-centric and document-centric documents is not strict. For example, an otherwise data-centric document (such as an invoice) may contain coarse-grained, irregular data (such as a description on the invoice). An otherwise document-centric document (such as a user manual) may contain fine-grained, regularly structured data (usually metadata), such as the author and the revision date. Even so, classifying your documents as data-centric or document-centric helps you decide whether you care about data or about documents, which in turn determines what kind of system you need.

To store or retrieve data, you can use a database (usually relational, object-oriented, or hierarchical) and middleware (either built in or from a third party), or you can use an XML server (that is, a platform for building distributed applications, such as e-commerce applications, that use XML for data transfer). To store documents, you will need a content management system or a persistent DOM implementation. These systems are discussed in section 5.0, "Storing and Retrieving Data", and section 6.0, "Storing and Retrieving Documents".
A more detailed list of related products can be found in XML Database Products (http://www.rpbourret.com/xml/xmldatabaseprods.htm).

5.0 Storing and Retrieving Data

The data in a data-centric document may originate in a database (in which case you want to export it as XML) or in an XML document (in which case you want to store it in a database). An example of the former is the large amount of existing (legacy) data stored in relational databases; an example of the latter is data published as XML on the Web that you want to store in your own database for further processing. Thus, depending on your needs, you may need software that transfers data from XML documents to the database, from the database to XML documents, or both.
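
The database-to-XML direction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behavior of any particular product: the table name `parts`, its columns, the `row` element name, and the helper `table_to_xml` are all hypothetical, and an in-memory SQLite database stands in for an existing relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, columns):
    """Export every row of `table` as a <row> element with one child per column.

    `table` and `columns` are assumed to be trusted identifiers; they are
    interpolated into the SQL text and must never come from user input.
    """
    root = ET.Element(table)
    query = "SELECT {} FROM {} ORDER BY rowid".format(", ".join(columns), table)
    for record in conn.execute(query):
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, record):
            ET.SubElement(row, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical existing data, held in an in-memory database for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (description TEXT, price TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("Turkey wrench", "9.95", 10), ("Stuffing separator", "13.27", 5)],
)

xml_text = table_to_xml(conn, "parts", ["description", "price", "quantity"])
```

The resulting document is data-centric in the sense of section 4.1: every value ends up in a PCDATA-only element.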

5.1 Transferring Data

When you store data in a database, you often need to discard information about the document, such as the document name and DTD, as well as its physical structure, such as entity definitions and usage, the order in which attribute values and sibling elements occur, the way binary data is stored (Base64 encoding, an unparsed entity, or some other method), CDATA sections, and encoding information. Similarly, when data is retrieved from the database, the resulting XML document usually contains no CDATA sections and no entity references other than the predefined entities lt ("<"), gt (">"), amp ("&"), apos ("'"), and quot ('"'). The order in which sibling elements and attributes appear is often simply the order in which the data was returned from the database.

Although this may seem surprising at first, it is often reasonable. For example, suppose you use XML as the data format for transferring a sales order from one database to another. In this case, it does not matter whether the sales order number appears before or after the order date in the XML document, nor whether the customer's name is stored in a CDATA section, as an external entity, or directly as PCDATA. What matters is that the relevant data is transferred from the first database to the second. Thus, the data transfer software needs to take the hierarchical structure of the data into account (this structure groups the data), and little else.

One consequence of ignoring document information and physical structure is that documents often do not "round-trip": if you store the data from a document in the database and then reconstruct a document from that data, the result often differs from the original document, even when both are put in canonical form. Whether this is acceptable depends on your needs, and it will also affect your choice of database and data transfer middleware.
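
The round-trip problem is easy to reproduce. The following Python sketch is an illustration only: the element names and the helpers `extract` and `rebuild` are hypothetical, and a plain dictionary stands in for a database row. The input uses a CDATA section and a character reference, and the rebuilt document emits its columns in alphabetical order, as a database might return them.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; element names are assumptions for illustration.
ORIGINAL = (
    "<Order>"
    "<OrderDate>981215</OrderDate>"
    "<Customer><![CDATA[ABC Industries]]></Customer>"
    "<Street>123 Main St&#46;</Street>"
    "</Order>"
)


def extract(xml_text):
    """Keep only the data: one value per child element, as a database row would."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def rebuild(values):
    """Rebuild a document from stored values, in the order the 'database' returns them."""
    root = ET.Element("Order")
    for name in sorted(values):  # column order, not original document order
        ET.SubElement(root, name).text = values[name]
    return ET.tostring(root, encoding="unicode")


stored = extract(ORIGINAL)
rebuilt = rebuild(stored)
same_data = extract(rebuilt) == stored
same_document = rebuilt == ORIGINAL
```

Here `same_data` is true while `same_document` is false: the CDATA section and the character reference are gone and the sibling order has changed, yet every value survived the transfer.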

5.2 Mapping Document Structure to Database Structure

In order to transfer data between XML and a database, it is necessary to map the document structure to the database structure and vice versa. Such mappings fall into two general categories: template-driven and model-driven.

5.2.1 Template-Driven Mappings

In a template-driven mapping, there is no predefined mapping between the document structure and the database structure. Instead, command statements are embedded in a template, and the data transfer middleware processes the template. For example, consider the following template (note that it is not used by any actual product), in the
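
As a rough illustration of the template-driven approach, the Python sketch below treats a small XML template as input and replaces each embedded command with the rows it returns. Everything in it is an assumption made for the example and is not taken from any product: the `SelectStmt`, `Results`, and `Row` element names, the `Flights` table and its contents, and the helper `process_template`.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template: SQL is embedded in SelectStmt elements (an assumed name).
TEMPLATE = """
<FlightInfo>
  <Intro>The following flights are available:</Intro>
  <SelectStmt>SELECT Airline, FltNumber FROM Flights ORDER BY FltNumber</SelectStmt>
  <Conclude>End of list.</Conclude>
</FlightInfo>
"""


def process_template(template, conn):
    """Replace every SelectStmt element with a Results element holding its rows."""
    root = ET.fromstring(template)
    # Snapshot the elements first so the tree can be modified safely while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "SelectStmt":
                continue
            cursor = conn.execute(child.text)
            columns = [description[0] for description in cursor.description]
            results = ET.Element("Results")
            for record in cursor:
                row = ET.SubElement(results, "Row")
                for name, value in zip(columns, record):
                    ET.SubElement(row, name).text = str(value)
            results.tail = child.tail  # keep the surrounding whitespace
            parent[index] = results
    return ET.tostring(root, encoding="unicode")


# Hypothetical data for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER)")
conn.executemany("INSERT INTO Flights VALUES (?, ?)", [("ACME", 123), ("ACME", 456)])

output = process_template(TEMPLATE, conn)
```

The shape of the output is dictated entirely by the template and the queries it contains, which is where this kind of mapping gets its flexibility. The sketch executes whatever SQL the template holds, so it assumes the template comes from a trusted author.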

5.2.2 Model-Driven Mappings

In a model-driven mapping, a data model corresponding to the structure of the XML document is mapped, explicitly or implicitly, to the structure of the database, and vice versa. Its disadvantage is a lack of flexibility, but it is easy to use: because the mapping is based on a specific data model, the middleware can usually do much of the conversion work for the user. Since the data converted from the database to XML always follows a single model, such systems are typically combined with XSL to provide the flexibility of a template-driven system.

Two models are commonly used to view the data in an XML document: the table model and the data-specific object model. Other models are possible. For example, by using ID and IDREF attributes, an XML document can describe a graph. However, many existing middleware packages do not support such models.

5.2.2.1 The Table Model

Many middleware packages that convert between XML and relational databases use the table model, which views the XML document as a single table or a series of tables. That is, the structure of the XML document is similar to the following example. In the case of a single table,
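
A single-table document can be loaded into a relational database with very little code, which is much of the appeal of the table model. The Python sketch below is a minimal illustration under stated assumptions: the layout (a root element named after the table, one `Row` element per row, one child element per column) and the helper `load_single_table` are hypothetical, and every column is created as TEXT for simplicity.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical single-table document: root = table, Row = row, children = columns.
DOCUMENT = """
<Parts>
  <Row><Description>Turkey wrench</Description><Price>9.95</Price></Row>
  <Row><Description>Stuffing separator</Description><Price>13.27</Price></Row>
</Parts>
"""


def load_single_table(xml_text, conn):
    """Create a table named after the root element and insert one row per Row element."""
    root = ET.fromstring(xml_text)
    table = root.tag
    rows = root.findall("Row")
    # Column names come from the first row; the document is assumed to be regular
    # and trusted, because the names are interpolated into the SQL text.
    columns = [child.tag for child in rows[0]]
    conn.execute(
        "CREATE TABLE {} ({})".format(table, ", ".join(c + " TEXT" for c in columns))
    )
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        values = [row.findtext(column) for column in columns]
        conn.execute(
            "INSERT INTO {} VALUES ({})".format(table, placeholders), values
        )
    return table, columns


conn = sqlite3.connect(":memory:")
table, columns = load_single_table(DOCUMENT, conn)
loaded = conn.execute(
    "SELECT Description, Price FROM Parts ORDER BY rowid"
).fetchall()
```

The mapping is implicit in the document layout, so no template or configuration is needed. The price is that a document which does not follow this layout cannot be loaded without first being transformed.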