XML and database Author: onecenter
Author: Ronald
Summary: This paper explores the relationship between XML and databases and lists software that can use a database to process XML documents. Although it does not describe that software in detail, the author hopes it covers the main issues involved in using a database with XML documents.
content:
Table of Contents
1.0 Introduction
2.0 Is XML a database?
3.0 Why use a database?
4.0 Data versus documents
4.1 Data-centric documents
4.2 Document-centric documents
4.3 Data, Documents and Databases
5.0 Storing and retrieving data
5.1 Transferring data
5.2 Mapping document structures to database structures
5.2.1 Template-driven mapping
5.2.2 Model-driven mapping
5.2.2.1 Table model
5.2.2.2 Data-specific object model
5.3 Data types, null values, character sets, and so on
5.3.1 Data types
5.3.2 Binary data
5.3.3 Null values
5.3.4 Character sets
5.3.5 Processing instructions
5.3.6 Storing markup
5.4 Generating DTDs from database schemas and vice versa
1.0 Introduction
This paper briefly explores the relationship between XML and databases and lists software that can use a database to process XML documents. Although it does not describe that software in detail, the author hopes it covers the main issues involved in using a database with XML documents. It leans somewhat toward relational databases because that is where my experience lies.
2.0 Is XML a database?
Before we start discussing XML and databases, we need to answer a question that keeps coming up: "Is XML a database?" In the strict sense, that is, if "XML" means an XML document, the answer is "no". Although an XML document contains data, without other software to process that data it is no more a database than any other text file.
If "XML" means an XML document together with all of the related XML tools and technologies, the answer is "yes", with reservations. The reason is that XML provides many of the things a database needs: storage (the XML document), structure (DTDs, XML schema languages), query languages (XQL, XML-QL, Quilt, and so on), programming interfaces (SAX, DOM), and so on. However, XML still lacks many of the things found in real databases: efficient storage, indexes, security, transactions, data integrity, multi-user access, triggers, queries across multiple documents, and so on.
So XML can serve as a database in environments with small amounts of data, few users, and modest performance requirements. In most production environments, which have many users and demand strict data integrity and high performance, XML does not qualify. Moreover, given that databases such as dBASE and Access are inexpensive and easy to use, XML is rarely used as a database even in the first case.
3.0 Why use a database?
When you start thinking about XML and databases, the first question to ask yourself is why you need a database. Do you have existing data that you want to export? Are you looking for a place to store your Web pages? Is the database used by an e-commerce application in which XML is the data transport format? The answers to these questions directly affect your choice of database and middleware (if any).
For example, suppose you use XML to transfer data in an e-commerce application. It is a good bet that your data has a highly regular structure and that the entities and encodings used in the XML documents are not important to you. After all, what you care about is the data, not how it is physically stored in a document. If your application is relatively simple, a relational database and data transfer middleware will meet your needs; if the application is large and complex, you will need a development environment with full XML support.
On the other hand, suppose you have a Web site built from a collection of scattered XML files. Not only do you need to manage the site, you also want to give users a way to search its content. In this case your documents are likely to be quite irregular, and entities become important to you because they are part of how the documents are structured, and that structure is the foundation of the site. Here you need some kind of "native XML" database that supports versioning, tracks entity usage, and supports a query language such as XQL.
4.0 Data versus documents
The author believes that the most important factor in choosing a database is probably whether you are using it to store data or to store documents. If you want to store data, you need a database designed primarily for data storage (such as a relational or object-oriented database), together with a way to convert between the database and XML documents. If you want to store documents, you need a content management system, which is designed specifically for storing documents.
Although you can store documents in a relational or object-oriented database, you will often find yourself duplicating the work of a content management system. Similarly, although a content management system is typically built on top of an object-oriented or relational database, it does not follow that a content management system works well as a database.
Whether you need to store data or documents often depends on your XML documents. The reason is that XML documents fall into two categories: data-centric and document-centric.
4.1 Data-centric documents
Data-centric documents are characterized by a fairly regular structure, fine-grained data (that is, the smallest independent unit of data is a PCDATA-only element or an attribute), and little or no mixed content. The order in which sibling elements and PCDATA occur is generally not significant. Typical examples are XML documents that contain sales orders, flight schedules, restaurant menus, and so on. Data-centric documents are usually intended for machine consumption, and the fact that XML is used at all is often incidental: it is simply a means of transporting data.
For example, the following sales order document is data-centric:
ABC Industries
123 Main St.
Chicago
IL
60609
981215
Turkey wrench:
Stainless steel, one-piece construction,
lifetime guarantee.
9.95
10
Stuffing separator:
Aluminum, one-year guarantee.
13.27
5
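The regular structure of such an order is what makes it easy to move into tables. As a minimal sketch, assuming hypothetical element names (SalesOrder, Customer, Line, and so on are illustrative only and are not taken from any schema), the Python below marks up the order and flattens it into one row per order line:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the sales order; every element name is an assumption.
ORDER_XML = """
<SalesOrder>
  <Customer>
    <Name>ABC Industries</Name>
    <Street>123 Main St.</Street>
    <City>Chicago</City>
    <State>IL</State>
    <PostCode>60609</PostCode>
  </Customer>
  <OrderDate>981215</OrderDate>
  <Line>
    <Part>Turkey wrench</Part>
    <Price>9.95</Price>
    <Quantity>10</Quantity>
  </Line>
  <Line>
    <Part>Stuffing separator</Part>
    <Price>13.27</Price>
    <Quantity>5</Quantity>
  </Line>
</SalesOrder>
"""


def order_lines(xml_text):
    """Flatten the order into (customer, date, part, price, quantity) rows."""
    order = ET.fromstring(xml_text.strip())
    customer = order.findtext("Customer/Name")
    date = order.findtext("OrderDate")
    return [
        (
            customer,
            date,
            line.findtext("Part"),
            float(line.findtext("Price")),
            int(line.findtext("Quantity")),
        )
        for line in order.findall("Line")
    ]


rows = order_lines(ORDER_XML)
```

Because every leaf element holds a single value and sibling order carries no meaning, each row could be inserted directly into a relational table.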
In the XML world, many documents that are rich in prose are actually data-centric. Take, for example, a page on the Amazon.com Web site that displays information about a book. Although the page is largely text, the structure of that text is highly regular: much of it is the same on every book description page, and each section of the page is limited in size. That is, the page can be built from a simple, data-centric XML document, which contains text retrieved from the database, and an XSL stylesheet. In general, any Web site that currently builds HTML pages dynamically by filling a template with database data can be replaced by data-centric XML documents of this kind and one or more XSL stylesheets.
For example, consider the following lease document:
ABC Industries agrees to lease the property at
123 Main St., Chicago, IL from XYZ
Properties for a term of not less than 18 months at a cost of USD 1000 per month.
It can be built from the following data-centric XML document and a simple stylesheet:
ABC Industries
123 Main St., Chicago, IL
XYZ Properties
18
1000
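To illustrate how prose like this can be produced from data alone, here is a minimal sketch that uses Python's string.Template as a stand-in for an XSL stylesheet. The field names (lessee, address, lessor, term, price) are assumptions introduced for this example.

```python
from string import Template

# The five data values from the data-centric document; the keys are hypothetical.
lease = {
    "lessee": "ABC Industries",
    "address": "123 Main St., Chicago, IL",
    "lessor": "XYZ Properties",
    "term": 18,
    "price": 1000,
}

# The fixed prose plays the role of the stylesheet.
LEASE_TEMPLATE = Template(
    "$lessee agrees to lease the property at $address from $lessor "
    "for a term of not less than $term months at a cost of USD $price per month."
)

text = LEASE_TEMPLATE.substitute(lease)
```

All of the variable information lives in the five fields, so the database only needs to store those, and the surrounding prose is supplied at presentation time.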
4.2 Document-centric documents
Document-centric documents are characterized by an irregular structure, coarse-grained data (that is, the smallest independent unit of data is an element with mixed content or the entire XML document), and a large amount of mixed content. The order in which sibling elements and PCDATA occur is almost always significant. Typical examples are books, email, advertisements, and most XHTML documents. Document-centric documents are usually intended for human consumption.
For example, the following product description document is document-centric:
Turkey Wrench
Full Fabrication Labs, Inc.
Like a monkey wrench, but not as big.
The turkey wrench, which comes in both right- and
left-handed versions (skyhook optional), is made of the finest
stainless steel. The Readi-grip rubberized handle quickly adapts
to your hands, even in the greasiest situations. Adjustment is
possible through a variety of custom dials.
You can:
Order your own turkey wrench
Read more about wrenches
Download the catalog
The turkey wrench costs just $19.99 and, if you
order now, comes with a hand-crafted shrimp hammer as a
bonus gift.
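Mixed content is the most mechanical of the distinguishing features described above, so it can be checked in code. The following is a rough sketch, not a complete classifier: it reports whether any element mixes child elements with non-whitespace text. The sample element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(element):
    """Return True if any element contains both child elements and real text."""
    for node in element.iter():
        children = list(node)
        if not children:
            continue  # leaf elements hold only PCDATA
        if node.text and node.text.strip():
            return True  # text before the first child
        if any(child.tail and child.tail.strip() for child in children):
            return True  # text after a child element
    return False


# Hypothetical fragments in the spirit of the two example documents.
data_centric = ET.fromstring(
    "<Line><Part>Turkey wrench</Part><Price>9.95</Price></Line>"
)
document_centric = ET.fromstring(
    "<Para>The <Name>turkey wrench</Name> costs just <Price>$19.99</Price>.</Para>"
)
```

Here has_mixed_content(data_centric) is False and has_mixed_content(document_centric) is True. A real document would also need to be judged on regularity and on whether element order matters.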
4.3 Data, Documents and Databases
In practice, the distinction between data-centric and document-centric documents is not always clear-cut. For example, a data-centric document (such as an invoice) may also contain coarse-grained, irregularly structured data (such as a description on the invoice). A document-centric document (such as a user manual) may also contain fine-grained, regularly structured data (usually metadata), such as the author and the revision date. Nevertheless, characterizing your documents as data-centric or document-centric helps you decide whether you care about data or documents, which in turn determines what kind of system you need.
To store or retrieve data, you can use a database (usually relational, object-oriented, or hierarchical) and middleware (built in or third party), or you can use an XML server (that is, a platform for building distributed applications, such as e-commerce applications, that use XML for data transfer). To store documents, you will need a content management system or a persistent DOM implementation. These systems are discussed in section 5.0, "Storing and retrieving data", and section 6.0, "Storing and retrieving documents". A detailed list of related products can be found in XML Database Products (http://www.rpbourret.com/xml/xmldatabaseprods.htm).
5.0 Storing and retrieving data
The data in a data-centric document may originate in the database (in which case you want to export it as XML) or in an XML document (in which case you want to store it in the database). An example of the former is the large amount of existing (legacy) data stored in relational databases; an example of the latter is data published as XML on the Web that you want to store in your database for further processing. Thus, depending on your needs, you may need to transfer data from XML documents to the database, from the database to XML documents, or both.
5.1 Transferring data
When you store data in a database, you often have to discard information about the document, such as the document name and DTD, as well as its physical structure, such as entity definitions and usage, the order in which attribute values occur, the way binary data is stored (Base64 encoding, unparsed entities, or some other way), CDATA sections, and encoding information. Similarly, when data is retrieved from the database, the resulting XML document usually contains no CDATA sections and no entity references other than the predefined entities lt ("<"), gt (">"), amp ("&"), apos ("'"), and quot ('"'). The order in which sibling elements and attributes appear is often simply the order in which the database returned the data.
Although this may seem surprising, it is usually reasonable. For example, suppose you use XML as a data format to transfer a sales order from one database to another. In this case, it does not matter whether the sales order number is stored before or after the sales order date in the XML document, nor does it matter whether the customer name is stored in a CDATA section, as an external entity, or directly as PCDATA. What matters is that the relevant data is transferred from the first database to the second. Thus, the data transfer software needs to consider the hierarchical structure of the data (which groups related data together), and little else.
One consequence of ignoring document information and physical structure is that "round-tripping" a document, that is, storing the data from a document in the database and then reconstructing a document from that data, often produces a document that differs from the original, even when both are compared in canonical form. Whether this is acceptable depends on your needs, and it may influence your choice of database and data transfer middleware.
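The effect can be demonstrated with a small sketch. It assumes a hypothetical order document and uses a plain dictionary in place of a database. The data survives the round trip, but the CDATA section and the original element order do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the customer name is wrapped in a CDATA section.
original = (
    "<Order><Date>981215</Date>"
    "<Customer><![CDATA[ABC & Sons]]></Customer></Order>"
)

# "Store" the document: keep only element names and values, as a table row would.
root = ET.fromstring(original)
row = {child.tag: child.text for child in root}

# "Retrieve" the document: rebuild it from the stored values. Sorting the
# column names stands in for whatever order the database happens to return.
rebuilt_root = ET.Element(root.tag)
for name in sorted(row):
    ET.SubElement(rebuilt_root, name).text = row[name]
rebuilt = ET.tostring(rebuilt_root, encoding="unicode")

# The values are identical even though the two documents are not.
same_data = {child.tag: child.text for child in ET.fromstring(rebuilt)} == row
```

In this sketch the rebuilt document lists Customer before Date and writes the ampersand as the predefined entity amp, while same_data is True.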
5.2 Mapping document structures to database structures
To transfer data between XML documents and a database, the document structure must be mapped to the database structure and vice versa. Such mappings generally fall into two categories: template-driven and model-driven.
5.2.1 Template-driven mapping
In a template-driven mapping, there is no predefined mapping between the document structure and the database structure.
Instead, you embed commands (such as SELECT statements) in a template that is processed by the data transfer middleware. For example, consider the following template (note that it does not use the syntax of any actual product), in which SELECT statements are embedded:
<?xml version="1.0"?>
<FlightInfo>
   ...
</FlightInfo>
When the data transfer middleware processes the template, each SELECT statement is replaced by the results of executing it, giving an XML document of the following form:
<?xml version="1.0"?>
<FlightInfo>
   <Flights>
      <Row>
         ...
      </Row>
      ...
   </Flights>
</FlightInfo>
Template-driven mapping can be very flexible. For example, some products let you place result set values wherever you want in the output (including as parameters in other SELECT statements), rather than simply formatting the results as in the example above. Others support programming constructs such as loops and conditionals. Some also support parameterized SELECT statements, for example with parameters passed over HTTP.
Currently, template-driven mapping is supported only for transferring data from a relational database to an XML document.
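The template-driven process described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behavior of any actual product: the SelectStmt element name, the Intro text, the fixed Flights and Row wrapper names, and the sample table are all hypothetical, and SQLite stands in for the relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template syntax: <SelectStmt> is an illustrative element name.
TEMPLATE = """<FlightInfo>
  <Intro>The following flights have available seats:</Intro>
  <SelectStmt>SELECT Airline, FltNumber FROM Flights ORDER BY FltNumber</SelectStmt>
</FlightInfo>"""


def process_template(template, conn):
    """Replace each <SelectStmt> element with the rows its query returns."""
    root = ET.fromstring(template)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "SelectStmt":
                continue
            cursor = conn.execute(child.text)
            columns = [column[0] for column in cursor.description]
            results = ET.Element("Flights")  # wrapper name fixed for simplicity
            for record in cursor:
                row = ET.SubElement(results, "Row")
                for name, value in zip(columns, record):
                    ET.SubElement(row, name).text = str(value)
            results.tail = child.tail  # keep the surrounding whitespace
            parent[index] = results
    return ET.tostring(root, encoding="unicode")


# Sample data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER)")
conn.executemany("INSERT INTO Flights VALUES (?, ?)", [("ACME", 123), ("ACME", 456)])
result = process_template(TEMPLATE, conn)
```

Note that the document structure comes entirely from the template and the column names, which is why no mapping has to be defined in advance. Because the template carries executable SQL, a design like this is only appropriate when templates come from a trusted source.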
5.2.2 Model-driven mapping
In a model-driven mapping, a data model corresponding to the structure of the XML document is mapped, explicitly or implicitly, to the structure of the database and vice versa. The disadvantage is a lack of flexibility. The advantage is ease of use: because the mapping is based on a concrete data model, the middleware can usually do much of the conversion work for the user. Because the XML produced from the database follows a single model, model-driven products are typically combined with XSL stylesheets to provide the flexibility found in template-driven systems.
Two models are commonly used to view the data in an XML document: the table model and the data-specific object model. Other models are possible. For example, by using ID and IDREF attributes, an XML document can be used to represent a graph. However, few existing middleware packages support such models.
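As a brief illustration of the graph case, the sketch below reads a hypothetical document in which the id attribute plays the role of an ID and the uses attribute plays the role of a list of IDREFs. Both attribute names are assumptions, and because no DTD is involved the parser treats them as ordinary strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "id" acts as an ID, "uses" as a space-separated
# list of references to other ids.
GRAPH_XML = """<Parts>
  <Part id="p1" uses="p2 p3"/>
  <Part id="p2" uses="p3"/>
  <Part id="p3" uses=""/>
</Parts>"""


def read_graph(xml_text):
    """Return an adjacency list mapping each id to the ids it references."""
    root = ET.fromstring(xml_text)
    return {
        part.get("id"): part.get("uses", "").split()
        for part in root.iter("Part")
    }


graph = read_graph(GRAPH_XML)
```

The references cut across the element hierarchy, which is why a model built around tables or nested objects does not capture them directly.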
5.2.2.1 Table model
Many middleware packages that transfer data between XML and relational databases use the table model, which views an XML document as a single table or a set of tables. That is, the structure of the XML document is similar to the following example, where in the case of a single table,