The Relationship Between XML and Databases

xiaoxiao, 2021-03-06

1.0 Introduction

This paper explores the relationship between XML and databases and lists software that can be used to process XML documents with a database. It is not intended to describe that software in detail; rather, the author hopes it covers the main issues involved in using XML with databases. It is slightly biased toward relational databases, because that is where the author's experience lies.

2.0 Is XML a Database?

Before discussing XML and databases, we need to answer a question that is on many people's minds: "Is XML a database?" Strictly speaking, if "XML" means an XML document, the answer is "no". An XML document contains data, but without other software to process that data it is no more a database than any other text file.

If "XML" means XML documents together with all the related XML tools and technologies, the answer is "yes", with reservations. XML provides many of the things a database needs: storage (XML documents), schemas (DTDs, XML schema languages), query languages (XQL, XML-QL, Quilt, and so on), programming interfaces (SAX, DOM), and so on. However, it lacks much of what a real database requires: efficient storage, indexes, security, transactions, data integrity, multi-user access, triggers, queries across multiple documents, and so on.

XML can therefore serve as a database in environments with modest amounts of data, few users, and low performance requirements. It does not qualify in most production environments, which have many users, strict data integrity requirements, and high performance demands. Moreover, given that databases such as dBASE and Access are both cheap and easy to use, there is little reason to use XML as a database even in the first case.

3.0 Why Use a Database?

When you consider using XML with a database, the first question to ask is why you need a database at all. Do you have existing data that you want to export? Are you looking for a place to store your Web pages? Is the database used by an e-commerce application in which XML is the data transport format? The answers to these questions directly affect your choice of database and middleware (if any).

For example, suppose you have an e-commerce application that uses XML to transfer data.
In that case, your data probably has a highly regular structure, and XML features such as entities and encodings are unimportant to you. After all, you care about the data, not about how it is physically stored in a document. If your application is relatively simple, a relational database and data transfer middleware will meet your needs; if it is large and complex, you will need a full XML-enabled development environment.

On the other hand, suppose you have a Web site built from a number of loosely structured XML documents. You need not only to manage the site but also to let users search its contents. Your documents are likely to be quite irregular, and entity usage becomes important, because the structure of these documents is the foundation of the site. In this case you need a "native XML" database that supports versioning, tracks entities, and supports a query language such as XQL.

4.0 Data versus Documents

The author believes that the most important factor in choosing a database is probably whether you are using it to store data or documents. If you want to store data, you need a database designed primarily for data storage (for example, a relational or object-oriented database) and middleware to convert between the database and XML documents. If you want to store documents, you need a content management system, which is designed specifically to store documents.

Although you can store documents in a relational or object-oriented database, you will often find that you are duplicating the work of a content management system. Similarly, although a content management system is typically built on top of an object-oriented or relational database, and it is possible to use one as a database, doing so is rarely a good fit. Whether you need to store data or documents often depends on your XML documents.
This is because XML documents fall into two categories: data-centric and document-centric.

4.1 Data-Centric Documents

Data-centric documents are characterized by a fairly regular structure, fine-grained data (that is, the smallest independent unit of data is a PCDATA-only element or an attribute), and little or no mixed content. The order in which sibling elements and PCDATA occur is usually not significant. Typical examples are XML documents containing sales orders, flight schedules, restaurant menus, and so on.

Data-centric documents are usually intended for machine consumption, and the XML in them is often incidental: it is simply a means of transporting data. For example, a sales order document with the following content is data-centric:

  ABC Industries
  123 Main St.
  Chicago  IL  60609
  981215
  Turkey wrench: Stainless steel, one-piece construction, lifetime guarantee    9.95    10
  Stuffing separator: Aluminum, one-year guarantee    13.27    5

In the XML world, many prose-rich documents are actually data-centric. Consider a page on Amazon.com that displays information about a book. Although the page is largely text, the structure of that text is highly regular: much of it is the same for every book description page, and each piece of page-specific text is limited in size. Such a page could therefore be built from a simple, data-centric XML document, containing text retrieved from the database, and an XSL stylesheet. In general, any Web site that currently builds HTML pages dynamically by filling a template with database data could be replaced by data-centric XML documents of this kind and one or more XSL stylesheets.
For example, a lease document with the following text:

  ABC Industries agrees to lease the property at 123 Main St., Chicago, IL from XYZ Properties for a term of not less than 18 (TimeUnit="Months") at a cost of 1000 (Currency="USD", TimeUnit="Months").

could be built from an XML document with the following content and a simple stylesheet:

  ABC Industries
  123 Main St., Chicago, IL
  XYZ Properties
  18
  1000

4.2 Document-Centric Documents

Document-centric documents are characterized by a less regular structure, coarser-grained data (that is, the smallest independent unit of data may be an element with mixed content or the entire XML document), and a lot of mixed content. The order in which sibling elements and PCDATA occur is almost always significant. Typical examples are books, email, advertisements, and most XHTML documents. Document-centric documents are usually intended for human consumption.
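The presence of mixed content is one of the distinguishing features named above, and it can be checked mechanically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the two sample fragments and their element names (Part, Description, Price, Para, b) are illustrative assumptions rather than markup taken from this paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments; the element names are invented for illustration.
DATA_CENTRIC = "<Part><Description>Turkey wrench</Description><Price>9.95</Price></Part>"
DOC_CENTRIC = "<Para>The turkey wrench is made of the <b>finest stainless steel</b>.</Para>"


def has_mixed_content(elem):
    """Return True if elem holds both child elements and non-whitespace text."""
    children = list(elem)
    if not children:
        return False
    # ElementTree keeps text before the first child in .text and text that
    # follows each child in that child's .tail.
    own_text = (elem.text or "").strip()
    tails = "".join((child.tail or "").strip() for child in children)
    return bool(own_text or tails)


def any_mixed_content(root):
    """Return True if any element in the tree has mixed content."""
    return any(has_mixed_content(elem) for elem in root.iter())


data_root = ET.fromstring(DATA_CENTRIC)
doc_root = ET.fromstring(DOC_CENTRIC)
print(any_mixed_content(data_root))  # False
print(any_mixed_content(doc_root))   # True
```

A check like this is only a heuristic: as section 4.3 notes, real documents often mix both styles.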

For example, a product description with the following content is document-centric:

  Turkey Wrench
  Full Fabrication Labs, Inc.
  Like a monkey wrench, but not as big.

  The turkey wrench, which comes in both right- and left-handed versions (skyhook optional), is made of the finest stainless steel. The Readi-grip rubberized handle quickly adapts to your hands, even in the greasiest situations. Adjustment is possible through a variety of custom dials.

  You can:
    Order your own turkey wrench
    Read more about wrenches
    Download the catalog

  The turkey wrench costs just $19.99 and, if you order now, comes with a hand-crafted shrimp hammer as a bonus gift.

4.3 Data, Documents, and Databases

In practice, the distinction between data-centric and document-centric documents is not strict. For example, a data-centric document (such as an invoice) may also contain coarse-grained, irregular data (such as the description of an invoiced item). A document-centric document (such as a user manual) may also contain fine-grained, regularly structured data (usually metadata), such as the author and the revision date. Even so, classifying your documents as data-centric or document-centric helps you decide whether you care about data or documents, which in turn determines what kind of system you need.

To store or retrieve data, you can use a database (usually relational, object-oriented, or hierarchical) and middleware (either built in or from a third party), or you can use an XML server (a platform for building distributed applications, such as e-commerce applications, that use XML for data transfer). To store documents, you will need a content management system or a persistent DOM implementation. These systems are discussed in section 5.0, "Storing and Retrieving Data", and section 6.0, "Storing and Retrieving Documents".
A more detailed list of related products is available in "XML Database Products" (http://www.rpbourret.com/xml/xmldatabaseprods.htm).

5.0 Storing and Retrieving Data

The data in a data-centric document may originate in a database (in which case you want to export it as XML) or in an XML document (in which case you want to store it in a database). An example of the former is the large amount of existing (legacy) data stored in relational databases; an example of the latter is data published as XML on the Web that you want to store in your own database for further processing. Depending on your needs, therefore, you may have to transfer data from XML documents to the database, from the database to XML documents, or both.

5.1 Transferring Data

When you store data in a database, you often discard information about the document, such as its name and DTD, as well as its physical structure: entity definitions and usage, the order in which attribute values and sibling elements occur, the way binary data is stored (Base64 encoding, unparsed entities, or some other method), CDATA sections, and encoding information. Similarly, when data is retrieved from the database, the resulting XML document usually contains no CDATA sections and no entity references other than the predefined entities lt ("<"), gt (">"), amp ("&"), apos ("'"), and quot ('"'). The order in which sibling elements and attributes appear is often simply the order in which the database returned the data.

Although this may seem surprising, it is usually reasonable. For example, suppose you use XML as a data format to transfer a sales order from one database to another. In this case, it does not matter whether the sales order number appears before or after the order date in the XML document, nor whether the customer's name is stored in a CDATA section, as an external entity, or directly as PCDATA. What matters is that the relevant data is transferred from the first database to the second. The data transfer software therefore needs to take the hierarchical structure of the data into account (it is this structure that groups related data together) and little else.

One consequence of ignoring document information and physical structure is that documents often cannot be "round-tripped": if you store the data from a document in the database and then rebuild a document from that data, the result often differs from the original, even when both are in canonical form. Whether this is acceptable depends on your needs, and it may influence your choice of database and data transfer middleware.
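The loss of physical structure described above is easy to observe even without a database. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in for data transfer middleware, which is an assumption made purely for illustration; the sample document and its SalesOrder wrapper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the customer name is stored in a CDATA section.
ORIGINAL = "<SalesOrder><CustName><![CDATA[ABC & Sons]]></CustName></SalesOrder>"


def round_trip(xml_text):
    """Parse a document and serialize it again, keeping only its data."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


rebuilt = round_trip(ORIGINAL)
print(rebuilt)

# The data survives; the physical structure (the CDATA section) does not.
same_data = ET.fromstring(rebuilt).find("CustName").text == "ABC & Sons"
same_text = rebuilt == ORIGINAL
print(same_data, same_text)
```

The rebuilt document carries the same customer name but expresses the ampersand with the predefined amp entity instead of a CDATA section, so a byte-for-byte comparison with the original fails.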

5.2 Mapping Document Structure to Database Structure

To transfer data between XML documents and a database, the document structure must be mapped to the database structure and vice versa. Such mappings generally fall into two categories: template-driven and model-driven.

5.2.1 Template-Driven Mappings

In a template-driven mapping, there is no predefined mapping between document structure and database structure. Instead, you embed commands in a template that is processed by the data transfer middleware. For example, consider the following template (which does not follow the syntax of any actual product), in which a SELECT statement is embedded in an element:

  The following flights have available seats:
  SELECT Airline, FltNumber, Depart, Arrive FROM Flights
  We hope one of these meets your needs

When the data transfer middleware processes this document, each SELECT statement is replaced by its results, formatted as XML:

  The following flights have available seats:
  ACME    123    Dec 12, 1998 13:43    Dec 13, 1998 01:21
  ...
  We hope one of these meets your needs

Template-driven mappings can be quite flexible. For example, some products let you place result set values wherever you want in the output (including using them as parameters in another SELECT statement), rather than simply formatting the results as in the example above. Others support programming constructs such as loops and conditionals. Still others support parameterized SELECT statements, for example with parameters passed through HTTP. Currently, template-driven mappings support transfer only from a relational database to an XML document.
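The substitution step in a template-driven mapping can be sketched in a few lines. The following example uses Python's standard sqlite3 and xml.etree.ElementTree modules; the element names (FlightInfo, Intro, SelectStmt, Conclude, Row) and the one-element-per-column output format are assumptions made for illustration and do not reflect any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template syntax: <SelectStmt> is an invented element name.
TEMPLATE = """<FlightInfo>
  <Intro>The following flights have available seats:</Intro>
  <SelectStmt>SELECT Airline, FltNumber, Depart, Arrive FROM Flights</SelectStmt>
  <Conclude>We hope one of these meets your needs</Conclude>
</FlightInfo>"""


def expand_template(template, conn):
    """Replace each <SelectStmt> child of the root with one <Row> per result row."""
    root = ET.fromstring(template)
    new_children = []
    for child in list(root):
        if child.tag != "SelectStmt":
            new_children.append(child)
            continue
        cursor = conn.execute(child.text)
        columns = [desc[0] for desc in cursor.description]
        for record in cursor:
            row = ET.Element("Row")
            for name, value in zip(columns, record):
                # One child element per column, named after the column.
                ET.SubElement(row, name).text = str(value)
            new_children.append(row)
    root[:] = new_children
    return root


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER, Depart TEXT, Arrive TEXT)"
)
conn.execute(
    "INSERT INTO Flights VALUES ('ACME', 123, 'Dec 12, 1998 13:43', 'Dec 13, 1998 01:21')"
)
result = expand_template(TEMPLATE, conn)
print(ET.tostring(result, encoding="unicode"))
```

Because the template's SQL is executed as written, a sketch like this is only appropriate for templates from a trusted author.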

5.2.2 Model-Driven Mappings

In a model-driven mapping, a data model is imposed on the structure of the XML document, and this model is mapped, explicitly or implicitly, to the structures in the database and vice versa. Its disadvantage is reduced flexibility; its advantage is ease of use, because the mapping is based on a specific data model and the middleware can usually do much of the transfer work on the user's behalf. Because data transferred from the database to XML always follows a single model, such systems are often combined with XSL to provide the flexibility found in template-driven systems.

Two models are commonly used to view the data in an XML document: the table model and the data-specific object model. Other models are possible. For example, by using ID and IDREF attributes, an XML document can describe a graph. However, most existing middleware does not support such models.

5.2.2.1 The Table Model

Many middleware packages that transfer data between XML and relational databases view the XML document as a single table or a set of tables. That is, the structure of the XML document must be similar to the following, where the outermost element (the one representing the database) does not appear in the single-table case:

  database
    table
      row
        column1 ...
        column2 ...
        ...
      ...
    ...

The term "table" is interpreted loosely to mean a single result set (when transferring data from the database to XML) or a single table or updatable view (when transferring data from XML to the database). If the data comes from more than one result set (database to XML), or the XML document contains elements nested more deeply than is needed to represent a set of tables (XML to database), then the transfer is generally not possible.

5.2.2.2 The Data-Specific Object Model

The second data model commonly used for XML documents is a tree of data-specific objects. In this model, element types generally correspond to objects, and content models, attributes, and PCDATA correspond to the properties of those objects. This model maps directly to object-oriented and hierarchical databases; it can also be mapped to relational databases by means of traditional object-relational mapping techniques or SQL 3 object views. Note that this model is not the Document Object Model (DOM). The DOM models the document itself, not the data in the document. As mentioned in section 6.1.2, the DOM is used to build content management systems on top of relational databases.

For example, the sales order document shown earlier can be viewed as a tree of objects from five classes, Orders, SalesOrder, Customer, Line, and Part, as shown in the following diagram:

             Orders
               |
           SalesOrder
          /    |    \
   Customer   Line   Line
               |      |
              Part   Part

When an XML document is modeled as a tree of data-specific objects, elements do not have to correspond to objects. For example, an element that contains only PCDATA, such as the CustName element in the sales order document, can be treated as a property, because, like a property, it holds a single scalar value. Similarly, it is sometimes useful to model an element with mixed content or element content as a property.

An obvious example is the Description element in the sales order document: although it has mixed content in the form of XHTML, it is more useful to treat the description as a single property, because its component parts have no meaning on their own.

5.3 Data Types, Null Values, Character Sets, and Other Issues

This section explores several issues that arise when data from XML documents is stored in a database. In general, how these issues are resolved is determined by the middleware you choose rather than by you, but you should be aware that they exist, because this can help you choose middleware.

5.3.1 Data Types

XML does not support data types in any meaningful sense. Except for unparsed entities, all data in an XML document is treated as text, even if it represents another data type such as a date or an integer. Typically, the data transfer middleware converts text in the XML document to other data types in the database and vice versa. However, the number of text formats recognized for a given data type is limited, for example to those supported by a particular JDBC driver. Among the many data types, dates are the most likely to cause trouble. Numbers can also cause problems, because their formats differ between international regions.

5.3.2 Binary Data

There are two common ways to store binary data in an XML document: unparsed entities and Base64 encoding (a MIME encoding that maps binary data to a subset of US-ASCII). For relational databases, both methods can cause problems, because the rules for storing and retrieving binary data can be quite strict, which may trip up the middleware. In addition, there is no standard notation for indicating that an element in an XML document contains Base64-encoded data, so the middleware may not recognize the encoding at all. Finally, the notation associated with an unparsed entity or a Base64-encoded element may be discarded when the data is stored in the database.
Therefore, if binary data is important to you, check whether your middleware supports it.

5.3.3 Null Values

In the database world, null data means data that simply does not exist. This is very different from the number 0 or a zero-length string. For example, suppose your data comes from a weather station. If the station's thermometer fails and no temperature can be read, a null value is stored in the database rather than a 0, which would clearly mean something else entirely.

In XML, the concept of a null value can be implemented with optional element types and attributes. If the value of an element type or attribute is null, the document simply does not contain that element or attribute. As in the database, an empty element or an attribute containing a zero-length string is not null: its value is a zero-length string.

When mapping between XML document structure and database structure, take particular care that optional element types and attributes correspond to nullable columns in the database and vice versa. Otherwise, you are likely to get insertion errors (when transferring data to the database) or invalid documents (when transferring data from the database).

The XML community is likely to be more flexible than the database community about what counts as null. In particular, many XML users are likely to treat empty elements, or attributes containing empty strings, as null. You therefore need to consider how your middleware handles this situation. Some middleware lets the user choose what constitutes a null value in an XML document.

5.3.4 Character Sets

By definition, an XML document can contain any Unicode character except some control characters. Unfortunately, many databases offer limited or no Unicode support and need special configuration to handle non-ASCII character data.
If your data contains non-ASCII characters, be sure to verify that both your database and your middleware can handle them.
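Returning to the null-value mapping of section 5.3.3, the difference between an absent element, an empty element, and a zero can be made concrete with a short sketch. It uses Python's standard sqlite3 and xml.etree.ElementTree modules; the Readings document, the Reading table, and the empty_is_null switch are hypothetical and simply mimic the kind of option some middleware offers.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical weather-station readings; element names are invented.
READINGS = """<Readings>
  <Reading><Station>A</Station><Temperature>0</Temperature></Reading>
  <Reading><Station>B</Station></Reading>
  <Reading><Station>C</Station><Temperature/></Reading>
</Readings>"""


def element_value(parent, tag, empty_is_null=False):
    """Map an optional child element to a column value.

    An absent element becomes None (SQL NULL). An empty element becomes a
    zero-length string unless empty_is_null is set.
    """
    child = parent.find(tag)
    if child is None:
        return None
    text = child.text or ""
    if text == "" and empty_is_null:
        return None
    return text


def load(conn, xml_text, empty_is_null=False):
    """Store each reading in a table whose Temperature column is nullable."""
    conn.execute("CREATE TABLE Reading (Station TEXT NOT NULL, Temperature TEXT)")
    for reading in ET.fromstring(xml_text).iter("Reading"):
        conn.execute(
            "INSERT INTO Reading VALUES (?, ?)",
            (
                element_value(reading, "Station"),
                element_value(reading, "Temperature", empty_is_null),
            ),
        )
    query = "SELECT Station, Temperature FROM Reading ORDER BY Station"
    return conn.execute(query).fetchall()


strict_rows = load(sqlite3.connect(":memory:"), READINGS)
lenient_rows = load(sqlite3.connect(":memory:"), READINGS, empty_is_null=True)
print(strict_rows)
print(lenient_rows)
```

Station A's reading of 0 is stored as a value in both cases, station B's missing element always becomes NULL, and station C's empty element becomes either an empty string or NULL depending on the chosen policy.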

5.3.5 Processing Instructions

Processing instructions are not part of the "data" in an XML document, so many middleware products do not handle them. The difficulty is that processing instructions can occur virtually anywhere in a document, so they fit poorly when the XML document structure is mapped strictly to a database structure. As a result, it is hard for middleware to decide where to store them and when to retrieve them. If processing instructions and round-tripping of documents are important to you, check how your middleware handles them.

5.3.6 Storing Markup

As mentioned in the discussion of the data-specific object model, it is sometimes useful to store an element with element content or mixed content in the database without parsing it further. The most common way to do this is simply to store the markup itself in the database. Unfortunately, this causes a problem when the data is retrieved: it is impossible to tell whether markup characters in the database are real markup or represent entity references to markup characters, such as the lt and gt entities. For example, consider a Description element with the following content:

  confusing example:

which is stored in the database as:

  confusing example:

The database cannot tell which of the angle-bracketed strings in the stored value are markup and which are text. There are several possible solutions, such as flagging real markup in some way or using entities for non-markup uses of markup characters. You must then check whether the chosen approach is compatible with other applications that use the data. For example, a query for the less-than sign ("<") would also have to account for the lt entity ("&lt;") stored in the database.

5.4 Generating DTDs from Database Schemas and Vice Versa

A common question when transferring data between XML documents and a database is how to generate an XML DTD from a database schema, and how to generate a database schema from an XML DTD.
In short, this is a fairly straightforward operation, but the results usually fall somewhat short of what many users expect. (Note that this is usually a one-time operation: most applications, and virtually all vertical applications, work with a known set of DTDs and relational schemas. The obvious exceptions are tools that store arbitrary XML documents in a relational database or publish relational data as XML documents; in the latter case, a DTD is not obviously needed.)

The following (simplified) procedure generates a relational schema from a DTD, in which each element type that has element content or mixed content has its own table. For each single-valued attribute of that element type, and for each singly occurring child element type with PCDATA-only content, create a column in the table. If the child element type or attribute is optional, make the column nullable. For each multi-valued attribute, and for each multiply occurring child element type with PCDATA-only content, create a separate table to store the values, linked to the parent table through the parent table's primary key. For each child element that itself has element content or mixed content, link the parent element's table to the child element's table through the parent table's primary key.

The following (simplified) procedure generates a DTD from a relational schema: for each table, create an element type.
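The DTD-to-schema rules above can be sketched as a small generator. In the example below the DTD is not parsed; it is summarized by hand as a Python dictionary, and the format of that summary, the element names SONumber, OrderDate, Note, and Quantity, and the PK/FK column naming are all assumptions made for illustration. The generated statements are run against an in-memory SQLite database only to confirm that they are well-formed SQL.

```python
import sqlite3

# Hypothetical, hand-written summary of a DTD; a real tool would parse the DTD.
# Each element type with element content lists its attributes and children as
# (name, kind, optional, repeats), where kind is "pcdata" or "complex".
DTD = {
    "SalesOrder": [
        ("SONumber", "pcdata", False, False),
        ("OrderDate", "pcdata", True, False),
        ("Note", "pcdata", True, True),
        ("Line", "complex", False, True),
    ],
    "Line": [
        ("Quantity", "pcdata", False, False),
    ],
}


def generate_schema(dtd):
    """Return CREATE TABLE statements following the simplified rules in the text."""
    # One table, with a primary key column, per element type with element content.
    tables = {name: [f"{name}PK INTEGER PRIMARY KEY"] for name in dtd}
    extra = []
    for parent, members in dtd.items():
        for name, kind, optional, repeats in members:
            if kind == "pcdata" and not repeats:
                # Single-valued attribute or PCDATA-only child: a column,
                # nullable when the attribute or child is optional.
                null = "" if optional else " NOT NULL"
                tables[parent].append(f"{name} TEXT{null}")
            elif kind == "pcdata":
                # Multi-valued: a separate table linked by the parent's key.
                extra.append(
                    f"CREATE TABLE {parent}_{name} ("
                    f"{parent}FK INTEGER REFERENCES {parent}({parent}PK), "
                    f"{name} TEXT)"
                )
            else:
                # Child with element content: link its table to the parent's key.
                tables[name].append(
                    f"{parent}FK INTEGER REFERENCES {parent}({parent}PK)"
                )
    statements = [
        f"CREATE TABLE {name} ({', '.join(cols)})" for name, cols in tables.items()
    ]
    return statements + extra


statements = generate_schema(DTD)
conn = sqlite3.connect(":memory:")
for statement in statements:
    conn.execute(statement)  # raises if the generated SQL is malformed
    print(statement)
```

Every column is typed as TEXT here, which reflects the point in section 5.3.1 that a DTD carries no real data type information; a usable schema would need types supplied from elsewhere.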

