Mapping DTDs to Databases (Part 1)


Original author: Ronald Bourret

May 09, 2001

Translation: 蝉 退 退

Translator's statement: the translator makes no guarantees about this translation, claims no rights to it, and accepts no responsibility or obligation for it.

Original:

http://www.xml.com/pub/a/2001/05/09/dtdtodbs.html

Table of Contents

1. Overview
2. Table-Based Mapping
3. Object-Relational Mapping
3.1. The Basic Mapping
3.1.1. Mapping DTDs to Object Schemas
3.1.2. Mapping Object Schemas to Database Schemas
3.1.3. Miscellaneous
3.2. Mapping Complex Content Models
3.2.1. Mapping Sequences
3.2.2. Mapping Choices
3.2.3. Mapping Repeated Children
3.2.4. Mapping Optional Children
3.2.5. Mapping Subgroups
3.3. Mapping Mixed Content
3.4. Mapping Order
3.4.1. Sibling Order, Hierarchical Order, and Document Order
3.4.2. Mapping Sibling Order
3.4.2.1. Order Attributes and Columns
3.4.2.2. Storing Order in the Mapping
3.5. Mapping Attributes
3.5.1. Mapping Single-Valued and Multi-Valued Attributes
3.5.2. Mapping ID/IDREF(S) Attributes
3.5.3. Mapping Notation Attributes
3.5.4. Mapping ENTITY/ENTITIES Attributes
3.6. Alternate Mappings
3.6.1. Mapping Complex Element Types to Scalar Types
3.6.2. Mapping Scalar Properties to Property Tables
3.7. Conclusion
4. Generating Schemas
4.1. Generating Relational Database Schemas from DTDs
4.2. Generating DTDs from Database Schemas
5. Mapping XML Schemas to Databases
6. Topics

1. Overview

A common question in the XML community is how to map XML to databases. This article discusses two mappings: a table-based mapping and an object-relational mapping. Both mappings model the data in XML documents rather than the documents themselves. This makes them a good choice for data-centric documents and a poor choice for document-centric documents. The table-based mapping cannot handle mixed content at all, and the object-relational mapping handles mixed content very inefficiently.

Both mappings are commonly used as the basis for software that transfers data between XML documents and databases, especially relational databases. An important property in this respect is that they are bidirectional. That is, they can be used to transfer data from XML documents to the database and from the database to XML documents. One consequence is that they can serve as canonical mappings on top of which XML query languages can be built over non-XML databases. The canonical mappings define virtual XML documents that can be queried with something like XQuery.

In addition to transferring data between XML documents and databases, the first part of the object-relational mapping is used in "data binding", which is the marshalling and unmarshalling of data between XML documents and objects.

2. Table-based mapping

There is an obvious mapping between the following XML document and table:

   <A>
      <B>
         <C>ccc</C>               Table A
         <D>ddd</D>               -------
         <E>eee</E>             C     D     E
      </B>                     ---   ---   ---
      <B>               <=>    ...   ...   ...
         <C>fff</C>            ccc   ddd   eee
         <D>ggg</D>            fff   ggg   hhh
         <E>hhh</E>            ...   ...   ...
      </B>
   </A>

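It is called a table-based mapping. It views the document as a single table or a set of tables. That is, the structure of the document must be either

   <Table>
      <Row>
         <Column_1>...</Column_1>
         ...
         <Column_n>...</Column_n>
      </Row>
      ...
      <Row>
         <Column_1>...</Column_1>
         ...
         <Column_n>...</Column_n>
      </Row>
   </Table>

or

   <Tables>
      <Table_1>
         <Row>
            <Column_1>...</Column_1>
            ...
            <Column_n>...</Column_n>
         </Row>
         ...
      </Table_1>
      ...
      <Table_n>
         <Row>
            <Column_1>...</Column_1>
            ...
            <Column_n>...</Column_n>
         </Row>
         ...
      </Table_n>
   </Tables>

with the caveat that column data can be represented either as PCDATA-only elements (as shown) or as attributes.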

The obvious advantage of this mapping is its simplicity. Because it matches the structure of tables and result sets in a relational database, code based on this mapping is easy to write, fast, and scales well. It is quite useful for certain applications, such as transferring data between databases one table at a time.

This mapping has several disadvantages. First, it works only with a very small subset of XML documents. In addition, it does not preserve physical structure (such as character and entity references, CDATA sections, character encodings, and the standalone declaration), document information (such as the document type or DTD), comments, or processing instructions.

The table-based mapping is commonly used by middleware to transfer data between XML documents and relational databases. It is also used in some Web application servers to return result sets as XML.
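As a rough illustration of the table-based mapping, the following Python sketch loads a document in the single-table shape into an in-memory SQLite database. It assumes the root element names the table, each child of the root is a row, and each grandchild is a PCDATA-only column; the function name load_table and the choice of SQLite are assumptions made for this example, not something the article prescribes.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document in the single-table shape:
# root = table, children = rows, grandchildren = PCDATA-only columns.
DOC = """
<A>
  <B><C>ccc</C><D>ddd</D><E>eee</E></B>
  <B><C>fff</C><D>ggg</D><E>hhh</E></B>
</A>
"""

def load_table(conn, xml_text):
    """Create a table named after the root element and insert one row per child."""
    root = ET.fromstring(xml_text)
    rows = list(root)
    # Column names come from the first row; element names are trusted here
    # because they are interpolated into the SQL text.
    columns = [col.tag for col in rows[0]]
    column_defs = ", ".join(f"{name} TEXT" for name in columns)
    conn.execute(f"CREATE TABLE {root.tag} ({column_defs})")
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        values = [row.findtext(name) for name in columns]
        conn.execute(f"INSERT INTO {root.tag} VALUES ({placeholders})", values)
    return columns

conn = sqlite3.connect(":memory:")
columns = load_table(conn, DOC)
stored = conn.execute("SELECT C, D, E FROM A ORDER BY C").fetchall()
```

Anything that does not fit this rigid shape (nested structure, mixed content, comments) has nowhere to go in this sketch, which is the limitation the section describes.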

3. Object-Relational Mapping

Because the table-based mapping works only with a limited subset of XML documents, some middleware tools, most XML-enabled relational databases, and most XML-enabled object servers use a more sophisticated mapping, called the object-relational mapping. It models the XML document as a tree of objects that are specific to the data in the document, then maps these objects to the database.

(The name "object-relational" is actually a misnomer; a better name would be object-based mapping. This is because the objects can be mapped to non-relational databases, such as object-oriented or hierarchical databases, or not mapped to a database at all, as is done in data binding systems. However, because object-relational is a well-known term and this mapping is most commonly used with relational databases, that term is used here. In addition, all examples use relational tables.)

To understand the object-relational mapping, it is best to look at some simple examples. To start with, notice that there is an obvious mapping between the following XML document, object, and row in a table:

         XML                 Objects                Tables
   ==============        ==============        ================

                                                   Table A
   <A>                    object A {               -------
      <B>bbb</B>             B = "bbb"           B     C     D
      <C>ccc</C>    <=>      C = "ccc"    <=>   ---   ---   ---
      <D>ddd</D>             D = "ddd"          ...   ...   ...
   </A>                   }                     bbb   ccc   ddd
                                                ...   ...   ...

Similarly, there is an obvious mapping between the following element type definitions, class, and table schema:

            DTD                      Classes                   Tables
   ========================      ===============      ===========================

                                  class A {            CREATE TABLE A (
   <!ELEMENT A (B, C, D)>            String b;            B VARCHAR(10) NOT NULL,
   <!ELEMENT B (#PCDATA)>   <=>      String c;    <=>     C VARCHAR(10) NOT NULL,
   <!ELEMENT C (#PCDATA)>            String d;            D VARCHAR(10) NOT NULL
   <!ELEMENT D (#PCDATA)>         }                    )

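As a more complex example, consider the following XML document:

   <SalesOrder>
      <Number>1234</Number>
      <Customer>Gallagher Industries</Customer>
      <Date>29.10.00</Date>
      <Item Number="1">
         <Part>A-10</Part>
         <Quantity>12</Quantity>
         <Price>10.95</Price>
      </Item>
      <Item Number="2">
         <Part>B-43</Part>
         <Quantity>600</Quantity>
         <Price>3.99</Price>
      </Item>
   </SalesOrder>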

It is mapped to the following objects:

            object SalesOrder {
               number = 1234;
               customer = "Gallagher Industries";
               date = 29.10.00;
               items = {ptrs to Item objects};
            }
                 /            \
                /              \
               /                \
   object Item {              object Item {
      number = 1;                number = 2;
      part = "A-10";             part = "B-43";
      quantity = 12;             quantity = 600;
      price = 10.95;             price = 3.99;
   }                          }

These are then mapped to rows in the following tables:

   SalesOrders
   -----------
   Number   Customer               Date
   ------   --------------------   --------
   1234     Gallagher Industries   29.10.00
   ...      ...                    ...
   ...      ...                    ...

   Items
   -----
   SONumber   Item   Part   Quantity   Price
   --------   ----   ----   --------   -----
   1234       1      A-10   12         10.95
   1234       2      B-43   600        3.99
   ...        ...    ...    ...        ...

All of these are object-relational mappings from a DTD to relational database tables.
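The sales order example can be sketched in Python as follows. This is only an illustration under assumptions: the element layout (SalesOrder with Number, Customer, Date children and Item children carrying a Number attribute) is inferred from the values and tables shown above, and the helper names unmarshal and to_rows are invented for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Assumed layout of the sales order document.
DOC = """<SalesOrder>
  <Number>1234</Number>
  <Customer>Gallagher Industries</Customer>
  <Date>29.10.00</Date>
  <Item Number="1"><Part>A-10</Part><Quantity>12</Quantity><Price>10.95</Price></Item>
  <Item Number="2"><Part>B-43</Part><Quantity>600</Quantity><Price>3.99</Price></Item>
</SalesOrder>"""

@dataclass
class Item:
    number: int
    part: str
    quantity: int
    price: float

@dataclass
class SalesOrder:
    number: int
    customer: str
    date: str
    items: List[Item] = field(default_factory=list)

def unmarshal(xml_text):
    """Step 1: XML document -> data-specific objects."""
    root = ET.fromstring(xml_text)
    order = SalesOrder(int(root.findtext("Number")),
                       root.findtext("Customer"),
                       root.findtext("Date"))
    for elem in root.findall("Item"):
        order.items.append(Item(int(elem.get("Number")),
                                elem.findtext("Part"),
                                int(elem.findtext("Quantity")),
                                float(elem.findtext("Price"))))
    return order

def to_rows(order):
    """Step 2: objects -> one SalesOrders row plus one Items row per item."""
    order_row = (order.number, order.customer, order.date)
    item_rows = [(order.number, i.number, i.part, i.quantity, i.price)
                 for i in order.items]
    return order_row, item_rows

order = unmarshal(DOC)
order_row, item_rows = to_rows(order)
```

The sales order number is copied into every item row so that the Items rows can be joined back to their SalesOrders row.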

3.1. The Basic Mapping

The object-relational mapping is performed in two steps. First, an XML schema (here, a DTD) is mapped to an object schema; then the object schema is mapped to a database schema. The two steps can be combined into a direct DTD-to-database mapping, as is done in most software today.

When considering this mapping, it is important to understand that the objects involved are specific to each DTD and are not the objects of the DOM. In particular, these objects model the data in the XML document, while the DOM models the structure of the XML document. For example, the following shows the tree of data-specific objects and the tree of DOM objects for the sales order document:

        SalesOrder                         Document
         /      \                              |
      Item      Item                        Element
                            ________________/ / | \ \________________
                           /         ________/  |  \________         \
                          /         /           |           \         \
                      Element    Element     Element      Element    Element
                         |          |           |          / | \ \     |
                       Text       Text        Text        /  |  \ \   etc.
                                                 ________/   |   \ \________
                                                /            |    \         \
                                            Element      Element  Element   Attr
                                               |            |        |        |
                                             Text         Text     Text     Text

This distinction is important when you consider how the data is stored in the database. To store the data-specific objects, you need SalesOrders and Items tables; to store the DOM objects, you need Document, Element, Text, and Attr tables. Most importantly, non-XML applications can use the data-specific tables but cannot use the DOM-specific tables.

3.1.1. Mapping DTDs to Object Schemas

The mapping starts by recognizing that element types are data types. Element types that contain only PCDATA are called simple element types; the term is taken from W3C XML Schema. They hold a single data value and are equivalent to scalar data types in an object-oriented programming language. (Note that "scalar" here means "consisting of a single data value." In some languages, data types that are scalar in this sense are represented by objects. The most notable example is the String data type in Java.) Attribute types are also simple types.

Element types that have element content or mixed content, or that have attributes, are called complex element types; this term is also taken from XML Schema. They hold a structured value and are equivalent to classes in an object-oriented programming language or structs in C. Note that an element type with empty content and attributes is still "complex." This is because attributes also provide structure and are roughly equivalent to PCDATA-only child elements.

The object-relational mapping first maps simple types to scalar data types. For example, the element type Title might be mapped to a String and the element type Price might be mapped to a float. It then maps complex types to classes, with each element type referenced in the content model of the complex type mapped to a property of that class. The data type of each property is the data type to which the referenced element type is mapped. For example, a reference to the Title element is mapped to a String property, and a reference to the Price element is mapped to a float property. References to complex element types are mapped to pointers/references to an object of the class to which the complex element type is mapped.

The last part of the mapping is to map attributes to properties, with the data type of the property determined by the data type of the attribute. Note that attributes are equivalent to references to element types in a content model. This is because, like references in a content model, they are local to a given element type. They differ only conceptually, in that an attribute type is defined locally rather than at a global (DTD-wide) level, as is the case for element types.

For example, in the following, the simple element types B, D, and E and the attribute F are all mapped to Strings, and the complex element types A and C are mapped to classes A and C. The content models and attributes of A and C are mapped to properties of classes A and C. The references to B, D, and E in the content models of A and C are mapped to String properties (because those types are mapped to Strings), and the attribute F is also mapped to a String property. The reference to C in the content model of A is mapped to a property whose type is a pointer/reference to an object of class C, because the element type C is mapped to class C.

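              DTD                             Classes
   ============================           ===============

   <!ELEMENT A (B, C)>                     class A {
   <!ELEMENT B (#PCDATA)>                     String b;
   <!ATTLIST A                     ==>        C      c;
             F CDATA #REQUIRED>               String f;
                                           }

   <!ELEMENT C (D, E)>                     class C {
   <!ELEMENT D (#PCDATA)>          ==>        String d;
   <!ELEMENT E (#PCDATA)>                     String e;
                                           }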
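The mapping just described can be sketched in Python, with dataclasses standing in for the generated classes. The element and attribute names A through F come from the example; the helper build_a and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class C:
    d: str  # simple element type D -> scalar property
    e: str  # simple element type E -> scalar property

@dataclass
class A:
    b: str  # reference to simple element type B -> scalar property
    c: C    # reference to complex element type C -> reference to a C object
    f: str  # attribute F -> scalar property

def build_a(elem):
    """Build an A object (and its nested C object) from an <A> element."""
    c_elem = elem.find("C")
    return A(b=elem.findtext("B"),
             c=C(d=c_elem.findtext("D"), e=c_elem.findtext("E")),
             f=elem.get("F"))

a = build_a(ET.fromstring(
    '<A F="fff"><B>bbb</B><C><D>ddd</D><E>eee</E></C></A>'))
```

Note that the attribute F and the child element B end up as the same kind of property, which reflects the point that attributes are equivalent to references to simple element types.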

One point worth repeating is that references to element types in a content model are mapped differently from the element types themselves. Element types are mapped to data types, while references to element types are mapped to properties of structured data types (classes). The difference becomes clear when you consider an element type that is referenced in two different content models. In this case, each reference must be mapped separately, and the data type of the resulting property is determined by the data type to which the element type itself (not the reference) is mapped.

For example, consider the Title and Section element types in the following. Both are referenced in the content models of Chapter and Appendix. Each reference is mapped separately for each parent element type. The properties to which references to Title are mapped have the data type String, because Title contains only PCDATA and is mapped to a String. The properties to which references to Section are mapped have the data type pointer/reference to a Section object, because the Section element type is complex and is mapped to a Section class.

                  DTD                                Classes
   =====================================        ======================

   <!ELEMENT Chapter (Title, Section+)>          class Chapter {
   <!ELEMENT Title (#PCDATA)>             ==>       String    title;
                                                    Section[] sections;
                                                 }

   <!ELEMENT Appendix (Title, Section)>          class Appendix {
                                          ==>       String  title;
                                                    Section section;
                                                 }

Another point worth repeating is that simple element types and attributes can be mapped to types other than String. For example, an element type named Quantity might be mapped to an integer. When mapping from a DTD, this requires human intervention, because the target data type cannot be predicted from a PCDATA-only element type. When mapping from an XML Schema, however, the target type is known, because XML Schema has data types.

3.1.2. Mapping Object Schemas to Database Schemas

In the second part of the object-relational mapping, classes are mapped to tables (known as class tables), scalar properties are mapped to columns, and pointer/reference properties are mapped to primary key/foreign key relationships. For example:

       Classes                   Tables
   ===============          =================

   class A {                 Table A:
      String b;                 Column b
      C      c;       ==>       Column c_fk
      String f;                 Column f
   }

   class C {                 Table C:
      String d;       ==>       Column d
      String e;                 Column e
   }                            Column c_pk

Note that the tables are joined by a primary key (C.c_pk) and a foreign key (A.c_fk). Because the relationship between the parent and child elements is one-to-one, the primary key can be in either table. If the relationship is one-to-many, the primary key must be on the "one" side, regardless of whether this is the parent or the child. For example, if a SalesOrder element contains multiple Item elements, the primary key must be in the SalesOrder table (the parent). However, if each Item element contains a Part element, the primary key must be in the Part table (the child), because a single part can occur in multiple items.

A primary key column can be created as part of the mapping, as is the case for column c_pk, or an existing column or columns can be used as the primary key. For example, if the SalesOrder element type has a Number child element, that child could be mapped to the primary key column.

If a primary key column is created as part of the mapping, its value must be generated by the database or by the transfer software. Although this is generally considered better database design than using data columns as keys, it has a disadvantage when used with XML: the generated key is meaningless outside the source database. Therefore, when data with a generated key is transferred to an XML document, the document contains either a meaningless primary key (if the key is transferred) or no primary key at all (if it is not). In the latter case, it may be impossible to identify where the data came from, which is a problem if the data is modified and returned to the database as an XML document.
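A minimal sketch of the class tables and their primary key/foreign key relationship, using Python's sqlite3 module. The table and column names follow the example above; the use of SQLite, the NOT NULL constraints, and the helper store_a are assumptions made for illustration. The key c_pk is generated by the database, which is exactly the kind of value that has no meaning outside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Class table for C; c_pk is created as part of the mapping.
    CREATE TABLE C (c_pk INTEGER PRIMARY KEY, d TEXT NOT NULL, e TEXT NOT NULL);
    -- Class table for A; the reference property c becomes the foreign key c_fk.
    CREATE TABLE A (b TEXT NOT NULL,
                    c_fk INTEGER NOT NULL REFERENCES C(c_pk),
                    f TEXT NOT NULL);
""")

def store_a(conn, b, f, d, e):
    """Store one A object and its C object, linking them through the generated key."""
    cur = conn.execute("INSERT INTO C (d, e) VALUES (?, ?)", (d, e))
    c_pk = cur.lastrowid  # key generated by the database
    conn.execute("INSERT INTO A (b, c_fk, f) VALUES (?, ?, ?)", (b, c_pk, f))
    return c_pk

key = store_a(conn, "bbb", "fff", "ddd", "eee")
joined = conn.execute(
    "SELECT A.b, A.f, C.d, C.e FROM A JOIN C ON A.c_fk = C.c_pk").fetchone()
```

The join reassembles the data of the original A element from the two class tables.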

3.1.3. Miscellaneous

Before continuing with the more complex parts of the mapping, two things need to be mentioned. First, names can be changed during the mapping. That is, the DTD, the object schema, and the relational schema can all use different names. For example, the following DTD uses different names than the following class:

               DTD                              Classes
   ================================        =====================

   <!ELEMENT Part (Number, Price)>          class PartClass {
   <!ELEMENT Number (#PCDATA)>       ==>       String numberProp;
   <!ELEMENT Price (#PCDATA)>                  float  priceProp;
                                            }

The class, in turn, uses different names than the following table:

         Classes                      Tables
   =====================        ===================

   class PartClass {             Table PRT
      String numberProp;   ==>      Column PRTNUM
      float  priceProp;             Column PRTPRICE
   }

Second, the objects involved in the mapping are conceptual. That is, there is no need to instantiate them when transferring data between an XML document and a relational database. (This is not to say that the objects cannot be instantiated. Whether they are depends on the application.)

3.2. Mapping Complex Content Models

The content models seen so far are relatively simple. What about more complex content models?

In this section, we consider the different parts of a content model one at a time. Mapping a content model that combines all of them is left as an exercise for the reader. (I've always wanted to say that.)

3.2.1. Mapping Sequences

As has already been seen, each element type referenced in a sequence is mapped to a property, which is in turn mapped to a column or to a primary key/foreign key relationship. For example:

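            DTD                     Classes                Tables
   ======================       ==============        ================

   <!ELEMENT A (B, C)>           class A {             Table A
   <!ELEMENT B (#PCDATA)>  ==>      String b;    ==>      Column b
                                    C      c;             Column c_fk
                                 }

   <!ELEMENT C (D, E)>           class C {             Table C
   <!ELEMENT D (#PCDATA)>  ==>      String d;    ==>      Column d
   <!ELEMENT E (#PCDATA)>           String e;             Column e
                                 }                        Column c_pk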

3.2.2. Mapping Choices

As with sequences, each element type referenced in a choice is mapped to a property, which is then mapped to a column or to a primary key/foreign key relationship. The only difference from sequences is that the properties and columns can be null. For example, suppose we change the sequence in the content model of A to a choice. The mapping from the DTD to the object schema would then be:

            DTD                          Classes
   ======================        ========================

   <!ELEMENT A (B | C)>           class A {
   <!ELEMENT B (#PCDATA)>  ==>       String b;  // nullable
                                     C      c;  // nullable
                                  }

   <!ELEMENT C (D, E)>            class C {
   <!ELEMENT D (#PCDATA)>  ==>       String d;
   <!ELEMENT E (#PCDATA)>            String e;
                                  }

The mapping from the object schema to the database schema would be:

          Classes                              Tables
   ========================        ================================

   class A {                        Table A
      String b;  // nullable  ==>      Column b     // nullable
      C      c;  // nullable           Column c_fk  // nullable
   }

   class C {                        Table C
      String d;               ==>      Column d     // not nullable
      String e;                        Column e     // not nullable
   }                                   Column c_pk  // not nullable

To see why, consider the following XML document, which conforms to the DTD above. Because the choice in the content model of A requires either B or C (but not both) as a child element, one of the two corresponding properties (and columns) will always be null.

         XML                   Objects                Tables
   ================        ==============        ===============

                                                    Table A
                                                    -------
   <A>                      object a {             b      c_fk
      <B>bbb</B>     ==>       b = "bbb"    ==>    ---    ----
   </A>                        c = null            bbb    NULL
                            }                      ...    ...

Note that if the primary key used to join the tables is in table A, the corresponding foreign key column in table C cannot be null. If an A element does have a C child element, this column must have a value in order to join it to the correct row in table A. If an A element does not have a C child element, there is simply no row in table C.
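The effect of a choice on nullability can be sketched with Python and SQLite. The schema follows the tables above, with the primary key in table C; the helper store_a and the two sample documents are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE C (c_pk INTEGER PRIMARY KEY, d TEXT NOT NULL, e TEXT NOT NULL);
    -- Both columns are nullable because A's content model is a choice (B | C).
    CREATE TABLE A (b TEXT, c_fk INTEGER REFERENCES C(c_pk));
""")

def store_a(conn, xml_text):
    """Store an <A> element whose child is either B or C, never both."""
    a = ET.fromstring(xml_text)
    b = a.findtext("B")  # None when the B child is absent
    c = a.find("C")
    c_fk = None
    if c is not None:
        cur = conn.execute("INSERT INTO C (d, e) VALUES (?, ?)",
                           (c.findtext("D"), c.findtext("E")))
        c_fk = cur.lastrowid
    conn.execute("INSERT INTO A (b, c_fk) VALUES (?, ?)", (b, c_fk))

store_a(conn, "<A><B>bbb</B></A>")
store_a(conn, "<A><C><D>ddd</D><E>eee</E></C></A>")
rows = conn.execute("SELECT b, c_fk FROM A ORDER BY rowid").fetchall()
c_count = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]
```

In every stored row of table A exactly one of the two columns is NULL, and table C receives a row only for the document that actually contains a C child.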

3.2.3. Mapping Repeated Children

Children that can occur more than once in their parent are called repeated children. They are mapped to multi-valued properties and then either to multiple columns in a table or to a separate table, known as a property table.

If a content model contains repeated references to a simple element type, these references are mapped to a single property that is an array of known size. This property can be mapped either to multiple columns in a table or to a property table. For example, the following shows how repeated references are mapped to multiple columns in a table:

              DTD                       Classes                 Tables
   =========================       ================        ================

   <!ELEMENT A (B, B, B, C)>        class A {               Table A
   <!ELEMENT B (#PCDATA)>     ==>      String[] b;   ==>       Column b1
   <!ELEMENT C (#PCDATA)>              String   c;             Column b2
                                    }                          Column b3
                                                               Column c

If the + or * operator is applied to a reference, the reference is again mapped to a single property, this time an array of unknown size. Because the number of values can be arbitrarily large, this property must be mapped to a property table, which contains one row for each value. The property table is joined to the class table by a primary key/foreign key relationship, with the primary key in the class table. For example:

            DTD                      Classes                 Tables
   ======================       ================        ================

   <!ELEMENT A (B+, C)>          class A {               Table A
   <!ELEMENT B (#PCDATA)>  ==>      String[] b;   ==>       Column a_pk
   <!ELEMENT C (#PCDATA)>           String   c;             Column c
                                 }
                                                         Table B
                                                            Column a_fk
                                                            Column b
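A property table can be sketched as follows in Python with SQLite. The table and column names follow the example; the helper store_a and the sample document are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Class table: the primary key lives here.
    CREATE TABLE A (a_pk INTEGER PRIMARY KEY, c TEXT NOT NULL);
    -- Property table: one row per value of the repeated child B.
    CREATE TABLE B (a_fk INTEGER NOT NULL REFERENCES A(a_pk), b TEXT NOT NULL);
""")

def store_a(conn, xml_text):
    """Store an <A> element with any number of B children and one C child."""
    a = ET.fromstring(xml_text)
    cur = conn.execute("INSERT INTO A (c) VALUES (?)", (a.findtext("C"),))
    a_pk = cur.lastrowid
    conn.executemany("INSERT INTO B (a_fk, b) VALUES (?, ?)",
                     [(a_pk, b.text) for b in a.findall("B")])
    return a_pk

a_pk = store_a(conn, "<A><B>b1</B><B>b2</B><B>b3</B><C>ccc</C></A>")
values = [row[0] for row in conn.execute(
    "SELECT b FROM B WHERE a_fk = ? ORDER BY rowid", (a_pk,))]
```

Unlike the fixed columns b1, b2, b3, the property table needs no schema change when a document contains a fourth or a hundredth B element.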

3.2.4. Mapping Optional Children

Optional children are mapped to nullable properties and then to nullable columns. This has already been seen for children in a choice group, as in the following mapping from a DTD to an object schema:

             DTD                           Classes
   ========================        ========================

   <!ELEMENT A (B | C | D)>         class A {
   <!ELEMENT B (#PCDATA)>    ==>       String b;  // nullable
   <!ELEMENT C (#PCDATA)>              String c;  // nullable
                                       D      d;  // nullable
                                    }

   <!ELEMENT D (E, F)>              class D {
   <!ELEMENT E (#PCDATA)>    ==>       String e;
   <!ELEMENT F (#PCDATA)>              String f;
                                    }

and in the mapping from the object schema to the database schema:

          Classes                              Tables
   ========================        ================================

   class A {                        Table A
      String b;  // nullable  ==>      Column b     // nullable
      String c;  // nullable           Column c     // nullable
      D      d;  // nullable           Column d_fk  // nullable
   }

   class D {                        Table D
      String e;               ==>      Column e     // not nullable
      String f;                        Column f     // not nullable
   }                                   Column d_pk  // not nullable

It also applies when the ? or * operator is applied to a reference, as in the following mapping from a DTD to an object schema:

            DTD                           Classes
   ======================        ==========================

   <!ELEMENT A (B?, C*)>          class A {
   <!ELEMENT B (#PCDATA)>  ==>       String b;    // nullable
   <!ELEMENT C (#PCDATA)>            String c[];  // nullable
                                  }

and in the mapping from the object schema to the database schema:

           Classes                               Tables
   ==========================        ================================

   class A {                          Table A
      String b;    // nullable  ==>      Column b     // nullable
      String c[];  // nullable           Column a_pk  // not nullable
   }
                                      Table C
                                         Column a_fk  // not nullable
                                         Column c     // not nullable

Note that the column used to store C (in the property table C) is not nullable. This is because, if an A element has no C children, there are simply no rows for it in table C.

3.2.5. Mapping Subgroups

References in a subgroup are mapped to properties of the parent class and then to columns in the class table, as in the following mapping from a DTD to an object schema:

              DTD                             Classes
   ==========================        ============================

   <!ELEMENT A (B, (C | D))>          class A {
   <!ELEMENT B (#PCDATA)>      ==>       String b;  // not nullable
   <!ELEMENT C (#PCDATA)>                String c;  // nullable
                                         D      d;  // nullable
                                      }

   <!ELEMENT D (E, F)>                class D {
   <!ELEMENT E (#PCDATA)>      ==>       String e;
   <!ELEMENT F (#PCDATA)>                String f;
                                      }

and in the mapping from the object schema to the database schema:

            Classes                                 Tables
   ============================        ================================

   class A {                            Table A
      String b;  // not nullable  ==>      Column b     // not nullable
      String c;  // nullable               Column c     // nullable
      D      d;  // nullable               Column d_fk  // nullable
   }

   class D {                            Table D
      String e;                   ==>      Column e
      String f;                            Column f
   }                                       Column d_pk

You may be wondering how this is possible. What happened to the structure of the subgroup? As it turns out, this structure exists only in the content model, not in instance documents. For example, both of the following documents conform to the content model above:

   <A>
      <B>bbbbbb</B>
      <C>cccccc</C>
   </A>

   <A>
      <B>bbbbbb</B>
      <D>
         <E>eee</E>
         <F>fff</F>
      </D>
   </A>

The presence of the subgroup cannot be determined from the documents. Structurally, C and D are indistinguishable from B; they are all simply children of A. Therefore, references in the subgroup can be mapped as if they were direct children of A.

One consequence of mapping references in a subgroup directly to properties of the parent class is that repeatability and optionality can be imposed indirectly. For example, in the following content model, C, D, and E are all both optional and repeatable. They are repeatable because a repetition operator is applied to them indirectly. C is optional because it is in a choice group, while D and E are optional because they are indirectly in a choice group.

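                 DTD                                  Classes
   =================================        ==============================

   <!ELEMENT A (B, (C | (D, E))+)>           class A {
   <!ELEMENT B (#PCDATA)>                       String   b;
   <!ELEMENT C (#PCDATA)>            ==>        String[] c;  // may be null
   <!ELEMENT E (#PCDATA)>                       D[]      d;  // may be null
                                                String[] e;  // may be null
                                             }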

3.3. Mapping Mixed Content

Mixed content is simply a choice group to which the * operator is applied, except that PCDATA can be interspersed among the child elements. Therefore, element type references in mixed content are mapped first to nullable array properties of unknown size and then to property tables. To see how to map the PCDATA, consider the following XML document:

   <A>This text <C>cc</C> makes
      <B>bbbb</B> no sense
      <C>cccc</C> except as
      <B>bb</B> an example.</A>

Then notice that it is essentially the same as the following document, in which the PCDATA has been wrapped in elements:

   <A><PCDATA>This text </PCDATA><C>cc</C><PCDATA> makes</PCDATA>
      <B>bbbb</B><PCDATA> no sense</PCDATA>
      <C>cccc</C><PCDATA> except as</PCDATA>
      <B>bb</B><PCDATA> an example.</PCDATA></A>

From this, it is easy to see that PCDATA can be treated like any other child element. Therefore, PCDATA in mixed content is mapped to a nullable array of unknown size and then to a property table. The following shows how mixed content is mapped from a DTD to an object schema:

                DTD                              Classes
   ===============================        ======================

   <!ELEMENT A (#PCDATA | B | C)*>         class A {
   <!ELEMENT B (#PCDATA)>           ==>       String[] pcdata;
   <!ELEMENT C (#PCDATA)>                     String[] b;
                                              String[] c;
                                           }

and from the object schema to the database schema:

         Classes                              Tables
   ===================        =====================================

                                                     Table PCDATA
                                                  /     Column a_fk
                                                 /      Column pcdata
   class A {                                    /
      String[] pcdata;         Table A         /     Table B
      String[] b;        ==>      Column a_pk ------    Column a_fk
      String[] c;                              \        Column b
   }                                            \
                                                 \   Table C
                                                  \     Column a_fk
                                                        Column c

To see what is actually stored in the database, consider the document shown at the start of this section. It is mapped to the following object and then to the rows in the following tables. (We assume that a primary key with the value 1 was generated for the row in table A. It is used to join the row in table A to the rows in the other tables.)

           Objects                           Tables
   =========================        ==========================

                                     Table A
                                     a_pk
                                     ----
                                     1

   object a {                        Table PCDATA
      pcdata = {"This text",         a_fk   pcdata
                "makes",             ----   -----------
                "no sense",   ==>    1      This text
                "except as",         1      makes
                "an example."}       1      no sense
      b = {"bbbb", "bb"}             1      except as
      c = {"cc", "cccc"}             1      an example.
   }
                                     Table B
                                     a_fk   b
                                     ----   ----
                                     1      bbbb
                                     1      bb

                                     Table C
                                     a_fk   c
                                     ----   ----
                                     1      cc
                                     1      cccc

One thing that is obvious from this example is that the object-relational mapping is not very efficient at storing mixed content. For this reason, it is more commonly used in data-centric applications, which tend to have little mixed content.
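The mixed content example can be sketched in Python with ElementTree, which exposes PCDATA as the text of the parent and the tail of each child. The function name map_mixed, the dictionary standing in for object a, and the decision to strip surrounding whitespace are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = ("<A>This text <C>cc</C> makes\n"
       "<B>bbbb</B> no sense\n"
       "<C>cccc</C> except as\n"
       "<B>bb</B> an example.</A>")

def map_mixed(xml_text):
    """Map mixed content to three array properties: pcdata, b, and c."""
    a = ET.fromstring(xml_text)
    obj = {"pcdata": [], "b": [], "c": []}
    # PCDATA before the first child is stored in a.text.
    if a.text and a.text.strip():
        obj["pcdata"].append(a.text.strip())
    for child in a:
        obj[child.tag.lower()].append(child.text)
        # PCDATA following a child is stored in that child's tail.
        if child.tail and child.tail.strip():
            obj["pcdata"].append(child.tail.strip())
    return obj

obj = map_mixed(DOC)
# Rows for the three property tables, using the generated key 1 for table A.
rows = {name: [(1, value) for value in values] for name, values in obj.items()}
```

Nine rows in three property tables are needed for a single short element, and the relative order of the text and the child elements is not recorded by this sketch.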
