The key to the relational database is the decomposition of relationships. Only pairwise relationships between data are defined in the database itself; more complex, application-specific relationships have to be constructed dynamically through join operations, which means that these relationships are stored in the program rather than in the database. The relational model in fact carries an implicit assumption that data are only loosely associated, and in practice single tables and master-detail tables are indeed the most common cases. When an application requires a large number of joins, this often means that the relational data model is no longer adequate. At that point we have to give up non-redundancy and construct a materialized view by pre-joining the data, fixing the complex relationships between data in place and defining them explicitly. In the data warehouse, whether the Star Schema or the Snowflake Schema is superior cannot be decided in the abstract; how far the data should be aggregated has to be determined case by case, according to how the data will be used.
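A minimal sketch of the pre-joining idea, using Python's built-in sqlite3 module. SQLite has no materialized views, so a table created with CREATE TABLE ... AS SELECT stands in for one here. The tables customer, orders and order_wide, and all their columns, are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized source tables (hypothetical schema).
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Ann', 'north'), (2, 'Bob', 'south');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);

-- Pre-join once and store the result. The customer columns are now
-- repeated on every order row: redundancy traded for join-free reads.
CREATE TABLE order_wide AS
SELECT o.id AS order_id, o.amount, c.name AS customer_name, c.region
FROM orders o JOIN customer c ON c.id = o.customer_id;
""")

def total_by_region(connection):
    """Aggregate directly on the pre-joined table; no join at query time."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM order_wide GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)

totals = total_by_region(conn)
```

The stored table has to be rebuilt or refreshed when the source tables change, which is the price of giving up non-redundancy.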
Because the relational database itself defines only pairwise relationships between data, it lacks any means of global data access. One basic concept of the data warehouse is the data space, in which data can be reached directly through global coordinates rather than through a chain of pairwise joins. The most important dimension in the data warehouse is time, because it is a coordinate shared by all data. We can retrieve together all the data that occur at the same point in time, regardless of whether any association between them has been defined.
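As a rough illustration of time acting as a shared coordinate, the sketch below collects rows from two tables that have no foreign key between them, using only a common date. The tables shipment and complaint and the column event_date are hypothetical, and Python's sqlite3 module is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two tables with no declared association; they share only a date column.
CREATE TABLE shipment (id INTEGER PRIMARY KEY, event_date TEXT, item TEXT);
CREATE TABLE complaint (id INTEGER PRIMARY KEY, event_date TEXT, note TEXT);
INSERT INTO shipment VALUES (1, '2024-03-01', 'pump'), (2, '2024-03-02', 'valve');
INSERT INTO complaint VALUES (1, '2024-03-01', 'late delivery');
""")

def everything_on(connection, day):
    """Return (source, detail) rows from both tables for a single date."""
    return connection.execute(
        """
        SELECT 'shipment' AS source, item AS detail
        FROM shipment WHERE event_date = ?
        UNION ALL
        SELECT 'complaint', note
        FROM complaint WHERE event_date = ?
        ORDER BY source
        """,
        (day, day),
    ).fetchall()

same_day = everything_on(conn, "2024-03-01")
```

The query never joins the two tables; the date alone locates both rows.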
The basic data access mode of the relational database is as follows:
SELECT attribute_list
FROM table_a, table_b
WHERE table_a.data_id = table_b.id
AND table_b.attr = 'a'
In the data warehouse, the part "FROM table_a, table_b WHERE table_a.data_id = table_b.id", which fixes the association between the tables, is packaged together as a so-called topic.
The part table_b.attr = 'a' is separated out of the WHERE clause and serves as the coordinate condition.
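One way to picture this split is to store the join as a view and leave only the coordinate condition to the caller. The sketch below does that with Python's sqlite3 module; it is an interpretation under stated assumptions, not the only way to implement a topic. The view name topic, the column value and the sample rows are hypothetical, and table_a and table_b stand in for the two tables of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_b (id INTEGER PRIMARY KEY, attr TEXT);
CREATE TABLE table_a (id INTEGER PRIMARY KEY, data_id INTEGER, value TEXT);
INSERT INTO table_b VALUES (1, 'a'), (2, 'b');
INSERT INTO table_a VALUES (100, 1, 'x'), (101, 1, 'y'), (102, 2, 'z');

-- The association between the tables is fixed once, inside the topic.
CREATE VIEW topic AS
SELECT table_a.id AS a_id, table_a.value, table_b.attr
FROM table_a, table_b
WHERE table_a.data_id = table_b.id;
""")

def select_by_coordinate(connection, attr):
    """Query the topic; the caller supplies only the coordinate condition."""
    return connection.execute(
        "SELECT a_id, value FROM topic WHERE attr = ? ORDER BY a_id",
        (attr,),
    ).fetchall()

rows = select_by_coordinate(conn, "a")
```

The calling code no longer mentions the join at all, so the relationship lives in the database rather than in the program.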
There are two ways to establish a time coordinate in the data warehouse. For events that occur at a single point in time, we establish a point coordinate directly, expressed through the his_date field. For states that hold over a period of time, we create an interval coordinate, represented by the two fields from_date and to_date.
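The two kinds of time coordinate can be sketched as follows with Python's sqlite3 module. The field names his_date, from_date and to_date come from the text; the tables payment and item_price, the sample rows, the ISO date strings, the half-open interval convention (from_date inclusive, to_date exclusive) and the far-future '9999-12-31' marker for a still-open interval are all assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Point coordinate: an event that happens on one date.
CREATE TABLE payment (id INTEGER PRIMARY KEY, his_date TEXT, amount REAL);
-- Interval coordinate: a state valid from from_date up to, but not
-- including, to_date (an assumed convention).
CREATE TABLE item_price (item TEXT, from_date TEXT, to_date TEXT, price REAL);
INSERT INTO payment VALUES (1, '2024-01-10', 30.0), (2, '2024-02-05', 45.0);
INSERT INTO item_price VALUES
  ('pump', '2024-01-01', '2024-02-01', 30.0),
  ('pump', '2024-02-01', '9999-12-31', 45.0);
""")

def payments_on(connection, day):
    """Point lookup: events whose his_date equals the given day."""
    return connection.execute(
        "SELECT id, amount FROM payment WHERE his_date = ? ORDER BY id",
        (day,),
    ).fetchall()

def price_on(connection, item, day):
    """Interval lookup: the price whose validity interval contains the day."""
    row = connection.execute(
        "SELECT price FROM item_price "
        "WHERE item = ? AND from_date <= ? AND ? < to_date",
        (item, day, day),
    ).fetchone()
    return row[0] if row else None
```

ISO-formatted date strings compare correctly as text, which is why plain comparison operators suffice here. With a half-open interval, a day that falls exactly on a boundary belongs to only one row.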