A First Look at Data Warehousing - "Lift Your Veil"
Here I would like to share some of my own understanding and opinions. I hope they can serve as a simple introduction for beginners, and that this rough start will draw out better ideas from others.
To talk about the data warehouse, let's first look at the background from which it emerged. Starting with dBASE III (dBASE is a database management program developed for DOS, originally from Ashton-Tate and later acquired by Borland), database applications reached the personal computer, and small and medium-sized enterprises entered the era of commercial database applications as well. At that stage, database applications were mainly about recording data, and we call this type of system an OLTP (online transaction processing) database system. It is designed mainly for transaction processing, so it handles such work conveniently, for example by supporting a large number of users adding or modifying data.
However, as data processing needs diversified, management came to need decision-oriented analysis of the data, which means regularly accessing large amounts of historical data. Someone who knows the traditional database structure and has mastered the query syntax can do this with complex statements, but the approach is not really practical in terms of performance and security. Moreover, in reality an enterprise's data is mostly scattered across different systems, so a comprehensive analysis may need to pull information from several different types of data sources, which makes real-world use difficult.
The "data warehouse" came into being to solve this problem. According to Bill Inmon, the father of the data warehouse, it is defined as follows:
A data warehouse is a subject-oriented, integrated, stable (non-volatile), and time-variant database system used mainly for decision support.
Simply put: the data in a data warehouse is read-only, extracted from the transaction systems at set intervals, and restructured according to the needs of analytical queries.
From the point of view of the data, a data warehouse has the following characteristics:
1. It integrates dispersed, heterogeneous data into a single data source.
2. It stores data in a structure suited to analytical queries.
3. The data it contains is new data derived by transforming transactional data, which makes it convenient for decision makers to analyze.
4. The data is stable. Newer data is usually loaded at intervals, rather than being updated continuously as in day-to-day operational work.
5. Data unrelated to the analysis is filtered out.
6. The data is historical, often holding records that span many years.
7. Because the data records past history, it does not need to be changed once it has been entered; new data is simply added periodically.
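The characteristics above can be illustrated with a minimal sketch of one periodic load, written in Python with the standard sqlite3 module. Everything here is an assumption made for illustration: the two source systems, their field names, the `sales_history` table, and the rule that only paid orders matter for analysis are all hypothetical and do not come from any particular product.

```python
import sqlite3

# Hypothetical rows exported from two separate transaction systems.
# Field names and formats differ on purpose (heterogeneous sources).
ORDERS_SYSTEM_A = [
    {"order_id": 1, "date": "2001-03-01", "amount_cents": 12500, "status": "PAID"},
    {"order_id": 2, "date": "2001-03-01", "amount_cents": 4000, "status": "CANCELLED"},
]
ORDERS_SYSTEM_B = [
    {"id": "B-7", "day": "01/03/2001", "total": "80.50", "state": "paid"},
]


def from_system_a(row):
    """Convert a system A row to the common layout (key, date, amount, status)."""
    return (f"A-{row['order_id']}", row["date"],
            row["amount_cents"] / 100, row["status"].lower())


def from_system_b(row):
    """Convert a system B row (DD/MM/YYYY dates, string totals) to the common layout."""
    day, month, year = row["day"].split("/")
    return (row["id"], f"{year}-{month}-{day}",
            float(row["total"]), row["state"].lower())


def load_batch(conn, rows):
    """Append one batch; rows already in the warehouse are never updated or deleted."""
    # Assumed filter rule: only paid orders are relevant to the analysis.
    kept = [r[:3] for r in rows if r[3] == "paid"]
    conn.executemany(
        "INSERT INTO sales_history (source_key, sale_date, amount) VALUES (?, ?, ?)",
        kept,
    )
    conn.commit()
    return len(kept)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_history (source_key TEXT PRIMARY KEY, sale_date TEXT, amount REAL)"
)

# One scheduled run: integrate both sources, transform, filter, then append.
batch = [from_system_a(r) for r in ORDERS_SYSTEM_A] + \
        [from_system_b(r) for r in ORDERS_SYSTEM_B]
loaded = load_batch(conn, batch)
total = conn.execute("SELECT SUM(amount) FROM sales_history").fetchone()[0]
```

In a real system the extraction would read from the live transaction databases and the job would run on a schedule; the sketch only shows the integrate, transform, filter, and append steps in one place.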
As the name suggests, a data warehouse is a "warehouse" holding a large collection of data. Depending on its scale and the practical needs of each department, it can be divided into:
Δ Data warehouse (the standard form of data warehouse application, built for the whole company)
Δ Data mart (smaller in scale than the above and suited to a single department within the enterprise; it should be designed with the eventual data warehouse in mind)
Δ Multi-tier data warehouse (a combination of a data warehouse and data marts; the lower-tier data marts get their data from the upper-tier data warehouse, which keeps the data consistent and reduces the load on the data warehouse)
Δ Federated data warehouse (a scheme for integrating existing data marts into a data warehouse)
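The multi-tier arrangement can be sketched roughly as follows, again with Python's sqlite3 module. The `warehouse_sales` table, the department names, and the `build_data_mart` helper are hypothetical names chosen for this illustration; the only point shown is that a department-level data mart is rebuilt from the company-wide warehouse rather than from the transaction systems directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_sales (department TEXT, sale_date TEXT, amount REAL);
INSERT INTO warehouse_sales VALUES
    ('retail', '2001-01-05', 100.0),
    ('retail', '2001-02-10', 40.0),
    ('wholesale', '2001-01-20', 900.0);
""")


def build_data_mart(conn, department):
    """Rebuild one department's data mart from the upper-tier warehouse table."""
    # The department becomes part of a table name, so accept simple names only.
    if not department.isidentifier():
        raise ValueError("department must be a simple name")
    table = f"mart_{department}"
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (sale_date TEXT, amount REAL)")
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT sale_date, amount FROM warehouse_sales WHERE department = ?",
        (department,),
    )
    conn.commit()
    return table


mart = build_data_mart(conn, "retail")
mart_total = conn.execute(f"SELECT SUM(amount) FROM {mart}").fetchone()[0]
```

Because the mart is derived from the warehouse, its figures agree with the warehouse's figures for that department, and departmental queries no longer need to touch the larger table.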
As for structure, we know that a relational database uses the E-R (entity-relationship) model, whereas a data warehouse uses the dimensional model. Within the dimensional model, the most commonly used data warehouse structures are the "star" structure and its extension, the "snowflake" structure.
(Dimensions are categories of information used to organize the data in a data warehouse, such as time, address, and person in charge.)
Structure patterns are as follows:
The star structure is a relational database structure. In this pattern the fact table sits in the middle, surrounded by dimension tables; fact data is maintained in the fact table and dimension data in the dimension tables. Each dimension table is associated directly with the fact table through a key. The star structure is very well suited to data warehouse database design, for the following reasons:
1. The design is easy to modify and extend quickly as the data warehouse grows or the application changes, so it is sufficiently flexible.
2. It is easy for developers and end users to understand.
3. It mirrors the way end users typically think about and use their business data.
4. It is easy to implement as a physical database, and many DBMSs can recognize this structure and optimize for it, so the design yields efficient queries.
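A star structure can be sketched with Python's sqlite3 module as one fact table whose rows point at three dimension tables. The table names, columns, and sample values below are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: measures plus one key per dimension.
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    amount     REAL
);

INSERT INTO dim_time    VALUES (1, 2001, 1), (2, 2001, 2);
INSERT INTO dim_store   VALUES (1, 'Beijing'), (2, 'Shanghai');
INSERT INTO dim_product VALUES (1, 'Tea', 'Drinks'), (2, 'Rice', 'Food');
INSERT INTO fact_sales  VALUES
    (1, 1, 1, 10, 50.0),
    (1, 2, 1,  4, 20.0),
    (2, 1, 2,  3, 30.0),
    (2, 2, 1,  6, 30.0);
""")

# Every dimension is exactly one join away from the fact table.
sales_by_city = conn.execute("""
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()

# Combining dimensions just adds more single-step joins.
drinks_sold_2001 = conn.execute("""
    SELECT SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_time    AS t ON t.time_id = f.time_id
    WHERE p.category = 'Drinks' AND t.year = 2001
""").fetchone()[0]
```

Adding a new dimension in this layout means adding one table and one key column to the fact table, which is the kind of flexibility the first reason above refers to.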
The snowflake structure can be seen as an extension of the star structure that simply adds auxiliary detail. Its dimension tables store normalized data, which improves query performance by reducing the number of disk reads. Each dimension table is decomposed into a main dimension table and secondary tables associated with that main table.
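As a rough sketch of that decomposition, the example below splits a product dimension into a main table and a secondary category table, using Python's sqlite3 module. The table names and sample values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Secondary table: each category name is stored once.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);

-- Main dimension table: refers to the secondary table by key.
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);

-- Fact table: still linked only to the main dimension table.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);

INSERT INTO dim_category VALUES (1, 'Drinks'), (2, 'Food');
INSERT INTO dim_product  VALUES (1, 'Tea', 1), (2, 'Juice', 1), (3, 'Rice', 2);
INSERT INTO fact_sales   VALUES (1, 50.0), (2, 25.0), (3, 30.0), (1, 10.0);
""")

# Reaching the category name takes two joins: fact -> product -> category.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales   AS f
    JOIN dim_product  AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
```

Compared with a star layout, the query needs one more join to reach the category name, while each category name is stored only once in the secondary table.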
The above introduces some basic knowledge about the data warehouse. I think designing and managing a data warehouse is very practical work: it requires not only solid theoretical knowledge but also constant exploration and summing up in real applications. As a beginner who has only just come into contact with data warehouses, understanding this much has made me more eager to learn. I hope the experts will not laugh at what I have written here. If you have a better understanding or more experience, I hope we can all exchange ideas. I believe good communication is our "high-quality calcium tablet". :)