A First Look at Data Warehouses

zhaozj, 2021-02-16

Original address: http://blog.9cbs.net/sunjx119/Archive/2004/07/25/51551.aspx

A data warehouse is a subject-oriented, integrated, relatively stable collection of data that reflects history and is used to support management decisions.
1. Subject-oriented: Unlike in an operational database, a subject is an abstract concept: a key area that users of the data warehouse care about when making decisions. A subject reflects one aspect of the business process as a whole, rather than being split up by application as operational data is.
2. Integrated: Operational databases are usually tied to particular applications, and those databases are often independent of one another and heterogeneous. The data warehouse integrates the originally scattered data, converts data types, and eliminates inconsistencies in the data.
3. Relatively stable: Data in the warehouse is kept for a long time so that it can be used for business analysis and decision-making and queried for reference later. The warehouse must preserve historical information that shows how things changed, and that information is the raw material for analysis, so data is rarely modified once it has been loaded: write once, read many times.
4. Reflects history: To support analysis and decisions, past development has to be summarized so that reasonable forecasts can be made about the future.
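To make the "integrated" property concrete, here is a minimal Python sketch of how records from two independent systems might be brought into one consistent layout. The two source systems, their field names, the numeric gender codes, and the date formats are all invented for illustration; a real warehouse would take these mappings from its own source systems.

```python
from datetime import date, datetime

# Hypothetical records from two independent operational systems.
# Field names, codes and date formats are assumptions made for this sketch.
crm_rows = [{"cust_id": "17", "gender": "F", "signup": "2004-07-25"}]
billing_rows = [{"customer": 17, "sex": 2, "created": "25/07/2004"}]

# Assumed meaning of the billing system's numeric gender codes.
GENDER_FROM_CODE = {1: "M", 2: "F"}


def from_crm(row):
    """Map a CRM record onto the warehouse's common layout."""
    return {
        "customer_id": int(row["cust_id"]),  # text -> integer
        "gender": row["gender"],
        "signup_date": date.fromisoformat(row["signup"]),
        "source": "crm",
    }


def from_billing(row):
    """Map a billing record onto the same layout, resolving code and date differences."""
    return {
        "customer_id": row["customer"],
        "gender": GENDER_FROM_CODE[row["sex"]],  # numeric code -> letter
        "signup_date": datetime.strptime(row["created"], "%d/%m/%Y").date(),
        "source": "billing",
    }


# After conversion, both sources use the same types and the same encodings.
integrated = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
```

Each converter keeps a `source` field so that a row in the warehouse can still be traced back to the system it came from.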

Data warehouse environment
1. ETL: data extraction, transformation, and loading
2. OLAP: online analytical processing engine
3. DSS: decision support system
4. Customer analysis and reporting tools
5. Other data collection and data output tools

Data warehouse architecture
A data warehouse can be divided into four layers: data sources, the data management and storage layer, the OLAP server, and front-end tools.
1. Data sources: The foundation of the data warehouse and the bottom layer of its architecture. This is where the warehouse's data comes from, and it includes the information held by each business-processing subsystem.
2. Data management and storage layer: The core of the data warehouse. How it manages data is the main thing that distinguishes a data warehouse from an operational database. Data is managed by subject, and aggregated data is stored in a multidimensional database.
3. OLAP server: Effectively integrates the data and organizes it according to a multidimensional model.
4. Front-end tools: Mainly reporting tools, query tools, data mining tools, and the like.
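As a rough illustration of what "organizing data by a multidimensional model" means, the following Python sketch aggregates a few detail rows along chosen dimensions. The sample rows, the dimension names, and the `roll_up` helper are all hypothetical; a real OLAP server does this at scale with precomputed aggregates.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, region, product, amount).
sales = [
    (2003, "North", "tea", 120.0),
    (2003, "South", "tea", 80.0),
    (2004, "North", "tea", 150.0),
    (2004, "North", "coffee", 60.0),
]
DIMENSIONS = ("year", "region", "product")


def roll_up(rows, keep):
    """Sum the amount over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the last field is the measure
    return dict(totals)


by_year = roll_up(sales, ["year"])                   # one total per year
by_year_region = roll_up(sales, ["year", "region"])  # finer-grained view
grand_total = roll_up(sales, [])                     # everything aggregated away
```

The same detail rows answer questions at several levels of granularity, which is the kind of view a front-end report or query tool would ask the OLAP layer for.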

The ETL process
Getting data into the data warehouse generally takes three steps: extraction, transformation, and loading, together referred to as ETL. The ETL process is generally implemented with existing tools.
1. E: Data extraction. Data must not be modified while it is being extracted. What can be extracted includes database objects; a table, for example, can be exported from the source system in its entirety. For example, with MS SQL Server 2000: bcp "SELECT * FROM northwind..customers" queryout "D:\temp.txt" -c -U "sa" -P "sa". Extraction should be dynamic: only what has changed in the source is extracted, so that efficiency is not affected. One implementation is to add a timestamp; another is to record the changes in separate tables.
2. T: Data transformation. Data is moved from one system to another, in the order: source system -> staging database -> data warehouse -> data mart. There are three ways to do this: A. flat files; B. distributed operations; C. partition switching.
3. L: Data loading.
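The timestamp approach to extraction can be sketched in Python with the standard `sqlite3` module. This is only an illustration under assumed names: the `customers` table, its `updated_at` column, and the `extract_changes` helper are invented here, and an in-memory SQLite database stands in for the real source system.

```python
import sqlite3

# An in-memory database stands in for the operational source system.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "2004-07-01 09:00:00"), (2, "Ben", "2004-07-20 14:30:00")],
)


def extract_changes(conn, since):
    """Read rows modified after `since` without changing the source.

    Returns the rows and the new high-water mark for the next run.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (since,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else since
    return rows, new_mark


# First run: everything is newer than the initial mark, so all rows come back.
first_batch, mark = extract_changes(source, "1900-01-01 00:00:00")

# The source system changes one row after the first run.
source.execute(
    "UPDATE customers SET name = 'Anna', updated_at = '2004-07-25 08:00:00' WHERE id = 1"
)

# Second run: only the changed row is extracted.
second_batch, mark = extract_changes(source, mark)
```

The sketch relies on timestamps stored as `YYYY-MM-DD HH:MM:SS` text, which compares correctly as plain strings; it also assumes the source application updates `updated_at` on every change.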

A data warehouse does not have to be implemented with a database.

A data warehouse does not satisfy third normal form, and need not satisfy the normal forms at all; it only contains keys. (First normal form: there is a primary key and, intuitively, the table is a classification or grouping that matches common sense. Second normal form: first normal form holds, and every column other than the primary key depends on the whole primary key. Third normal form: second normal form holds, and the columns other than the primary key have no dependencies among themselves.)
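A small Python sketch can show the kind of dependency that third normal form forbids and that a warehouse table may tolerate. The `orders` rows and the `determines` helper are hypothetical, and the check only examines the sample data; it does not prove anything about a schema in general.

```python
def determines(rows, lhs, rhs):
    """True if, in this sample, equal `lhs` values always go with equal `rhs` values."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs seen for this lhs and returns it afterwards.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# A denormalized table: `region` is repeated on every row for the same `city`.
orders = [
    {"order_id": 1, "city": "Beijing", "region": "North", "amount": 30},
    {"order_id": 2, "city": "Beijing", "region": "North", "amount": 45},
    {"order_id": 3, "city": "Guangzhou", "region": "South", "amount": 20},
]

# A non-key column depends on another non-key column: not third normal form.
city_determines_region = determines(orders, "city", "region")

# For contrast, `amount` is not determined by `city`.
city_determines_amount = determines(orders, "city", "amount")
```

In an operational database the city-to-region mapping would normally move to its own table; in a warehouse the repetition is often kept so that queries can read it without a join.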

When reposting, please cite the original address: https://www.9cbs.com/read-13298.html
