Online Analytical Processing (OLAP) Overview
Development Background
With the widespread adoption of database technology, enterprise information systems have accumulated large amounts of data. How to extract information useful for corporate decision-making from this mass of data has become an important problem for decision makers. Traditional enterprise database systems (management information systems) use On-Line Transaction Processing (OLTP) as their data management approach. OLTP is designed mainly for transaction processing, and its support for analytical processing has never been satisfactory. People therefore gradually began to reprocess the data in OLTP databases to form integrated, analysis-oriented environments that better support decision making, namely decision support systems (DSS).
Enterprise information today is generally managed by a DBMS, but decision databases and operational databases differ in data sources, data content, data schema, served users, access patterns, transaction management, and even physical storage. It is therefore not appropriate to build a DSS directly on the operational database. Data warehouse technology developed against this background. The concept of the data warehouse was proposed in the mid-1980s, and in the 1990s data warehousing moved from early exploration to practical use. W. H. Inmon, widely recognized in the industry as the founder of the data warehouse concept, defines it in "Building the Data Warehouse" as follows: "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process."
Building a data warehouse means extracting data from the OLTP databases distributed across the enterprise and integrating it. The core of a data warehouse today is still a database system managed by an RDBMS. Because the volume of data in a warehouse is huge, the RDBMS generally adopts measures to improve performance, such as parallel processing architectures, new data organization schemes, query strategies, and indexing techniques.
Many applications, including On-Line Analytical Processing (OLAP), drove the emergence and development of data warehouse technology, and data warehouse technology in turn promoted the development of OLAP. The concept of OLAP was first proposed by E. F. Codd, the father of the relational database. Codd argued that OLTP could no longer meet end users' needs for database query and analysis, and that simple SQL queries over large databases could not satisfy analytical requirements. Decision analysis requires extensive computation over relational data before results are available, and the results of plain queries do not meet decision makers' needs. Codd therefore proposed the concepts of the multidimensional database and multidimensional analysis, that is, OLAP. The OLAP Council defines OLAP as technology that enables analysts, managers, and executives to gain deeper insight into data through fast, consistent, interactive access to a wide variety of views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise. The goal of OLAP is to satisfy the specific query and reporting needs of decision support in a multidimensional environment. Its technical core is the concept of the "dimension", so OLAP can also be described as a collection of multidimensional data analysis tools.
OLAP and OLTP
OLAP ultimately draws on the same data sources as OLTP: both come from the underlying database systems. However, the two serve different user groups and hold different data content. The differences are summarized below:
OLTP data | OLAP data
Original data | Derived data
Detailed data | Summarized and refined data
Current-value data | Historical data
Updatable | Not updatable, but periodically refreshed
Small amount of data processed at a time | Large amount of data processed at a time
Application-oriented, transaction-driven | Analysis-oriented, analysis-driven
For operators; supports daily operations | For decision makers; supports management needs
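The contrast in the table can be illustrated with a minimal sketch. It assumes a hypothetical `orders` table held in an in-memory SQLite database; the table name, columns, and values are invented for illustration and do not come from any particular system. The first statement is a typical OLTP operation (a short update of one current, detailed row), and the second is a typical OLAP query (a read-only scan that summarizes history).

```python
import sqlite3

# Hypothetical schema and data, used only to contrast the two workloads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, day TEXT, city TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (day, city, amount) VALUES (?, ?, ?)",
    [
        ("2000-01-05", "Shanghai", 100.0),
        ("2000-01-20", "Beijing", 250.0),
        ("2000-02-11", "Shanghai", 75.0),
    ],
)

# OLTP-style work: a short transaction that updates one current, detailed row.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (120.0, 1))

# OLAP-style work: a read-only query that scans history and returns summaries.
monthly_totals = conn.execute(
    "SELECT substr(day, 1, 7) AS month, SUM(amount) "
    "FROM orders GROUP BY month ORDER BY month"
).fetchall()
print(monthly_totals)
```

In a real deployment the two workloads would usually run against separate stores, with the analytical copy refreshed periodically rather than updated in place.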
OLAP Features and Evaluation Guidelines
The features of OLAP can be summed up in five keywords: Fast Analysis of Shared Multidimensional Information (FASMI). These also serve as guidelines by which designers or managers can judge whether an OLAP design is successful.
Fast: The system responds to users quickly. To achieve this, the database schema design should draw on a broader range of techniques, including special data storage formats, precomputation, and hardware configuration.
Analysis: The system should be able to handle any logical and statistical analysis relevant to the application. Users can define new ad hoc calculations as part of the analysis without programming, and have reports presented the way they want. Users can analyze data on the OLAP platform itself or connect to external analysis tools, and flexible, open reporting capabilities allow analysis results to be saved.
Shared: The system must meet data confidentiality requirements. Even when many users access it at the same time, each user sees only the information permitted by that user's security level.
Multidimensional: A key feature of OLAP is that it provides multidimensional views and analysis of data, including full support for hierarchies and multiple hierarchies.
Information: Regardless of the amount of data or where it is stored, the OLAP system should obtain information in a timely manner and manage large volumes of information. Many factors need to be considered here, such as data replication, available disk space, the performance of the OLAP product, and integration with the data warehouse.
OLAP Logical Concepts and Typical Operations
OLAP presents data to the user as a multidimensional view.
Dimension: A particular perspective from which people observe data. It is a class of attributes considered when examining a problem; the set of such attributes constitutes a dimension (for example, the time dimension or the geographic dimension).
Level: A particular perspective on the data (that is, a dimension) can be described at different degrees of detail (for the time dimension: day, month, quarter, year).
Member: A value of a dimension; it describes the position of a data item along that dimension (for example, "a certain day in a certain month of a certain year" describes a position in time).
Measure: The value of a cell in the multidimensional array. For example, in (January 2000, Shanghai, laptop, $100,000) the measure is $100,000, located by the three dimension members.
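A minimal sketch of these concepts in plain Python follows. The fact rows, the helper names `quarter_of` and `aggregate`, and the sales figures (apart from the Shanghai laptop example above) are hypothetical. Each row carries one member per dimension plus one measure, and the time dimension is given two levels (month and quarter). The last part of the sketch applies the drill, slice, and pivot operations that OLAP defines on such a view.

```python
from collections import defaultdict

# Each fact has one member per dimension (time, region, product) and one measure.
facts = [
    {"month": "2000-01", "city": "Shanghai", "product": "Laptop", "sales": 100000},
    {"month": "2000-01", "city": "Beijing", "product": "Laptop", "sales": 80000},
    {"month": "2000-02", "city": "Shanghai", "product": "Desktop", "sales": 50000},
    {"month": "2000-04", "city": "Shanghai", "product": "Laptop", "sales": 70000},
]


def quarter_of(month):
    """Map a month member such as '2000-04' to its parent quarter '2000-Q2'."""
    year, mm = month.split("-")
    return f"{year}-Q{(int(mm) - 1) // 3 + 1}"


def aggregate(rows, key_funcs):
    """Sum the sales measure, grouped by the coordinates the key functions return."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(f(row) for f in key_funcs)] += row["sales"]
    return dict(totals)


# Drill-up (roll-up): view the time dimension at the quarter level.
by_quarter = aggregate(facts, [lambda r: quarter_of(r["month"])])

# Drill-down: return to the finer month level.
by_month = aggregate(facts, [lambda r: r["month"]])

# Slice: fix one member of the product dimension, keep the other two dimensions.
laptop_slice = aggregate(
    [r for r in facts if r["product"] == "Laptop"],
    [lambda r: r["month"], lambda r: r["city"]],
)

# Pivot: swap the order of the two remaining dimensions (rows vs. columns).
pivoted = {(city, month): value for (month, city), value in laptop_slice.items()}

print(by_quarter)
print(pivoted)
```

A real OLAP engine would precompute or index these aggregates instead of rescanning the rows on every request; the sketch only shows what each operation means.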
The basic multidimensional analysis operations of OLAP are drilling (drill-up and drill-down), slice and dice, and pivot (rotation).
Drilling: Changes the level of a dimension, that is, the granularity of the analysis. It includes drill-down and drill-up (roll-up). Drill-up aggregates low-level detail data along a dimension into higher-level summary data, or reduces the number of dimensions. Drill-down does the opposite: it moves from summary data down to detail data, or adds new dimensions.
Slice and dice: After a value is selected on some of the dimensions, these operations examine how the measure is distributed over the remaining dimensions. If only two dimensions remain, the operation is a slice; if three or more remain, it is a dice.
Pivot (rotation): Changes the orientation of the dimensions, that is, rearranges how the dimensions are laid out in the table (for example, swapping rows and columns).
Architecture and Classification of OLAP Systems
The data warehouse and OLAP are complementary. Modern OLAP systems are generally built on a data warehouse: a subset of the detailed data is extracted from the warehouse and stored in the OLAP store, where front-end analysis tools read it. A typical OLAP system architecture is shown below:
According to storage format, OLAP systems can be divided into three types: relational OLAP (ROLAP), multidimensional OLAP (MOLAP), and hybrid OLAP (HOLAP).
1. ROLAP
ROLAP stores the multidimensional data used for analysis in a relational database and, according to the needs of the application, defines a set of materialized views that are also stored as tables in the relational database. It is not necessary to materialize every SQL query; only queries that are used frequently and are computationally expensive are chosen as materialized views. For each query sent to the OLAP server, already-computed materialized views are used where possible to generate the result, which improves query efficiency. The RDBMS used as the ROLAP store is also optimized for OLAP with features such as parallel storage, parallel query, parallel data management, cost-based query optimization, bitmap indexes, and SQL OLAP extensions (CUBE, ROLLUP).
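The idea of answering a frequent, expensive query from a precomputed table can be sketched as follows. This is only an illustration under stated assumptions: the `sales` table, the summary table `mv_sales_by_city`, and the helper `total_for_city` are hypothetical, and because SQLite has no materialized views, an ordinary table built with CREATE TABLE ... AS SELECT stands in for one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, city TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2000-01", "Shanghai", "Laptop", 100000),
        ("2000-01", "Beijing", "Laptop", 80000),
        ("2000-02", "Shanghai", "Desktop", 50000),
    ],
)

# Precompute a frequently used aggregation once and store it as a table.
# SQLite has no materialized views, so this table plays that role here.
conn.execute(
    "CREATE TABLE mv_sales_by_city AS "
    "SELECT city, SUM(amount) AS total FROM sales GROUP BY city"
)


def total_for_city(city):
    """Answer from the precomputed table instead of rescanning the detail rows."""
    row = conn.execute(
        "SELECT total FROM mv_sales_by_city WHERE city = ?", (city,)
    ).fetchone()
    return row[0] if row else 0


print(total_for_city("Shanghai"))
```

The summary table goes stale whenever `sales` changes, so a real system would refresh it on a schedule or let the RDBMS maintain it.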
2. MOLAP
MOLAP stores the multidimensional data used for OLAP analysis physically as a multidimensional array, forming a "cube". Dimension members are mapped to subscripts, or ranges of subscripts, of the array, and summary data is stored in the array cells as the array's values. Because MOLAP uses a new storage structure implemented from the physical layer up, it is also called physical OLAP. ROLAP, by contrast, is implemented mainly through software tools or middleware while its physical layer still uses relational storage, so it is also called virtual OLAP.
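The mapping from dimension members to array subscripts can be sketched in plain Python. The member lists, the `store` and `lookup` helpers, and the values are hypothetical, and a nested list stands in for whatever array format a real MOLAP server uses.

```python
# Dimension members, in the order in which they are mapped to subscripts.
months = ["2000-01", "2000-02"]
cities = ["Shanghai", "Beijing"]
products = ["Laptop", "Desktop"]

# One member-to-subscript map per dimension.
month_idx = {member: i for i, member in enumerate(months)}
city_idx = {member: i for i, member in enumerate(cities)}
product_idx = {member: i for i, member in enumerate(products)}

# The cube: a dense 3-D array (nested lists) with every cell initialised to 0.
cube = [[[0 for _ in products] for _ in cities] for _ in months]


def store(month, city, product, value):
    """Add a measure value to the cell addressed by three dimension members."""
    cube[month_idx[month]][city_idx[city]][product_idx[product]] += value


def lookup(month, city, product):
    """Read a cell directly by subscript; no search or join is needed."""
    return cube[month_idx[month]][city_idx[city]][product_idx[product]]


store("2000-01", "Shanghai", "Laptop", 100000)
store("2000-02", "Beijing", "Desktop", 50000)

# Every member combination occupies a cell, even when no fact exists for it.
total_cells = len(months) * len(cities) * len(products)
print(lookup("2000-01", "Shanghai", "Laptop"), total_cells)
```

Direct subscript access is what makes lookups fast, while the cells reserved for empty combinations hint at why storage can grow quickly as dimensions are added.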
3. HOLAP
MOLAP and ROLAP each have their own advantages and disadvantages (see the table below), and their structures are very different, which makes it hard for analysts to design an OLAP architecture. For this reason a new structure, hybrid OLAP (HOLAP), has been proposed to combine the advantages of both. So far there is no formal definition of HOLAP. Clearly, however, a HOLAP structure should not be a simple combination of MOLAP and ROLAP, but an organic integration of the two technologies that can satisfy users' varied and complex analysis requests.
ROLAP | MOLAP
Builds on existing relational database technology | Designed specifically for OLAP
Responds more slowly than MOLAP; existing relational databases have been optimized for OLAP (parallel storage, parallel query, parallel data management, cost-based query optimization, bitmap indexes, SQL OLAP extensions such as CUBE and ROLLUP), so performance has improved | Good performance and fast response
Fast data loading | Slow data loading
Small storage footprint; no limit on the number of dimensions | Requires precomputation, which may cause data explosion; limited number of dimensions; cannot support dynamic changes to dimensions
Stores data in the RDBMS; no file size limit | Limited by the operating system's file size limit; hard to reach the TB level (only 10 to 20 GB)
Detailed data can be stored through SQL | Lacks standards for data models and data access
Does not support read/write operations; SQL cannot complete some calculations (multi-row calculations, cross-dimensional calculations) | Supports high-performance decision-support calculations (complex cross-dimensional calculations, multi-user read/write operations, row-level calculations)
Difficult to maintain | Easy to manage
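The "data explosion" that precomputation can cause in MOLAP can be made concrete with a rough count. The sketch below assumes a cube that precomputes one extra "ALL" (total) member for every dimension, so a dimension with c members contributes c + 1 positions; the function name `full_cube_cells` and the cardinalities are hypothetical.

```python
from math import prod


def full_cube_cells(cardinalities):
    """Cells in a cube that also precomputes an 'ALL' total for every dimension."""
    return prod(c + 1 for c in cardinalities)


# Hypothetical cube in which every dimension has 10 members.
for n_dims in range(1, 7):
    print(n_dims, full_cube_cells([10] * n_dims))
```

The count is multiplied by 11 each time a 10-member dimension is added, which is one reason the number of dimensions in a fully precomputed cube has to stay limited.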
Overview of Major OLAP Vendors and Products
Hyperion
Hyperion Essbase OLAP Server: more than 100 applications run on it, and more than 300 developers use Essbase as a platform. It offers hundreds of calculation formulas, a procedural scripting language, and statistical and dimension-based calculations.
Powerful OLAP query capabilities: with Essbase Query Designer, business users can build complex queries themselves without help from IT staff.
Broad application support: extends the value of data warehouses and ERP systems, and supports analytical applications for e-commerce, CRM, finance, manufacturing, retail, and CPG (Consumer Packaged Goods).
Speed-of-thought response time, with support for many simultaneous users
Web-enabled, server-centric architecture with SMP support
Strong partners offer complete solutions: more than 60 packaged solutions and more than 300 consulting and implementation companies.
Rich front-end tools: more than 30 are available, including Hyperion's own Wired for OLAP, Spider-Man Web Application, Objects, Essbase Spreadsheet Add-in, Web Gateway, and Reporting.
Hyperion Enterprise: a solution for financial consolidation, reporting, and analysis for multinational companies. More than 3,000 organizations use this system.
Rich functionality: supports multiple accounting standards, including US GAAP, Canadian GAAP, UK GAAP, International Accounting Standards (IAS), FASB, and HGB. Automatic balancing of transactions between branches. FAS 52 currency translation. FAS 94.
Easy to use: the system can be accessed through Excel, Lotus 1-2-3, and various browsers.
Supports changes to the company's organizational structure.
Multinational support: supports six languages simultaneously, as well as the legal and tax requirements of various countries.
Complete process control, audit trail, and security level settings.
Can integrate with ERP systems or other data sources
Hyperion Pillar: a budgeting and planning tool with more than 1,500 users worldwide. It provides activity-based budgeting, project-based planning, centralized planning, sales forecasting, and integrated planning.
Distributed architecture
Detailed planning: allows front-line managers to draw up detailed plans
Complex modeling and analysis capabilities
Oracle
Express Server provides comprehensive OLAP capabilities and has more than 3,000 users worldwide
Accessible through the web and spreadsheets
Flexible data organization: data can be stored in Express Server or directly in an RDB
Built-in analytical functions and a 4GL for customizing queries
Cognos
PowerPlay provides a comprehensive reporting and analysis environment for business performance. It delivers key data to decision makers so that they can analyze operational efficiency in a variety of ways.
Browse the cube with simple mouse clicks and drags
Analysis reports are delivered automatically via the web
Supports multiple OLAP servers: Microsoft OLAP Services, Hyperion Essbase, SAP BW, IBM OLAP for DB2
Complete authorization and security system
Novaview is a client application for Microsoft SQL Server 7.0 OLAP Services.
MicroStrategy
MicroStrategy 7 is a new-generation intelligence platform aimed at e-business and electronic customer relationship management (eCRM) applications.
Powerful analysis capabilities
Web-centric interface
Support for millions of users and terabytes of data
Rapid development: directly uses existing data schemas
Intelligence Server, One for All Analytic Applications
Microsoft
SQL Server 7.0 OLAP Services is the OLAP module of SQL Server 7.0. It can use any relational database or flat file as a data source, and its PivotTable Service provides client-side data caching and computation.
Intelligent client/server data management improves response time and reduces network traffic
Allows access from different clients through OLE DB for OLAP
BusinessObjects
BusinessObjects is an easy-to-use BI tool that allows users to access, analyze, and share data.
Works with a variety of data sources: RDB, ERP, OLAP, Excel, etc.
Customizable through VBA and an open object model
IBM
DB2 OLAP Server is a powerful multidimensional analysis tool that integrates Hyperion Essbase's OLAP engine with the DB2 relational database.
Fully compatible with the Essbase API
Data is stored in the DB2 relational database as a star schema
Brio
Brio.Enterprise is a powerful and easy-to-use BI tool that provides query, OLAP analysis, and reporting capabilities
Supports multiple languages, including Chinese
Brio.Report: a powerful enterprise reporting tool
OLAP-Related Standards
APB-1 OLAP Benchmark Release II (sponsored by the OLAP Council)