Relational database design is the process of organizing and structuring data, and its core problem is the design of the relational model. A relational model describes, in mathematical terms, the relationships among entities using data arranged in two-dimensional tables. It is the collection of all relation schemas, attribute names, and keys, and it is the object that relation schemas describe. A relation schema is the list of attribute names of a relation, that is, the header (frame) of a two-dimensional table. The design of relation schemas is the soul of relational model design, and is therefore the core of the core of relational database design.
The design of relation schemas directly determines the performance of a relational database. At present, schema design is dominated by normalized design, which emerged and matured over decades of database development. In recent years, however, a new trend has appeared: a schema design approach called denormalization has attracted the industry's attention and has been applied within a certain range. Reactions to this new idea differ, and a still unresolved debate between normalization and denormalization has arisen in the theoretical community. This article briefly introduces the basic ideas of normalized and denormalized design and summarizes the main points of the debate between the two sides, for the reference of practitioners in the domestic industry.
First, normalized design
The basic idea of normalized design is to decompose a relation schema and replace the original schema with the resulting smaller ones, eliminating the unsuitable parts of the data dependencies (both functional dependencies and multivalued dependencies), so that each relation describes only one entity or one relationship between entities. This process must preserve the lossless-join property and the functional dependencies. That is, it must not destroy the original data, and it must be possible to restore the original relation by joining the decomposed relations.
Concretely, normalized design repeatedly decomposes one two-dimensional table into several tables and establishes relationships between them, until each table describes only one entity or one relationship between entities. The main normal forms currently followed are 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. In engineering practice 3NF and BCNF are the most widely used, and 3NF is recommended as the standard.
The advantages of normalized design include effectively eliminating data redundancy, straightening out data dependencies, maintaining database integrity, and enhancing the stability, scalability, and adaptability of the database. The main problem usually attributed to normalized design is that it increases the number of table joins at query time, which costs computing time and space and lowers system efficiency. In most cases, this problem can be addressed by good index design.
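The decomposition step can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (student_wide, student, department) and the sample rows are hypothetical and chosen only for illustration. The wide table mixes two entities, because dept_name depends on dept_id rather than on the key student_id. The sketch splits it into two tables and then checks that joining them restores the original rows, which is the lossless-join requirement.

```python
import sqlite3

# Hypothetical sample data: one wide table that mixes students and departments.
wide_rows = [
    (1, "Ann", 10, "Physics"),
    (2, "Bob", 10, "Physics"),
    (3, "Cid", 20, "History"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_wide ("
    "student_id INTEGER PRIMARY KEY, student_name TEXT, "
    "dept_id INTEGER, dept_name TEXT)"
)
conn.executemany("INSERT INTO student_wide VALUES (?, ?, ?, ?)", wide_rows)

# Decompose: each resulting table describes exactly one entity.
conn.execute(
    "CREATE TABLE department ("
    "dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL, "
    "dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute(
    "INSERT INTO department SELECT DISTINCT dept_id, dept_name FROM student_wide"
)
conn.execute(
    "INSERT INTO student SELECT student_id, student_name, dept_id FROM student_wide"
)

# Lossless-join check: joining the two tables must give back the wide table.
rejoined = conn.execute(
    "SELECT s.student_id, s.student_name, s.dept_id, d.dept_name "
    "FROM student AS s JOIN department AS d ON d.dept_id = s.dept_id "
    "ORDER BY s.student_id"
).fetchall()
department_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

print(rejoined == wide_rows)   # the decomposition lost no information
print(department_count)        # each department name is now stored once
```

After the split, the department name is stored once per department instead of once per student, which is the redundancy the decomposition is meant to remove.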
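As a rough illustration of the indexing remedy, the following sketch (Python with the built-in sqlite3 module) adds an index on the foreign-key column used by a join. The schema, the index name idx_student_dept_id, and the data are hypothetical. It shows only that the query result is unchanged and that SQLite's planner switches from a full scan to an index search for the foreign-key lookup. It makes no claim about actual speedups, which depend on data volume and the database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    dept_id      INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (10, 'Physics'), (20, 'History');
INSERT INTO student VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cid', 20);
""")

JOIN_SQL = """
SELECT s.student_name, d.dept_name
FROM student AS s JOIN department AS d ON d.dept_id = s.dept_id
WHERE d.dept_name = 'Physics'
ORDER BY s.student_id
"""
# The foreign-key lookup that a join has to perform for each department row.
LOOKUP_SQL = "SELECT student_name FROM student WHERE dept_id = 10"


def plan(sql):
    """Return the readable step descriptions from EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


rows_before = conn.execute(JOIN_SQL).fetchall()
plan_before = plan(LOOKUP_SQL)

# Index the foreign-key column that the join condition uses.
conn.execute("CREATE INDEX idx_student_dept_id ON student(dept_id)")

rows_after = conn.execute(JOIN_SQL).fetchall()
plan_after = plan(LOOKUP_SQL)

print(plan_before)  # a scan of the student table
print(plan_after)   # a search that uses idx_student_dept_id
```

The index changes only the access path, not the logical schema, so the tables stay normalized.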
Second, denormalized design
The basic idea of denormalized design is that the real world does not always conform to a perfect mathematical relational model. Forcing things into normal form makes the form simple but tends to make the content complicated and, more importantly, lowers the running efficiency of the database. Denormalization calls for lowering the normal form of relation schemas where appropriate, or even abandoning it, and no longer requires that a table describe only one entity or one relationship between entities. Its main purpose is to improve the running efficiency of the database.
The main denormalization techniques include adding redundant or derived columns and merging, splitting, or duplicating tables. It is generally held that denormalization may be considered in the following circumstances: (1) tables involved in a large number of frequent queries must be joined; (2) the main application chiefly performs queries when it runs; (3) computations on the data require temporary tables or complex queries.
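One of these techniques, adding a derived column, might look like the following sketch in Python with the built-in sqlite3 module. The tables orders and order_item, the column order_total, and the sample data are hypothetical. Prices are stored as integer cents to keep the arithmetic exact. The normalized query computes each order total with a join and an aggregate, while the denormalized version stores the total in the orders table so that it can be read without a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
CREATE TABLE order_item (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product    TEXT NOT NULL,
    quantity   INTEGER NOT NULL,
    unit_price INTEGER NOT NULL          -- price in cents
);
INSERT INTO orders VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO order_item VALUES
    (1, 'pen', 3, 200), (1, 'ink', 1, 500), (2, 'pad', 2, 350);
""")

# Normalized form: the total is computed on every query (join + aggregate).
computed = conn.execute("""
SELECT o.order_id, SUM(i.quantity * i.unit_price)
FROM orders AS o JOIN order_item AS i ON i.order_id = o.order_id
GROUP BY o.order_id
ORDER BY o.order_id
""").fetchall()

# Denormalized form: add a derived column and fill it once.
conn.execute("ALTER TABLE orders ADD COLUMN order_total INTEGER NOT NULL DEFAULT 0")
conn.execute("""
UPDATE orders SET order_total = (
    SELECT SUM(quantity * unit_price)
    FROM order_item
    WHERE order_item.order_id = orders.order_id
)
""")

# The same answer is now a single-table read.
stored = conn.execute(
    "SELECT order_id, order_total FROM orders ORDER BY order_id"
).fetchall()

print(computed)
print(stored)
```

The stored total is redundant: it repeats information already present in order_item, so it has to be kept up to date whenever the items change.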
The main advantages of denormalized design are that it reduces the joins required by queries, reduces the number of foreign keys and indexes, allows statistical calculations to be performed in advance, and improves query response speed. Its main problems are that it increases data redundancy, affects database integrity, slows down data updates, and increases the physical space occupied by tables. The most important of these is the integrity problem, which can generally be handled by creating triggers, applying transaction logic, or running batch commands or stored procedures at suitable intervals.
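The trigger approach to the integrity problem can be sketched as follows, again in Python with the built-in sqlite3 module. The schema and trigger names are hypothetical, and the syntax shown is SQLite's; other database systems define triggers differently. Three triggers keep a redundant order_total column in step with the order_item rows it is derived from, and the changes run inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_total INTEGER NOT NULL DEFAULT 0   -- redundant, derived column
);
CREATE TABLE order_item (
    item_id  INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    amount   INTEGER NOT NULL                -- amount in cents
);

-- Keep the derived total consistent on every change to order_item.
CREATE TRIGGER order_item_ai AFTER INSERT ON order_item BEGIN
    UPDATE orders SET order_total = order_total + NEW.amount
    WHERE order_id = NEW.order_id;
END;
CREATE TRIGGER order_item_ad AFTER DELETE ON order_item BEGIN
    UPDATE orders SET order_total = order_total - OLD.amount
    WHERE order_id = OLD.order_id;
END;
CREATE TRIGGER order_item_au AFTER UPDATE OF amount, order_id ON order_item BEGIN
    UPDATE orders SET order_total = order_total - OLD.amount
    WHERE order_id = OLD.order_id;
    UPDATE orders SET order_total = order_total + NEW.amount
    WHERE order_id = NEW.order_id;
END;
""")

# Run the changes in one transaction: commit on success, roll back on error.
with conn:
    conn.execute("INSERT INTO orders (order_id) VALUES (1)")
    conn.execute("INSERT INTO order_item VALUES (1, 1, 600)")
    conn.execute("INSERT INTO order_item VALUES (2, 1, 500)")
    conn.execute("UPDATE order_item SET amount = 450 WHERE item_id = 2")
    conn.execute("DELETE FROM order_item WHERE item_id = 1")

stored_total = conn.execute(
    "SELECT order_total FROM orders WHERE order_id = 1"
).fetchone()[0]
recomputed_total = conn.execute(
    "SELECT COALESCE(SUM(amount), 0) FROM order_item WHERE order_id = 1"
).fetchone()[0]

print(stored_total, recomputed_total)
```

The recomputation query at the end is also the basis of the batch alternative: a periodic job can compare the stored and recomputed totals and repair any rows that have drifted apart.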
Third, the main points of the debate between normalization and denormalization
Some supporters of denormalized design hold that the degree of normalization is directly related to the number of tables: the higher the normal form, the more tables there are, and the more tables there are, the more joins are needed, which inevitably slows the database down and hurts its performance. Only by significantly reducing the number of tables through denormalization, and thereby reducing the dependence on joins, can database execution be sped up and normal performance ensured. For example, the denormalized star schema now popular in decision support systems performs far better in that application than a normalized design, and is the best example of denormalized design. Denormalized design does not mean confusion or ignoring the rules; it too follows basic software engineering principles such as protecting the integrity of information.
Some supporters of normalized design hold that normalization and denormalization are merely logical concepts, and stress that advocates of denormalization confuse the logical and physical levels. Database performance is determined at the physical level: by the size of the database, the physical design, the way data is stored and accessed, the capabilities of the database management system, the number of concurrent accesses, and so on. Denormalized design does not change the physical level of the database, so it cannot improve performance. Normalization is not only about avoiding data redundancy; more importantly, it ensures database integrity. The biggest problem with denormalized design is that it is hard to guarantee the consistency of the data, so there is a risk of corrupting it. In addition, denormalization places several entities in one table. Mixing different entities together increases the complexity of the database, makes it harder for users to understand, makes queries harder to formulate, and raises the risk of obtaining incorrect answers. Only normalized design is the fundamental way to solve these problems. If the idea of denormalized design is not abandoned, and the integrity of the database is risked for the sake of so-called performance gains, developers will not be motivated to build relational database management systems that are both fully normalized and high-performance, and the healthy development of databases is bound to suffer.
In a sense, normalized and denormalized design are not opposites, and the relationship between them is not either-or. Perhaps one of them will gradually die out, or perhaps a middle road will be found between the two. Understanding advances in an upward spiral. The debate has not yet ended, and its final outcome cannot be predicted. What is certain is that, whatever the outcome, it will have a profound influence on the future direction of database development.