14 tricks in database design
1. The relationship between source documents and entities. The relationship between a source document and an entity can be one-to-one, one-to-many, or many-to-many. In most cases it is one-to-one: one source document corresponds to exactly one entity. In special cases it is one-to-many or many-to-one: one source document corresponds to several entities, or several source documents correspond to one entity. An entity here can be understood as a base table. Once this correspondence is clear, designing the data-entry interface becomes much easier.
Example 1: In a human resources information system, an employee's resume corresponds to three base tables: the employee basic information table, the social relations table, and the work history table. This is a typical case of one source document corresponding to multiple entities.
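The mapping in Example 1 can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (employee, social_relation, work_history), their columns, the sample data, and the save_resume helper are all hypothetical and not taken from any real HR system.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE social_relation (
    relation_id  INTEGER PRIMARY KEY,
    employee_id  INTEGER NOT NULL REFERENCES employee(employee_id),
    relative     TEXT NOT NULL,
    relationship TEXT NOT NULL
);
CREATE TABLE work_history (
    history_id  INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    employer    TEXT NOT NULL,
    start_year  INTEGER NOT NULL
);
"""


def save_resume(conn, resume):
    """Split one resume document (a dict) across the three base tables."""
    cur = conn.execute("INSERT INTO employee (name) VALUES (?)", (resume["name"],))
    employee_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO social_relation (employee_id, relative, relationship) "
        "VALUES (?, ?, ?)",
        [(employee_id, r["name"], r["relationship"]) for r in resume["relations"]],
    )
    conn.executemany(
        "INSERT INTO work_history (employee_id, employer, start_year) "
        "VALUES (?, ?, ?)",
        [(employee_id, j["employer"], j["start_year"]) for j in resume["jobs"]],
    )
    conn.commit()
    return employee_id


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# One source document (made-up sample data) ...
resume = {
    "name": "Li Lei",
    "relations": [
        {"name": "Li Ming", "relationship": "father"},
        {"name": "Wang Fang", "relationship": "mother"},
    ],
    "jobs": [{"employer": "Acme", "start_year": 2001}],
}
# ... stored as rows in three entities.
emp_id = save_resume(conn, resume)
relation_count = conn.execute(
    "SELECT COUNT(*) FROM social_relation WHERE employee_id = ?", (emp_id,)
).fetchone()[0]
job_count = conn.execute(
    "SELECT COUNT(*) FROM work_history WHERE employee_id = ?", (emp_id,)
).fetchone()[0]
```

The data-entry interface can then present a single resume form while the program writes to the three tables behind it.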
2. Primary keys and foreign keys. In general, an entity cannot be without both a primary key and a foreign key. In an E-R diagram, an entity at a leaf position may or may not define a primary key (because it has no descendants), but it must have a foreign key (because it has a parent). The design of primary and foreign keys plays an important role in the design of a global database. After completing the design of a global database, one American database design expert said: "Keys, keys everywhere, nothing but keys." This sums up his database design experience and reflects his highly abstract view of the core of an information system (the data model). The reason: a primary key is a high-level abstraction of an entity, and the pairing of a primary key with a foreign key represents the connection between entities.
3. Properties of base tables. A base table differs from an intermediate table or a temporary table in four properties: (1) Atomicity. The fields of a base table cannot be decomposed further. (2) Primitiveness. The records in a base table are records of original (basic) data. (3) Deducibility. All output data can be derived from the data in the base tables and code tables. (4) Stability. The structure of a base table is relatively stable, and its records are kept for a long time. Once these properties are understood, base tables can be told apart from intermediate tables and temporary tables during design.
4. Normal form standards. The relationship between a base table and its fields should satisfy third normal form as far as possible. However, a design that satisfies third normal form is often not the best design. To improve the running efficiency of the database, it is often necessary to lower the normal form standard: add an appropriate amount of redundancy to trade space for time.
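One way to picture this space-for-time trade-off is a goods table that stores an amount next to the unit price and quantity it is derived from. The sketch below uses Python's built-in sqlite3 module; the table name goods, its columns, the add_goods helper, and the sample rows are assumptions made for illustration. The CHECK constraint is one possible way to keep the stored copy consistent, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE goods (
    goods_id   INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    amount     INTEGER NOT NULL,            -- redundant: unit_price * quantity
    CHECK (amount = unit_price * quantity)  -- keeps the stored copy consistent
)
""")


def add_goods(conn, name, unit_price, quantity):
    """Insert a row, computing the redundant amount once at write time."""
    conn.execute(
        "INSERT INTO goods (name, unit_price, quantity, amount) VALUES (?, ?, ?, ?)",
        (name, unit_price, quantity, unit_price * quantity),
    )


add_goods(conn, "television", 2500, 40)
add_goods(conn, "radio", 300, 10)

# The report reads the stored column instead of multiplying on every query.
total_stored = conn.execute("SELECT SUM(amount) FROM goods").fetchone()[0]
# The fully normalized alternative recomputes the product for each row.
total_computed = conn.execute(
    "SELECT SUM(unit_price * quantity) FROM goods"
).fetchone()[0]
```

Both queries return the same total; the denormalized design pays for the extra column in storage and at write time so that read-heavy statistics can skip the multiplication.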
Example 2: Consider a base table that stores goods, shown in Table 1. The presence of the "amount" field means the design does not satisfy third normal form, because "amount" can be obtained by multiplying "unit price" by "quantity", so "amount" is a redundant field. However, adding the redundant "amount" field speeds up query statistics, which is exactly the practice of trading space for time. In Rose 2002, columns are of two types: data columns and computed columns. A column such as "amount" is called a computed column, while columns such as "unit price" and "quantity" are called data columns.

Table 1. Structure of the goods table

Product Name   Product Model   Unit Price   Quantity   Amount
Television     29-inch         2,500        40         100,000

5. Understand the three normal forms in plain terms. A plain understanding of the three normal forms is a great help in database design. To apply them well, it is enough to understand them informally (an informal understanding is sufficient; it is not the most rigorous or precise one). First normal form (1NF) is an atomicity constraint on attributes: every attribute must be atomic and not further decomposable. Second normal form (2NF) is a uniqueness constraint on records: every record must have a unique identifier, that is, entities are unique. Third normal form (3NF) is a redundancy constraint on fields: no field can be derived from other fields, that is, fields are not redundant. A database design with no redundancy is achievable. However, a database without redundancy is not necessarily the best database; sometimes, to improve running efficiency, the normal form standard must be lowered and some redundant data deliberately kept. The concrete approach is to follow third normal form when designing the conceptual data model and to leave the lowering of the normal form to the design of the physical data model. Lowering the normal form means adding fields and allowing redundancy.
6. Identify and correctly handle many-to-many relationships. If a many-to-many relationship exists between two entities, it should be eliminated. The way to eliminate it is to add a third entity between the two, so that the original many-to-many relationship becomes two one-to-many relationships. The attributes of the original two entities must then be distributed sensibly among the three entities. The third entity is in essence a more complex relationship, and it corresponds to a base table. Generally speaking, database design tools cannot identify many-to-many relationships, but they can handle them.
Example 3: In a library information system, "Book" is an entity and "Reader" is also an entity. The relationship between them is a typical many-to-many relationship: one book can be borrowed by many readers at different times, and one reader can borrow many books. A third entity must therefore be added between the two. It is named "Borrowing", and its attributes are the borrowing time and a borrowing flag (0 means borrowed, 1 means returned). It must also have two foreign keys (the primary key of "Book" and the primary key of "Reader") so that it can connect to both "Book" and "Reader".
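Example 3 can be sketched as follows with Python's built-in sqlite3 module. The table and column names (book, reader, borrowing, borrowed_at, status), the sample rows, and the two query helpers are illustrative assumptions; only the shape (a third table holding two foreign keys, a borrowing time, and a 0/1 flag) comes from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- The third entity: one row per borrowing event.
CREATE TABLE borrowing (
    borrowing_id INTEGER PRIMARY KEY,
    book_id      INTEGER NOT NULL REFERENCES book(book_id),
    reader_id    INTEGER NOT NULL REFERENCES reader(reader_id),
    borrowed_at  TEXT    NOT NULL,
    status       INTEGER NOT NULL DEFAULT 0 CHECK (status IN (0, 1))  -- 0 borrowed, 1 returned
);

INSERT INTO book   VALUES (1, 'SQL Basics'), (2, 'Data Modeling');
INSERT INTO reader VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO borrowing (book_id, reader_id, borrowed_at, status) VALUES
    (1, 1, '2006-01-05', 1),
    (1, 2, '2006-02-10', 0),
    (2, 1, '2006-02-11', 0);
""")


def readers_of(conn, book_id):
    """Names of all readers who have ever borrowed the given book."""
    rows = conn.execute(
        "SELECT DISTINCT r.name FROM borrowing b "
        "JOIN reader r ON r.reader_id = b.reader_id "
        "WHERE b.book_id = ? ORDER BY r.name",
        (book_id,),
    )
    return [name for (name,) in rows]


def books_of(conn, reader_id):
    """Titles of all books the given reader has ever borrowed."""
    rows = conn.execute(
        "SELECT DISTINCT k.title FROM borrowing b "
        "JOIN book k ON k.book_id = b.book_id "
        "WHERE b.reader_id = ? ORDER BY k.title",
        (reader_id,),
    )
    return [title for (title,) in rows]
```

Each side of the original many-to-many relationship is now an ordinary one-to-many join through the borrowing table.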
7. How to choose primary key (PK) values. The PK is a tool that programmers use to join tables. It can be a numeric string with no physical meaning, generated by the program by automatically adding 1. It can also be a field name, or a combination of field names, that has physical meaning. The former is better than the latter. When the PK is a combination of fields, keep the number of fields small: with too many, the index takes up more space and access is slower.
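A minimal sketch of the first option, an automatically incremented key with no business meaning, using Python's built-in sqlite3 module. The goods table, its columns, and the add_goods helper are hypothetical. The UNIQUE constraint is an added assumption showing one way to keep the meaningful field combination unique without making it the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE goods (
    goods_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, no business meaning
    name     TEXT NOT NULL,
    model    TEXT NOT NULL,
    UNIQUE (name, model)                         -- the meaningful combination stays unique
)
""")


def add_goods(conn, name, model):
    """Insert a row and return the key value the database generated for it."""
    cur = conn.execute(
        "INSERT INTO goods (name, model) VALUES (?, ?)", (name, model)
    )
    return cur.lastrowid


first_id = add_goods(conn, "television", "29-inch")
second_id = add_goods(conn, "television", "34-inch")
```

Child tables then need only the single goods_id column as their foreign key, rather than a copy of both name and model.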
8. Understand data redundancy correctly. The repeated appearance of primary keys and foreign keys in multiple tables is not data redundancy. This concept must be clear, yet many people are still unclear about it. The repeated appearance of non-key fields is data redundancy, and a low-level kind: repetitive redundancy. High-level redundancy is not the repeated appearance of a field but the derived appearance of a field.
Example 4: Among the goods fields "unit price", "quantity", and "amount", "amount" is obtained by multiplying "unit price" by "quantity". It is redundant, and it is high-level redundancy. The purpose of redundancy is to increase processing speed. Only low-level redundancy increases data inconsistency, because the same data may be entered many times at different times, in different places, and by different roles. We therefore advocate high-level redundancy (derived redundancy) and oppose low-level redundancy (repetitive redundancy).
9. There is no standard answer for an E-R diagram. The E-R diagram of an information system has no standard answer, because its design and drawing are not unique; as long as it covers the business scope and functional content of the system requirements, it is feasible. Otherwise, the E-R diagram must be revised. Although there is no single standard answer, that does not mean it can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, concise associations, a moderate number of entities, reasonable assignment of attributes, and no low-level redundancy.
10. View technology is very useful in database design. Unlike base tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of its data source. A view is a window through which programmers use the database; it is a form of combining base table data, a method of data processing, and a means of keeping user data confidential. To support complex processing, keep computation fast, and save storage space, the definition depth of views should generally not exceed three layers. If three layers of views are still not enough, define a temporary table on the view and then define a view on the temporary table. By alternating definitions in this way, the depth of views is no longer limited.
For information systems that touch on national political, economic, technological, military, and security interests, the role of views is even more important. Once the physical design of the base tables of such a system is complete, a first layer of views is created on the base tables immediately. The number and structure of the views in this layer are identical to the number and structure of the base tables. It is then stipulated that all programmers may operate only on the views. Only the database administrator, holding a "security key" that is jointly controlled by several people, can operate directly on the base tables. Readers are invited to consider: why is this?
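The first-layer view idea can be sketched as below with Python's built-in sqlite3 module. The names employee_base and employee, the columns, and the names_by_salary helper are hypothetical. SQLite has no user accounts or permissions, so this sketch shows only the structural part (a view that mirrors the base table and is the only object application code refers to); actually denying programmers access to the base tables would rely on the permission mechanism of the DBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Base table: in the scheme described above, only the DBA touches this directly.
CREATE TABLE employee_base (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT    NOT NULL,
    salary      INTEGER NOT NULL
);

-- First-layer view: same columns and structure as the base table.
CREATE VIEW employee AS
    SELECT employee_id, name, salary FROM employee_base;

INSERT INTO employee_base VALUES (1, 'Ann', 5000), (2, 'Bob', 4200);
""")


def names_by_salary(conn, minimum):
    """Application code queries the view, never the base table."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE salary >= ? ORDER BY name", (minimum,)
    )
    return [name for (name,) in rows]


well_paid = names_by_salary(conn, 4500)
everyone = names_by_salary(conn, 0)
```

Because programs depend only on the view, the base table can later be restructured or restricted while the view keeps presenting the same columns.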
11. Intermediate tables, reports, and temporary tables. An intermediate table stores statistical data. It is designed for data warehouses, output reports, or query results, and it sometimes has no primary key or foreign key (except in data warehouses). A temporary table is designed by an individual programmer to store temporary records for that programmer's own use. Base tables and intermediate tables are maintained by the DBA, while temporary tables are maintained automatically by the programmer's own programs.
12. Integrity constraints take three forms. Domain integrity: implemented with Check constraints. In database design tools, when the value range of a field is defined, there is a Check button through which the domain of the field is defined. Referential integrity: implemented with PK, FK, and table-level triggers. User-defined integrity: these are business rules, implemented with stored procedures and triggers.
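The three kinds of integrity can be sketched together using Python's built-in sqlite3 module. Everything named here is an assumption for illustration: the department and employee tables, the age range of 18 to 65, and the "salary may not decrease" business rule. SQLite has no stored procedures, so the user-defined rule is shown with a trigger only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    dept_id     INTEGER NOT NULL REFERENCES department(dept_id),  -- referential integrity
    age         INTEGER NOT NULL CHECK (age BETWEEN 18 AND 65),   -- domain integrity
    salary      INTEGER NOT NULL
);

-- User-defined integrity: a hypothetical business rule enforced by a trigger.
CREATE TRIGGER no_salary_cut
BEFORE UPDATE OF salary ON employee
WHEN NEW.salary < OLD.salary
BEGIN
    SELECT RAISE(ABORT, 'salary may not decrease');
END;

INSERT INTO department VALUES (1, 'R&D');
INSERT INTO employee VALUES (1, 1, 30, 5000);
""")


def rejected(conn, sql):
    """Return True if the database refuses the statement, False if it is accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


domain_violation = rejected(conn, "INSERT INTO employee VALUES (2, 1, 12, 3000)")
reference_violation = rejected(conn, "INSERT INTO employee VALUES (3, 99, 30, 3000)")
rule_violation = rejected(conn, "UPDATE employee SET salary = 4000 WHERE employee_id = 1")
raise_rejected = rejected(conn, "UPDATE employee SET salary = 6000 WHERE employee_id = 1")
```

The first three statements are refused by the CHECK constraint, the foreign key, and the trigger respectively, while the last one, a salary increase, goes through.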
13. The way to prevent patch-style database design is the "Three Less" principle.
(1) The fewer tables in a database, the better. Only when the number of tables is small can the system's E-R diagram be called small and refined: repeated and superfluous entities have been removed, a high-level abstraction of the objective world has been formed, the system's data has been integrated, and patch-style design has been prevented.
(2) The fewer fields in the composite primary key of a table, the better. A primary key serves two purposes: building the primary key index, and acting as the foreign key of child tables. Fewer fields in a composite primary key therefore save both running time and index storage space.
(3) The fewer fields in a table, the better. Only when the number of fields is small can it be said that there is no data repetition in the system and little data redundancy. More importantly, it pushes the reader to learn "column-to-row" conversion, which prevents fields of a child table from being pulled into the primary table and leaving many empty fields there. "Column-to-row" conversion means pulling part of the content out of the primary table and building a separate child table for it. The method is very simple, yet some people are not used to it, do not adopt it, and do not apply it.
The practical principle of database design is to find a suitable balance between data redundancy and processing speed. "Three Less" is a holistic concept and a comprehensive view; no single principle can be taken in isolation. The principle is relative, not absolute. A "Three More" principle, however, is certainly wrong. Consider: if the same system functions are covered, an E-R diagram of one hundred entities (one thousand attributes in total) is certainly much better than an E-R diagram of two hundred entities (two thousand attributes in total). Advocating the "Three Less" principle means asking readers to learn to use database design techniques to integrate the system's data.
The steps of data integration are to integrate file systems into application databases, application databases into subject databases, and subject databases into a global integrated database. The higher the degree of integration, the stronger the data sharing, the fewer the information islands, and the fewer the entities, primary keys, and attributes in the global E-R diagram of the whole enterprise information system. The purpose of advocating the "Three Less" principle is to prevent readers from using patching techniques to keep adding to, deleting from, and modifying the database until the enterprise database becomes a "garbage heap" of arbitrarily designed tables, or a "jumble" of tables, in which base tables, code tables, intermediate tables, and temporary tables are disorderly and countless, and the information system of the enterprise or institution can no longer be maintained and breaks down. Anyone can follow a "Three More" principle; it is the fallacious doctrine behind designing databases by the "patching method". The "Three Less" principle is a principle of few but refined. It demands a high level of database design skill and art, and not everyone can achieve it, because it is the theoretical basis for eliminating database design by the "patching method".
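The idea under (3) of pulling part of a primary table out into a separate child table can be sketched as follows with Python's built-in sqlite3 module. The example is hypothetical throughout: the employee_wide table with its phone1, phone2, and phone3 columns, the narrow employee and employee_phone tables, the sample rows, and the columns_to_rows helper are assumptions chosen only to show the technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Wide design: most of the phone columns stay empty for most rows.
CREATE TABLE employee_wide (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    phone1 TEXT, phone2 TEXT, phone3 TEXT
);
INSERT INTO employee_wide VALUES
    (1, 'Ann', '555-0100', NULL, NULL),
    (2, 'Bob', '555-0101', '555-0102', NULL);

-- Narrow design: the repeated columns become rows of a child table.
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE employee_phone (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    phone       TEXT    NOT NULL,
    PRIMARY KEY (employee_id, phone)
);
""")


def columns_to_rows(conn):
    """Copy the wide table into the narrow pair, one child row per filled column."""
    conn.execute(
        "INSERT INTO employee SELECT employee_id, name FROM employee_wide"
    )
    for column in ("phone1", "phone2", "phone3"):  # fixed names, not user input
        conn.execute(
            f"INSERT INTO employee_phone "
            f"SELECT employee_id, {column} FROM employee_wide "
            f"WHERE {column} IS NOT NULL"
        )
    conn.commit()


columns_to_rows(conn)
phone_rows = conn.execute("SELECT COUNT(*) FROM employee_phone").fetchone()[0]
empty_cells_before = conn.execute(
    "SELECT SUM((phone1 IS NULL) + (phone2 IS NULL) + (phone3 IS NULL)) "
    "FROM employee_wide"
).fetchone()[0]
```

The narrow design stores only the phone numbers that exist, and a fourth number for some employee needs a new row rather than a new column.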
14. Ways to improve database running efficiency. Under given system hardware and system software conditions, the ways to improve the running efficiency of a database system are:
(1) In the physical design of the database, lower the normal form, add redundancy, use fewer triggers, and use more stored procedures.
(2) When a calculation is very complex and the number of records is very large (for example, 10 million), the complex calculation should first be done outside the database, in file-system mode, and only after it is complete should the results be loaded and appended to the table. This is experience from the design of telecommunications billing systems.
(3) If a table is found to have too many records, for example more than 10 million, split it horizontally. Horizontal splitting takes a certain value of the table's primary key PK as the boundary and divides the records of the table into two tables. If a table is found to have too many fields, for example more than 80, split it vertically, decomposing the original table into two tables.
(4) Tune the database management system (DBMS) itself, that is, optimize the various system parameters, such as the number of buffers.
(5) When programming with the data-oriented SQL language, use optimized algorithms wherever possible.
In summary, to improve the running efficiency of a database, effort is needed at three levels at the same time: optimization at the database system level, at the database design level, and at the program implementation level.
The 14 tricks above have been gradually summed up by many people through a great deal of database analysis and design practice. In applying this experience, readers should not copy it mechanically or memorize it by rote, but should digest and understand it, seek truth from facts, and apply it flexibly, and gradually come to develop it through application and apply it through development.
Http://www.chinahtml.com/databases/2/2006/msql11377841443242.SHTML