Repost: "A Brief Talk on Database Design Techniques"
When it comes to databases, I don't think you can avoid talking about data structures first. In 1996, when I first started learning computer programming, our teacher told us: computer program = data structure + algorithm. Although software development has since shifted from a mainly procedural style to a mainly object-oriented one, I still fully agree with the formula our teacher gave us 8 years ago: computer program = data structure + algorithm. In object-oriented development, the first thing to do is analyze the data the whole program has to process and extract an abstract template from it, design classes from that template, then gradually add the functions that process the data (that is, the algorithms), and finally assign access permissions to the class's data members and functions to achieve encapsulation.
The earliest prototype of the database is said to have been the account book of a dairy farm in the United States (on paper, which shows that a database is not necessarily data stored in a computer ^_^). It recorded the farm's income and expenses, and the programmers who organized it and entered it into a computer took their inspiration from it. Once the amount of data collected according to a predefined data structure grew large enough, programmers, for the sake of execution efficiency, split off the retrieval, update, and maintenance functions into a separately callable module. That module gradually developed and evolved into the database management system (DBMS), now an important branch of software development.
Now to the main topic. First, based on the programs I have personally come across, let me sort database designers into classes by skill level:

1. Programmers who have never studied data structures systematically. Their works are often just improvised toys. They tend to design only a handful of tables, cram all the data for one kind of function into a single table, and leave the tables almost entirely unrelated to each other. A lot of free management software online is like this. While the program's features are limited and the data volume is small, it runs without any problems, but using it to manage important data is very risky.
2. Programmers who have studied data structures systematically but have not yet developed management software with demanding efficiency requirements. Most of them graduated not long ago. When designing table structures they follow the textbook to the letter and stick rigidly to E-R diagrams and 3NF (don't be discouraged; every database design master started from this step). Their work is good enough for ordinary lightweight, Access-style management software. But once the system needs new features, the existing tables pretty much need a complete overhaul.
3. Class-2 programmers who, after going through several rounds of performance tuning and feature upgrades, have finally become database design veterans, the experts in the eyes of class-1 programmers. They are capable of developing medium-sized commercial data management systems with more than twenty tables. They know when to keep some redundant data to improve efficiency, and their designs extend well: when users need new features, the existing tables require only minor changes.
4. Class-3 programmers who, after designing ten or so similar database management applications over and over, stuck with it instead of changing careers and hoped to find a "lazy" shortcut in the repetition. They gradually have an awakening and complete the transition from quantitative change to qualitative change. The table structures they design show foresight: they anticipate the data that future upgrades will need and lay the groundwork in advance. Most of these programmers have now moved on to become senior software developers in data mining.
5. Class-3 or class-4 programmers who, after studying the principles and development of the existing database management systems in some depth, either do secondary development on top of them or develop a general-purpose database management system of their own with independent copyright.
I am personally at the tail end of class 3, so the design techniques listed below are only suitable for class-2 and some class-3 database designers. Also, because I rarely meet peers interested in digging into this area, the text will inevitably contain errors and omissions. Let me state that up front: corrections are welcome, so don't keep your knowledge to yourself 8)

I. Tree-structured data tables

Many programmers run into tree-structured data when designing a database, for example the common category table: a top-level category with several subcategories, some of which have subcategories of their own. When the categories are not fixed, users want to add a new subcategory under any category or delete a category together with all its subcategories, and the number of categories is expected to grow over time, we consider storing this data in a single table. Following the textbook, a class-2 programmer will probably design a table structure like this:
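Category table 1 (Type_table_1)
Name         Type      Constraint  Description
type_id      INT       Unique      Category identifier, primary key
type_name    CHAR(50)  Not null    Category name
type_father  INT       Not null    Identifier of the parent category; for the top node, set to a unique value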
This design is short and compact, fully satisfies 3NF, and meets all the user's requirements. So is that all there is to it? The answer is NO! Why?
Let's consider how users will want this table's data listed. A user will naturally expect all categories to be listed at once, following the hierarchy they have set up, for example:

Total Category
  Category 1
    Category 1.1
      Category 1.1.1
    Category 1.2
  Category 2
    Category 2.1
  Category 3
    Category 3.1
    Category 3.2
  ......
How many queries against the table above does it take to produce such a listing (a pre-order traversal of the tree)? Note that although Category 1.1.1 may have been added after Category 3.2, the answer is still N queries. That has little impact on a small amount of data, but once the categories grow to dozens or even hundreds of records, listing them just once means querying the table dozens of times, and the performance of the whole program becomes hard to defend. A class-2 programmer might say: then I'll build a temporary array or temporary table that stores the pre-order traversal of the category table, so the dozens of queries happen only on the first run, and every later listing reads the temporary array or table directly. In fact there is no need to allocate new memory for this data. Just extend the table a little and put a limit on the number of categories that can be added, and the listing above takes a single query. Here is the extended table structure:
Category table 2 (Type_table_2)
Name         Type      Constraint                                  Description
type_id      INT       Unique                                      Category identifier, primary key
type_name    CHAR(50)  Not null                                    Category name
type_father  INT       Not null                                    Identifier of the parent category; for the top node, set to a unique value
type_layer   CHAR(6)   Limited to 3 levels, initial value 000000   Pre-order traversal code of the category, mainly to reduce the number of database queries
With this table structure, let's see what the records from the example above look like in the table:
type_id  type_name       type_father  type_layer
1        Total Category  0            000000
2        Category 1      1            010000
3        Category 1.1    2            010100
4        Category 1.2    2            010200
5        Category 2      1            020000
6        Category 2.1    5            020100
7        Category 3      1            030000
8        Category 3.1    7            030100
9        Category 3.2    7            030200
10       Category 1.1.1  3            010101
......

Now query by the value of type_layer:

SELECT * FROM Type_table_2 ORDER BY type_layer
The resulting record set is as follows:
type_id  type_name       type_father  type_layer
1        Total Category  0            000000
2        Category 1      1            010000
3        Category 1.1    2            010100
10       Category 1.1.1  3            010101
4        Category 1.2    2            010200
5        Category 2      1            020000
6        Category 2.1    5            020100
7        Category 3      1            030000
8        Category 3.1    7            030100
9        Category 3.2    7            030200
......
The records are now listed in exactly pre-order traversal order. To control the indentation when displaying the hierarchy, just examine the value in the type_layer field two digits at a time: for each group greater than 0, indent one step further to the right. The limits in this example are at most 3 levels and at most 99 subcategories per level; by changing the length and digit grouping of type_layer to suit the user's needs, you can change the number of levels and subcategories allowed. In fact, this design is not confined to category tables: most of the forum programs that display discussions as a tree use a similar design.
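The single-query listing described above can be sketched as follows. This is only a sketch under stated assumptions: it uses Python's standard sqlite3 module with an in-memory database as a stand-in for the article's back end, the helper names (depth_of, list_categories, build_demo) are hypothetical, and two spaces per level is an arbitrary indentation choice.

```python
import sqlite3

# Sample rows from the article; "Category 1.1.1" is deliberately inserted last.
ROWS = [
    (1, "Total Category", 0, "000000"),
    (2, "Category 1", 1, "010000"),
    (3, "Category 1.1", 2, "010100"),
    (4, "Category 1.2", 2, "010200"),
    (5, "Category 2", 1, "020000"),
    (6, "Category 2.1", 5, "020100"),
    (7, "Category 3", 1, "030000"),
    (8, "Category 3.1", 7, "030100"),
    (9, "Category 3.2", 7, "030200"),
    (10, "Category 1.1.1", 3, "010101"),
]


def depth_of(type_layer):
    """Count the 2-digit groups of type_layer that are greater than 0."""
    groups = [type_layer[i:i + 2] for i in range(0, len(type_layer), 2)]
    return sum(1 for g in groups if int(g) > 0)


def list_categories(conn):
    """Return the whole tree in pre-order using a single query."""
    cursor = conn.execute(
        "SELECT type_name, type_layer FROM Type_table_2 ORDER BY type_layer"
    )
    # Indent two spaces per level (an assumption made for this sketch).
    return ["  " * depth_of(layer) + name for name, layer in cursor]


def build_demo():
    """Create Type_table_2 in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Type_table_2 (
               type_id     INTEGER PRIMARY KEY,
               type_name   CHAR(50) NOT NULL,
               type_father INTEGER  NOT NULL,
               type_layer  CHAR(6)  NOT NULL DEFAULT '000000'
           )"""
    )
    conn.executemany("INSERT INTO Type_table_2 VALUES (?, ?, ?, ?)", ROWS)
    return conn


if __name__ == "__main__":
    print("\n".join(list_categories(build_demo())))
```

Because type_layer is stored as fixed-width text, plain string ordering is enough to place Category 1.1.1 between Category 1.1 and Category 1.2 even though it was inserted last.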
Some may think the type_father field in Type_table_2 is redundant and can be removed. But without it, inserting or deleting a category would require fairly tedious checks on the contents of type_layer, so I did not eliminate type_father. This is in line with the database design principle of keeping an appropriate amount of redundant data to reduce program complexity. Later I will give a case where data redundancy is added deliberately.
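To illustrate why keeping type_father simplifies inserts, here is one possible way to compute the type_layer of a new category. The procedure is my own assumption, not something the article specifies: look up the parent's code, find the highest-numbered sibling through type_father, and take the next number. The names add_category, groups_of, and build_demo are hypothetical, and SQLite (via Python's sqlite3) again stands in for the real back end.

```python
import sqlite3

WIDTH = 2   # digits per level in type_layer
LEVELS = 3  # number of levels type_layer can encode


def groups_of(type_layer):
    """Split '010200' into ['01', '02', '00']."""
    return [type_layer[i:i + WIDTH] for i in range(0, WIDTH * LEVELS, WIDTH)]


def add_category(conn, name, father_id):
    """Insert a child of father_id and return the type_layer it was given."""
    (father_layer,) = conn.execute(
        "SELECT type_layer FROM Type_table_2 WHERE type_id = ?", (father_id,)
    ).fetchone()
    depth = sum(1 for g in groups_of(father_layer) if int(g) > 0)
    if depth >= LEVELS:
        raise ValueError("parent is already at the deepest level")
    # type_father locates the siblings directly, with no pattern matching
    # on type_layer.
    (last_sibling,) = conn.execute(
        "SELECT MAX(type_layer) FROM Type_table_2 WHERE type_father = ?",
        (father_id,),
    ).fetchone()
    if last_sibling is None:
        number = 1
    else:
        number = int(groups_of(last_sibling)[depth]) + 1
    if number >= 10 ** WIDTH:
        raise ValueError("no free sibling number left at this level")
    prefix = father_layer[:depth * WIDTH]
    layer = (prefix + str(number).zfill(WIDTH)).ljust(WIDTH * LEVELS, "0")
    conn.execute(
        "INSERT INTO Type_table_2 (type_name, type_father, type_layer) "
        "VALUES (?, ?, ?)",
        (name, father_id, layer),
    )
    return layer


def build_demo():
    """Create Type_table_2 in memory with a few of the article's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Type_table_2 (
               type_id     INTEGER PRIMARY KEY,
               type_name   CHAR(50) NOT NULL,
               type_father INTEGER  NOT NULL,
               type_layer  CHAR(6)  NOT NULL DEFAULT '000000'
           )"""
    )
    conn.executemany(
        "INSERT INTO Type_table_2 VALUES (?, ?, ?, ?)",
        [
            (1, "Total Category", 0, "000000"),
            (2, "Category 1", 1, "010000"),
            (3, "Category 1.1", 2, "010100"),
            (4, "Category 1.2", 2, "010200"),
        ],
    )
    return conn
```

For example, add_category(conn, "Category 1.3", 2) finds the siblings of the new row with a simple equality test on type_father. Without that column, the same lookup would need prefix matching and zero-group checks on type_layer.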
II. Designing the product information table

Suppose you are a developer in the computer department of a department store. One day the boss asks you to develop an online e-commerce platform for the company. The store sells thousands of products, but for now it plans to sell only a few dozen easy-to-ship products online; more products will of course be added to the platform later. You now begin designing the product information tables of the platform's database. Every product on sale has the same attributes: product number, product name, product category, related information, supplier, units per package, stock, purchase price, sale price, and discount price. You quickly design 3 tables: the product type table (Wares_type), the supplier table (Wares_provider), and the product information table (Wares_info):
Product type table (Wares_type)
Name         Type      Constraint                                  Description
type_id      INT       Unique                                      Category identifier, primary key
type_name    CHAR(50)  Not null                                    Category name
type_father  INT       Not null                                    Identifier of the parent category; for the top node, set to a unique value
type_layer   CHAR(6)   Limited to 3 levels, initial value 000000   Pre-order traversal code of the category, mainly to reduce the number of database queries
Supplier table (Wares_provider)
Name           Type       Constraint  Description
provider_id    INT        Unique      Supplier identifier, primary key
provider_name  CHAR(100)  Not null    Supplier name
Product information table (Wares_info)
Name        Type       Constraint       Description
wares_id    INT        Unique           Product identifier, primary key
wares_name  CHAR(100)  Not null         Product name
wares_type  INT        Not null         Product type identifier, references Wares_type.type_id
wares_info  CHAR(200)  Nullable         Related information
provider    INT        Not null         Supplier identifier, references Wares_provider.provider_id
setnum      INT        Initial value 1  Units per package, default 1
stock       INT        Initial value 0  Stock, default 0
buy_price   MONEY      Not null         Purchase price
sell_price  MONEY      Not null         Sale price
discount    MONEY      Not null         Discount price

You take these three tables to the boss for review. The boss would like a product image field added, though only some products have images. OK, you add a BOOL field haspic to the product information table (Wares_info), then create a new table, the product image table (Wares_pic):
Product image table (Wares_pic)
Name         Type       Constraint  Description
pic_id       INT        Unique      Product image identifier, primary key
wares_id     INT        Not null    Identifier of the owning product, references Wares_info.wares_id
pic_address  CHAR(200)  Not null    Image storage path
Once development is finished, the program fully meets the boss's current requirements and goes into service. After a while the boss plans to launch new products on the platform, and a certain type of product needs a "length" attribute. The first round of churn arrives... Naturally, following the same approach used for the product image table, you add another BOOL flag field to the product information table (Wares_info) and create another new table, the product length table (Wares_length):
Product length table (Wares_length)
Name       Type      Constraint  Description
length_id  INT       Unique      Product length identifier, primary key
wares_id   INT       Not null    Identifier of the owning product, references Wares_info.wares_id
length     CHAR(20)  Not null    Description of the product length
No sooner have you finished than the boss plans to add another batch of products, and this time the products need a "width" attribute. You grit your teeth, apply the same prescription, and add a product width table (Wares_width). Some time later, some of the new products need a "height" attribute. Are you starting to feel that a database growing this way will soon turn into a maze? Is there any way to curb this kind of unpredictable yet repetitive database bloat? While reading "Agile Software Development: Principles, Patterns, and Practices", I found that the author gives a similar example: 7.3, the "Copy" program. I very much agree with this point of agile development: do almost no up-front design at first, but once the requirements change, a programmer who pursues excellence should review the whole architecture from the top and, in that revision, design an architecture that can accommodate similar changes in the future. Below is the modification I would propose at the point where the "length" attribute is needed:
Remove the haspic field from the product information table (Wares_info), and add 2 tables, the product extra property table (Wares_ex_property) and the product extra information table (Wares_ex_info), to support adding new attributes.
Product extra property table (Wares_ex_property)
Name    Type      Constraint  Description
ex_pid  INT       Unique      Product extra property identifier, primary key
p_name  CHAR(20)  Not null    Extra property name

Product extra information table (Wares_ex_info)
Name            Type       Constraint  Description
ex_iid          INT        Unique      Product extra information identifier, primary key
wares_id        INT        Not null    Identifier of the owning product, references Wares_info.wares_id
property_id     INT        Not null    Product extra property identifier, references Wares_ex_property.ex_pid
property_value  CHAR(200)  Not null    Value of the product's extra property
Add 2 records to the product extra property table (Wares_ex_property):

ex_pid  p_name
1       Product image
2       Product length
Then add an extra-property management function to the back-end administration of the e-commerce platform. From then on, when a new product brings a new attribute, you only need to add one record to the product extra property table (Wares_ex_property). Don't be afraid of change. Being hit by the first bullet is not a bad thing; what is bad is being hit by the second and third bullets flying in on the same trajectory. The earlier the first bullet comes, the heavier the wound, and the stronger the resistance afterward 8)
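The two-table scheme above might be exercised like this. It is a minimal sketch: SQLite via Python's sqlite3 stands in for the real back end, Wares_info is cut down to two columns for brevity, the sample product and its values are invented for illustration, and the helper names (add_property, set_extra, extras_of) are hypothetical.

```python
import sqlite3


def build_demo():
    """Create the three tables in memory with the article's two properties."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Wares_info (
            wares_id   INTEGER PRIMARY KEY,
            wares_name CHAR(100) NOT NULL
        );
        CREATE TABLE Wares_ex_property (
            ex_pid INTEGER PRIMARY KEY,
            p_name CHAR(20) NOT NULL
        );
        CREATE TABLE Wares_ex_info (
            ex_iid         INTEGER PRIMARY KEY,
            wares_id       INTEGER NOT NULL REFERENCES Wares_info(wares_id),
            property_id    INTEGER NOT NULL REFERENCES Wares_ex_property(ex_pid),
            property_value CHAR(200) NOT NULL
        );
        INSERT INTO Wares_ex_property VALUES (1, 'Product image');
        INSERT INTO Wares_ex_property VALUES (2, 'Product length');
        -- Hypothetical sample product and values.
        INSERT INTO Wares_info VALUES (1, 'Curtain rod');
        INSERT INTO Wares_ex_info (wares_id, property_id, property_value)
            VALUES (1, 1, '/pic/rod.jpg');
        INSERT INTO Wares_ex_info (wares_id, property_id, property_value)
            VALUES (1, 2, '2 m');
        """
    )
    return conn


def add_property(conn, name):
    """Register a new kind of attribute; no schema change is needed."""
    cur = conn.execute(
        "INSERT INTO Wares_ex_property (p_name) VALUES (?)", (name,)
    )
    return cur.lastrowid


def set_extra(conn, wares_id, property_id, value):
    """Attach one extra attribute value to a product."""
    conn.execute(
        "INSERT INTO Wares_ex_info (wares_id, property_id, property_value) "
        "VALUES (?, ?, ?)",
        (wares_id, property_id, value),
    )


def extras_of(conn, wares_id):
    """Return {property name: value} for one product."""
    rows = conn.execute(
        """SELECT p.p_name, i.property_value
           FROM Wares_ex_info AS i
           JOIN Wares_ex_property AS p ON p.ex_pid = i.property_id
           WHERE i.wares_id = ?
           ORDER BY p.ex_pid""",
        (wares_id,),
    )
    return dict(rows.fetchall())
```

When "width" arrives, add_property(conn, "Product width") followed by set_extra(...) is all that is required; the table layout stays as it was. The trade-off is that every value is stored as text, so typed comparisons (for example, sorting by numeric length) have to be handled by the application.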
III. Designing multi-user support and permission management

When developing database management software, you cannot avoid the question of multiple users and user permissions. Although the large and medium-sized back-end database systems on the market all provide multi-user support, with permission settings as fine-grained as a single table within a database, my personal recommendation is that a mature database management application should still implement user management itself, for two reasons:

1. The multi-user and permission features provided by large and medium-sized back-end database systems address the properties common to all databases, and cannot necessarily meet the needs of particular special cases.
2. Do not rely too heavily on special features of the back-end database system; the various large and medium-sized back-end database systems are not fully compatible with one another. Otherwise, once you need to switch database platforms or upgrade the back-end database version, the earlier architecture may well not be reusable.
Let's see how to design a fairly flexible multi-user management module ourselves, one in which the system administrator of the database management software can add new users, modify the permissions of existing users, and delete existing users. First, analyze the user requirements and list all the functions the software needs to implement. Then group these functions by how they relate, that is, put the functions needed by one type of user into one class. Finally, start building the tables:

Function table (Function_table)
Name    Type      Constraint  Description
f_id    INT       Unique      Function identifier, primary key
f_name  CHAR(20)  Not null    Function name
f_desc  CHAR(50)  Nullable    Function description
User group table (User_group)
Name         Type       Constraint  Description
group_id     INT        Unique      User group identifier, primary key
group_name   CHAR(20)   Not null    User group name
group_power  CHAR(100)  Nullable    User group permission list; its content is a collection of f_id values from the function table
User table (User_table)
Name       Type      Constraint  Description
user_id    INT       Unique      User identifier, primary key
user_name  CHAR(20)  Unique      User name
user_pwd   CHAR(20)  Not null    User password
user_type  INT       Not null    Identifier of the user group the user belongs to, references User_group.group_id
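With these three tables, a front-end permission check could look roughly like the sketch below. The article does not say how the f_id collection is encoded inside group_power, so a comma-separated list is assumed here. The function names, group names, and users are all invented, the helpers allowed_functions and can_use are hypothetical, and SQLite via Python's sqlite3 stands in for the real back end.

```python
import sqlite3


def build_demo():
    """Create the three tables in memory with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Function_table (
            f_id   INTEGER PRIMARY KEY,
            f_name CHAR(20) NOT NULL,
            f_desc CHAR(50)
        );
        CREATE TABLE User_group (
            group_id    INTEGER PRIMARY KEY,
            group_name  CHAR(20) NOT NULL,
            group_power CHAR(100)   -- assumed format: comma-separated f_id list
        );
        CREATE TABLE User_table (
            user_id   INTEGER PRIMARY KEY,
            user_name CHAR(20) NOT NULL UNIQUE,
            user_pwd  CHAR(20) NOT NULL,
            user_type INTEGER NOT NULL REFERENCES User_group(group_id)
        );
        INSERT INTO Function_table VALUES (1, 'view_stock', NULL);
        INSERT INTO Function_table VALUES (2, 'edit_stock', NULL);
        INSERT INTO Function_table VALUES (3, 'manage_users', NULL);
        INSERT INTO User_group VALUES (1, 'admin', '1,2,3');
        INSERT INTO User_group VALUES (2, 'clerk', '1,2');
        -- Placeholder passwords; a real system should store a salted hash.
        INSERT INTO User_table VALUES (1, 'alice', 'placeholder', 1);
        INSERT INTO User_table VALUES (2, 'bob', 'placeholder', 2);
        """
    )
    return conn


def allowed_functions(conn, user_name):
    """Return the set of f_id values granted to the user's group."""
    row = conn.execute(
        """SELECT g.group_power
           FROM User_table AS u
           JOIN User_group AS g ON g.group_id = u.user_type
           WHERE u.user_name = ?""",
        (user_name,),
    ).fetchone()
    if row is None or not row[0]:
        return set()
    return {int(part) for part in row[0].split(",") if part.strip()}


def can_use(conn, user_name, f_name):
    """Check in the front end whether a user may call a named function."""
    row = conn.execute(
        "SELECT f_id FROM Function_table WHERE f_name = ?", (f_name,)
    ).fetchone()
    return row is not None and row[0] in allowed_functions(conn, user_name)
```

Changing a group's group_power value immediately changes what every member of that group may do, without touching any row in User_table.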
With this user-group architecture, adding a new user only requires specifying the user group the new user belongs to. When the system later needs new functions or changes to the permissions of old functions, you only touch the records of the function table and the user group table, and the functions available to existing users change accordingly. Of course, this architecture moves the permission checks of the database management software to the front end, which makes front-end development somewhat more complex. But when the number of users is large (more than 10), or the software is likely to be upgraded, the price is worth paying.

IV. A concise batch m:n design

For an m:n relationship, the usual approach is to build 3 tables: one for m, one for n, and one for m:n. But m:n sometimes involves batch processing. Take borrowing books from a library: users are generally allowed to borrow n books at once. If borrowing records must be queried by batch, that is, listing all the books a given user borrowed on a given occasion, how should it be designed? Let's first build the 3 necessary tables:
Book table (Book_table)
Name       Type       Constraint  Description
book_id    INT        Unique      Book identifier, primary key
book_no    CHAR(20)   Unique      Book number
book_name  CHAR(100)  Not null    Book title
...
Borrower table (Renter_table)
Name         Type      Constraint  Description
renter_id    INT       Unique      User identifier, primary key
renter_name  CHAR(20)  Not null    User name
...
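Borrowing record table (Rent_log)
Name       Type      Constraint  Description
rent_id    INT       Unique      Borrowing record identifier, primary key
r_id       INT       Not null    User identifier, references Renter_table.renter_id
b_id       INT       Not null    Book identifier, references Book_table.book_id
rent_date  DATETIME  Not null    Borrowing time
...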
To query borrowing records by batch, we could build one more table to hold the batch information, for example:
Batch borrowing table (Batch_rent)
Name        Type      Constraint  Description
batch_id    INT       Unique      Batch borrowing identifier, primary key
batch_no    INT       Not null    Batch borrowing number; records of the same batch share the same batch_no
rent_id     INT       Not null    Borrowing record identifier, references Rent_log.rent_id
batch_date  DATETIME  Not null    Batch borrowing time
Is this a good design? Let's see what query is needed to list all the books a user borrowed on one occasion. First query the batch borrowing table (Batch_rent) and save the rent_id values of all matching records, then use those values as the condition for a query on the borrowing record table (Rent_log). Is there a way to improve this? Here is a concise batch design that needs no new table, only a change to the borrowing record table (Rent_log). The modified table is as follows:
Borrowing record table (Rent_log)
Name       Type      Constraint  Description
rent_id    INT       Unique      Borrowing record identifier, primary key
r_id       INT       Not null    User identifier, references Renter_table.renter_id
b_id       INT       Not null    Book identifier, references Book_table.book_id
batch_no   INT       Not null    Batch borrowing number; records of the same batch share the same batch_no
rent_date  DATETIME  Not null    Borrowing time
...
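One way to write and read batches with this table is sketched below, assuming each batch takes the rent_id of its first inserted row as its batch_no. The insert-then-update step is my own choice for obtaining that id, not something the article prescribes. The names borrow_batch and books_in_latest_batch, the sample ids, and the dates are hypothetical, and SQLite via Python's sqlite3 stands in for the T-SQL back end.

```python
import sqlite3


def build_demo():
    """Create Rent_log in memory with one seed row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Rent_log (
               rent_id   INTEGER PRIMARY KEY,
               r_id      INTEGER NOT NULL,
               b_id      INTEGER NOT NULL,
               batch_no  INTEGER NOT NULL,
               rent_date TEXT    NOT NULL
           )"""
    )
    # Seed row so that the current maximum rent_id is 64.
    conn.execute("INSERT INTO Rent_log VALUES (64, 1, 100, 64, '2004-01-01')")
    return conn


def borrow_batch(conn, renter_id, book_ids, rent_date):
    """Insert one row per book; every row carries the first row's rent_id."""
    batch_no = None
    with conn:  # one transaction per batch
        for book_id in book_ids:
            cur = conn.execute(
                "INSERT INTO Rent_log (r_id, b_id, batch_no, rent_date) "
                "VALUES (?, ?, 0, ?)",
                (renter_id, book_id, rent_date),
            )
            rent_id = cur.lastrowid
            if batch_no is None:
                batch_no = rent_id  # the first row of the batch names it
            conn.execute(
                "UPDATE Rent_log SET batch_no = ? WHERE rent_id = ?",
                (batch_no, rent_id),
            )
    return batch_no


def books_in_latest_batch(conn, renter_id):
    """List the books of a user's most recent batch with one nested query."""
    rows = conn.execute(
        """SELECT b_id FROM Rent_log
           WHERE batch_no = (SELECT MAX(batch_no) FROM Rent_log WHERE r_id = ?)
           ORDER BY rent_id""",
        (renter_id,),
    )
    return [b_id for (b_id,) in rows]
```

Because a batch_no is itself a rent_id, it is unique across users, so the inner query alone identifies the batch and no separate batch table has to be joined.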
Here, the batch_no of one borrowing is the same as the rent_id of the first record of that batch to be inserted. For example, suppose the current maximum rent_id is 64 and a user then borrows 3 books at once: the batch_no of all 3 inserted records is 65. If another user then rents a set of discs, the rent_id of the newly inserted rental record is 68. With this design, querying batch borrowing information takes just one standard T-SQL nested query. Of course this design does not conform to 3NF, but compared with the standard 3NF design above, which is better? I don't need to spell out the answer.

V. Choosing when to keep redundant data

The earlier section on tree-structured data tables kept a redundant field. The example here goes a step further and adds a redundant table. Here is the case: to provide working meals for employees, the company I used to work for made an arrangement with a nearby restaurant. Meals were put on a tab every day, the cost was split evenly among those who ate, the company settled the bill at the end of the month, and each person's monthly meal cost was deducted from their salary. Naturally, who ate and how many ate varied from day to day, and since different dishes were ordered for each meal, the cost of each meal varied too. For example: Monday lunch, 5 people, 40 yuan; Monday dinner, 2 people, 20 yuan; Tuesday lunch, 6 people, 36 yuan; Tuesday dinner, 3 people, 18 yuan. To make it easy to calculate each person's monthly meal cost, I wrote a simple meal accounting program with 3 tables in its database:
Employee table (Clerk_table)
Name        Type      Constraint  Description
clerk_id    INT       Unique      Employee identifier, primary key
clerk_name  CHAR(10)  Not null    Employee name
Per-meal summary table (Eatdata1)
Name         Type   Constraint  Description
totle_id     INT    Unique      Per-meal summary identifier, primary key
...
...          ...    Not null    Meal type, used to distinguish lunch from dinner
totle_price  MONEY  Not null    Total cost of the meal
persons_num  INT    Not null    Number of people at the meal
Meal billing detail table (Eatdata2)
Name   Type   Constraint  Description
id     INT    Unique      Meal billing detail identifier, primary key
t_id   INT    Not null    Per-meal summary identifier, references Eatdata1.totle_id
c_id   INT    Not null    Employee identifier, references Clerk_table.clerk_id
price  MONEY  Not null    Cost per person for the meal
Here, the records of the meal billing detail table (Eatdata2) are simply each record of the per-meal summary table (Eatdata1) split evenly across the employees who ate, which makes it a redundant table through and through. Alternatively, some fields of Eatdata1 could be merged into Eatdata2, making Eatdata1 the redundant table instead, but a detail table designed that way would contain even more duplicated data, so the scheme above is the better of the two. Still, it is precisely this redundant table, Eatdata2, that greatly simplifies the programming when computing monthly meal statistics. A single query along these lines yields each person's monthly meal count and total meal cost:
SELECT clerk_name AS personname, COUNT(c_id) AS eattimes, SUM(price) AS ptprice
FROM Eatdata2
JOIN Clerk_table ON (c_id = clerk_id)
JOIN Eatdata1 ON (totle_id = t_id)
WHERE eat_date >= CONVERT(datetime, '"&the_date&"')
AND eat_date ...

Imagine how much more trouble it would be to total each person's monthly meal cost without this redundant table, and the program's efficiency would suffer too. So when exactly is it acceptable to add some redundant data? I think there are 2 principles:
1. The user's overall needs. Users often care most about data that is listed after the database's normalized records have been processed by some algorithm. If that algorithm can be carried out directly with the built-in functions of the back-end database system, it is reasonable to add redundant fields, or even redundant tables, to store the processed data. Bear in mind that for querying, modifying, or deleting large volumes of data, the back-end database system is far more efficient than code we write ourselves.
2. Reducing development complexity. In modern software development there are many ways to implement the same function. Programmers need not master most development tools and platforms, but they do need to know which approach is simpler and more efficient. The essence of redundant data is trading space for time, and since hardware is currently advancing much faster than software, an appropriate amount of redundancy is acceptable. But let me stress once more: do not rely too heavily on the platform and development tools to simplify development. If you misjudge this balance, later maintenance and upgrades will come to grief.
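To make the trade-off concrete, the meal-accounting example from section V can be sketched end to end. Several things here are assumptions rather than the author's code: SQLite via Python's sqlite3 replaces the T-SQL back end, REAL replaces MONEY, Eatdata1 is cut down to the columns this sketch needs, dates are stored as ISO strings, and the upper date bound and GROUP BY clause are my guess at how a monthly query would be completed. The clerk names, dates, and who ate which meal are invented, and the helper names are hypothetical.

```python
import sqlite3


def build_demo():
    """Create the three tables in memory with six hypothetical clerks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Clerk_table (
            clerk_id   INTEGER PRIMARY KEY,
            clerk_name CHAR(10) NOT NULL
        );
        CREATE TABLE Eatdata1 (
            totle_id    INTEGER PRIMARY KEY,
            eat_date    TEXT    NOT NULL,   -- ISO date string (assumption)
            totle_price REAL    NOT NULL,   -- MONEY in the article's T-SQL
            persons_num INTEGER NOT NULL
        );
        CREATE TABLE Eatdata2 (
            id    INTEGER PRIMARY KEY,
            t_id  INTEGER NOT NULL REFERENCES Eatdata1(totle_id),
            c_id  INTEGER NOT NULL REFERENCES Clerk_table(clerk_id),
            price REAL    NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO Clerk_table VALUES (?, ?)",
        [(i, "clerk%d" % i) for i in range(1, 7)],
    )
    return conn


def record_meal(conn, eat_date, total_price, clerk_ids):
    """Store one meal plus its evenly split per-person (redundant) rows."""
    with conn:
        cur = conn.execute(
            "INSERT INTO Eatdata1 (eat_date, totle_price, persons_num) "
            "VALUES (?, ?, ?)",
            (eat_date, total_price, len(clerk_ids)),
        )
        share = total_price / len(clerk_ids)
        conn.executemany(
            "INSERT INTO Eatdata2 (t_id, c_id, price) VALUES (?, ?, ?)",
            [(cur.lastrowid, clerk_id, share) for clerk_id in clerk_ids],
        )


def monthly_totals(conn, year, month):
    """Return {clerk_name: (meal count, total cost)} for one month."""
    start = "%04d-%02d-01" % (year, month)
    end = "%04d-%02d-01" % (year + month // 12, month % 12 + 1)
    rows = conn.execute(
        """SELECT clerk_name, COUNT(c_id), SUM(price)
           FROM Eatdata2
           JOIN Clerk_table ON c_id = clerk_id
           JOIN Eatdata1 ON totle_id = t_id
           WHERE eat_date >= ? AND eat_date < ?
           GROUP BY c_id, clerk_name""",
        (start, end),
    )
    return {name: (times, total) for name, times, total in rows}
```

With the four meals from the article recorded (40 yuan for 5 people, 20 for 2, 36 for 6, 18 for 3), someone who attended all four owes 8 + 10 + 6 + 6 = 30 yuan, and the database's own COUNT and SUM produce that figure without any per-meal division in application code.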