Reprint: "Talking about Database Design Skills"

xiaoxiao2021-03-05  47

When talking about databases, I don't think you can avoid talking about data structures first. In 1996, when I first studied computer programming, my teacher told us: computer program = data structure + algorithm. Although program development has since gradually moved to object orientation, eight years later I still endorse the formula my teacher gave us: computer program = data structure + algorithm. In object-oriented development, the first thing to do is analyze the data the whole program has to process, abstract a template from it, design classes from this template, then add the functions that process the data (i.e. the algorithms), and finally divide the data members and functions of each class by access rights to achieve encapsulation.

The earliest prototype of a database is said to have been a ledger kept by a dairy farm in the United States (on paper, which shows that a database is not necessarily data stored in a computer ^_^). It recorded the farm's income and expenditure accounts, and a programmer was inspired while transcribing it into a computer. Once the data collected according to a prescribed data structure grew large enough, programmers, for the sake of execution efficiency, separated out the retrieval, update and maintenance functions into an independently callable module. This module slowly developed and evolved into the database management system (DBMS), now an important branch of program development.

Now to the topic. First, let me roughly classify the database design skill of the programmers I have personally met:

1. Programmers who have never studied data structures systematically. Their works are often just impromptu toys. They tend to use only a handful of tables, cram all the data of a given function into one table, and leave the tables almost unrelated to each other. Much of the free management software on the Internet is like this. When the program's functions are limited and there is little data, it runs without any problem, but using it to manage more important data is very risky.

2. Programmers who have studied data structures systematically but have not yet developed management software with demanding efficiency requirements. Most of them have only recently graduated. When designing table structures they follow the textbook strictly, sticking rigidly to E-R diagrams and 3NF (don't be discouraged; every database design master started from this step). Their works are good enough for general lightweight Access-style management software. But once the system needs new functions, the original tables almost need a complete overhaul.

3. Programmers of the second type who, after living through several rounds of efficiency improvements and functional upgrades, have finally become old hands at database design, top experts in the eyes of the first type. They are capable of developing medium-sized commercial data management systems with more than twenty tables. They know when to keep some redundant data to improve efficiency, and their designs are more extensible: when users need new functions, the original tables need only a small amount of modification.

4. Programmers of the third type who, after designing some ten similar database management programs over and over, have stuck with it rather than changing careers, hoping to find some "lazy" tricks in the work, and who gradually make the leap from quantitative to qualitative change. The table structures they design show foresight: they can predict the data that future upgrades will need and leave room for it in advance. Most such programmers have now moved up to become senior software developers in data mining.

5. Programmers of the third or fourth type who, after studying the principles and development of existing database management systems, either build on them through secondary development or develop a general database management system with their own independent copyright.

I am personally at the tail end of the third type, so the design techniques listed below are only suitable for the second type and some of the third type of database designers. Also, because I rarely meet peers interested in this area, errors and omissions in this text are inevitable. I declare this up front and welcome everyone's corrections; don't hold back 8)

1. Tree-structured data tables

Many programmers run into tree-structured data when designing a database. A common example is category data: one large category has several subcategories, and some subcategories have subcategories of their own. When the categories are not fixed, when users want to add a new subcategory under any category or delete a category together with all its subcategories, and when the number of categories is expected to grow gradually, we consider using a data table to store them. Following the textbook, a programmer of the second type will probably design a table structure like this:

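Category Table 1 (Type_Table_1)
Name         Type      Constraint  Description
type_id      INT       unique      Category identifier, primary key
type_name    CHAR(50)  not null    Category name
type_father  INT       not null    Identifier of the parent category; for the top node, set to a unique value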

This design is short and concise, fully satisfies 3NF, and can meet all of the user's requirements. Is that all there is to it? The answer is NO! Why?

Let's consider how users will want to list the data in this table. A user will certainly expect all categories to be listed according to the hierarchy he set, for example:

Total Category
  Category 1
    Category 1.1
      Category 1.1.1
    Category 1.2
  Category 2
    Category 2.1
  Category 3
    Category 3.1
    Category 3.2
......

How many queries against the table above does it take to produce such a list (a pre-order traversal of the tree)? Note that Category 1.1.1 may well have been added after Category 3.2; the answer is still N queries. That cost hardly matters for a small amount of data, but once the category table grows to dozens or even hundreds of records, listing the categories once means querying the table dozens of times, and the efficiency of the whole program is nothing to be proud of. A programmer of the second type might say: then I will build a temporary array or temporary table that stores the pre-order traversal of the category table, so the table is queried that many times only on the first run, and every later listing reads the temporary array or table directly. In fact, there is no need to allocate new memory for this data. It is enough to extend the table a little and put a limit on the number of categories that can be added; the list above can then be produced with a single query. Below is the extended table structure:

Category Table 2 (Type_Table_2)
Name         Type      Constraint                                   Description
type_id      INT       unique                                       Category identifier, primary key
type_name    CHAR(50)  not null                                     Category name
type_father  INT       not null                                     Identifier of the parent category; for the top node, set to a unique value
type_layer   CHAR(6)   limited to 3 levels, initial value 000000    Pre-order traversal position of the category, mainly used to reduce the number of database queries

With this table structure, let's see what the data from the example above looks like when stored in the table:

type_id  type_name       type_father  type_layer
1        Total Category  0            000000
2        Category 1      1            010000
3        Category 1.1    2            010100
4        Category 1.2    2            010200
5        Category 2      1            020000
6        Category 2.1    5            020100
7        Category 3      1            030000
8        Category 3.1    7            030100
9        Category 3.2    7            030200
10       Category 1.1.1  3            010101
......

Now retrieve the rows ordered by type_layer:

SELECT * FROM Type_Table_2 ORDER BY type_layer

The records are listed in the following order:

type_id  type_name       type_father  type_layer
1        Total Category  0            000000
2        Category 1      1            010000
3        Category 1.1    2            010100
10       Category 1.1.1  3            010101
4        Category 1.2    2            010200
5        Category 2      1            020000
6        Category 2.1    5            020100
7        Category 3      1            030000
8        Category 3.1    7            030100
9        Category 3.2    7            030200
......

The records are now listed in exactly pre-order traversal order. To control the display level of each category, just examine the value of the type_layer field two digits at a time: for each two-digit group greater than 0, indent the name one more step to the right. Of course, the limits set in this example are at most 3 levels and at most 99 subcategories per level; by changing the length and the digits per group of type_layer according to the user's needs, you can change the limits on levels and subcategories. In fact, this design is not only used for category tables; most online forum programs that display threads as a tree use a similar design.
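The single-query listing described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for whatever back end the article had in mind, and the helper names depth and list_tree are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.execute(
    """CREATE TABLE Type_Table_2 (
           type_id     INTEGER PRIMARY KEY,
           type_name   TEXT NOT NULL,
           type_father INTEGER NOT NULL,
           type_layer  TEXT NOT NULL DEFAULT '000000'
       )"""
)
rows = [
    (1, "Total Category", 0, "000000"),
    (2, "Category 1", 1, "010000"),
    (3, "Category 1.1", 2, "010100"),
    (4, "Category 1.2", 2, "010200"),
    (5, "Category 2", 1, "020000"),
    (6, "Category 2.1", 5, "020100"),
    (7, "Category 3", 1, "030000"),
    (8, "Category 3.1", 7, "030100"),
    (9, "Category 3.2", 7, "030200"),
    (10, "Category 1.1.1", 3, "010101"),  # added last, still sorts into place
]
conn.executemany("INSERT INTO Type_Table_2 VALUES (?, ?, ?, ?)", rows)


def depth(type_layer):
    """Count the two-digit groups of type_layer that are greater than 00."""
    groups = [type_layer[i:i + 2] for i in range(0, len(type_layer), 2)]
    return sum(1 for g in groups if g != "00")


def list_tree(conn):
    """Return the whole tree in pre-order using one query."""
    cur = conn.execute(
        "SELECT type_name, type_layer FROM Type_Table_2 ORDER BY type_layer"
    )
    # Indent each name by two spaces per non-zero group.
    return ["  " * depth(layer) + name for name, layer in cur]


tree_lines = list_tree(conn)
print("\n".join(tree_lines))
```

Because type_layer is a fixed-width string, ordinary string ordering is enough to produce the traversal, which is why no recursion or per-node query is needed.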

Some may think the type_father field in Type_Table_2 is redundant and can be removed. But then, whenever a category is inserted or deleted, the program would have to make rather tedious judgments on the content of type_layer. So I did not remove type_father; this also follows the database design principle of keeping appropriate redundant data to reduce program complexity. Later I will present a case of deliberately adding redundant data.
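To illustrate why keeping type_father makes insertion simple, here is a minimal sketch of adding a category. It is not the author's code: the function add_category, the use of Python's sqlite3 module, and the rule "next sibling number = largest existing sibling number + 1" are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.execute(
    "CREATE TABLE Type_Table_2 ("
    " type_id INTEGER PRIMARY KEY,"
    " type_name TEXT NOT NULL,"
    " type_father INTEGER NOT NULL,"
    " type_layer TEXT NOT NULL DEFAULT '000000')"
)
conn.execute("INSERT INTO Type_Table_2 VALUES (1, 'Total Category', 0, '000000')")

MAX_LEVELS = 3  # three two-digit groups in type_layer


def add_category(conn, name, father_id):
    """Insert a child of father_id and return (type_id, type_layer)."""
    row = conn.execute(
        "SELECT type_layer FROM Type_Table_2 WHERE type_id = ?", (father_id,)
    ).fetchone()
    if row is None:
        raise ValueError("unknown parent category")
    groups = [row[0][i:i + 2] for i in range(0, 2 * MAX_LEVELS, 2)]
    level = sum(1 for g in groups if g != "00")  # depth of the parent
    if level >= MAX_LEVELS:
        raise ValueError("at most 3 levels are allowed")
    # type_father finds the siblings directly, with no parsing of other rows.
    (last,) = conn.execute(
        "SELECT MAX(type_layer) FROM Type_Table_2 WHERE type_father = ?",
        (father_id,),
    ).fetchone()
    next_no = int(last[level * 2:level * 2 + 2]) + 1 if last else 1
    if next_no > 99:
        raise ValueError("at most 99 subcategories per level")
    groups[level] = f"{next_no:02d}"
    new_layer = "".join(groups)
    cur = conn.execute(
        "INSERT INTO Type_Table_2 (type_name, type_father, type_layer)"
        " VALUES (?, ?, ?)",
        (name, father_id, new_layer),
    )
    return cur.lastrowid, new_layer


c1 = add_category(conn, "Category 1", 1)
c2 = add_category(conn, "Category 2", 1)
c11 = add_category(conn, "Category 1.1", c1[0])
c111 = add_category(conn, "Category 1.1.1", c11[0])
print(c1, c2, c11, c111)
```

Without type_father, finding the siblings of a new node would mean pattern-matching on type_layer prefixes, which is the tedious judgment the paragraph above refers to.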

2. Design of the product information table

Suppose you are a developer in the computer department of a department store. One day the boss asks you to develop an online e-commerce platform for the company. The store sells several thousand kinds of goods, but for now only a few dozen that are easy to ship will be sold online; of course, more goods may be added to the platform later. You now start designing the product information tables of the platform database. Every product on sale has the same attributes, such as product number, product name, product category, related information, supplier, number of pieces included, stock, purchase price, selling price and discount price. You quickly design 3 tables: the product type table (Wares_Type), the supplier table (Wares_Provider) and the product information table (Wares_Info):

Product type table (Wares_Type)
Name         Type      Constraint                                   Description
type_id      INT       unique                                       Category identifier, primary key
type_name    CHAR(50)  not null                                     Category name
type_father  INT       not null                                     Identifier of the parent category; for the top node, set to a unique value
type_layer   CHAR(6)   limited to 3 levels, initial value 000000    Pre-order traversal position of the category, mainly used to reduce the number of database queries

Supplier table (Wares_Provider)
Name           Type       Constraint  Description
provider_id    INT        unique      Supplier identifier, primary key
provider_name  CHAR(100)  not null    Supplier name

Product information table (Wares_Info)
Name        Type       Constraint       Description
wares_id    INT        unique           Product identifier, primary key
wares_name  CHAR(100)  not null         Product name
wares_type  INT        not null         Product type identifier, associated with Wares_Type.type_id
wares_info  CHAR(200)  nullable         Related information
provider    INT        not null         Supplier identifier, associated with Wares_Provider.provider_id
setnum      INT        initial value 1  Number of pieces included, default 1
stock       INT        initial value 0  Stock, default 0
buy_price   MONEY      not null         Purchase price
sell_price  MONEY      not null         Selling price
discount    MONEY      not null         Discount price

You take these 3 tables to the boss for review. The boss wants to add a product image field, but only some of the goods have images. OK, you add a haspic BOOL field to the product information table (Wares_Info), and then create a new table, the product image table (Wares_Pic):

Product image table (Wares_Pic)
Name         Type       Constraint  Description
pic_id       INT        unique      Product image identifier, primary key
wares_id     INT        not null    Product the image belongs to, associated with Wares_Info.wares_id
pic_address  CHAR(200)  not null    Image storage path

With the program finished and fully meeting the boss's current requirements, it officially goes live. After a while, the boss plans to sell a new batch of goods on the platform, and one kind of them needs an added "length" attribute. The first round of trouble begins... Of course, following the example of the product image table, you add a BOOL flag field for length to the product information table (Wares_Info) and create a new table, the product length table (Wares_Length):

Product length table (Wares_Length)
Name       Type      Constraint  Description
length_id  INT       unique      Product length identifier, primary key
wares_id   INT       not null    Product the record belongs to, associated with Wares_Info.wares_id
length     CHAR(20)  not null    Description of the product length

No sooner have you finished than the boss plans to list yet another batch of goods, and this time some of them need a "width" attribute. You grit your teeth, follow the same recipe, and add a product width table (Wares_Width). Some time later, the boss has new goods that need a "height" attribute. Are you starting to feel that the database you designed, growing this way, will soon become a maze? So is there a way to curb this kind of unpredictable yet repetitive database expansion? While reading "Agile Software Development: Principles, Patterns, and Practices" I found that the author gives a similar example: section 7.3, the "Copy" program. I strongly agree with the agile view expressed there: at the start, do almost no up-front design, but once requirements change, a programmer who pursues excellence should re-examine the architecture from the beginning and, in this revision, design a system architecture that can accommodate similar changes in the future. Here is the modification scheme I would offer at the point when the "length" attribute needs to be added: remove the haspic field from the product information table (Wares_Info), and add 2 tables, the product extra property table (Wares_ex_property) and the product extra information table (Wares_ex_info), to support adding new attributes.

Product extra property table (Wares_ex_property)
Name    Type      Constraint  Description
ex_pid  INT       unique      Product extra property identifier, primary key
p_name  CHAR(20)  not null    Extra property name

Product extra information table (Wares_ex_info)
Name            Type       Constraint  Description
ex_iid          INT        unique      Product extra information identifier, primary key
wares_id        INT        not null    Product the record belongs to, associated with Wares_Info.wares_id
property_id     INT        not null    Product extra property identifier, associated with Wares_ex_property.ex_pid
property_value  CHAR(200)  not null    Value of the product's extra property

Add 2 records to the product extra property table (Wares_ex_property):

ex_pid  p_name
1       Product Image
2       Product Length

Then add an extra-attribute management function to the back-end administration of the e-commerce platform. When new goods with a new attribute appear later, you only need to use this function to add one record to the product extra property table (Wares_ex_property). Don't be afraid of change. Being hit by the first bullet is not a bad thing; the bad thing is being hit by the second and third bullets flying along the same trajectory. The earlier the first bullet arrives, the heavier the wound, and the stronger the resistance afterwards 8)
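The two extra-property tables can be exercised with a short sketch like the one below. It is an illustration only: Wares_Info is reduced to two columns, SQLite via Python's sqlite3 module stands in for the real back end, and the helper functions add_property, set_property and get_properties as well as the sample product are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.executescript(
    """
    CREATE TABLE Wares_Info (          -- simplified to two columns for the sketch
        wares_id   INTEGER PRIMARY KEY,
        wares_name TEXT NOT NULL
    );
    CREATE TABLE Wares_ex_property (
        ex_pid INTEGER PRIMARY KEY,
        p_name TEXT NOT NULL
    );
    CREATE TABLE Wares_ex_info (
        ex_iid         INTEGER PRIMARY KEY,
        wares_id       INTEGER NOT NULL REFERENCES Wares_Info(wares_id),
        property_id    INTEGER NOT NULL REFERENCES Wares_ex_property(ex_pid),
        property_value TEXT NOT NULL
    );
    INSERT INTO Wares_ex_property VALUES (1, 'Product Image');
    INSERT INTO Wares_ex_property VALUES (2, 'Product Length');
    INSERT INTO Wares_Info VALUES (1, 'Curtain rod');  -- hypothetical product
    """
)


def add_property(conn, p_name):
    """Register a new kind of attribute: one row, no schema change."""
    cur = conn.execute("INSERT INTO Wares_ex_property (p_name) VALUES (?)", (p_name,))
    return cur.lastrowid


def set_property(conn, wares_id, ex_pid, value):
    """Attach an attribute value to one product."""
    conn.execute(
        "INSERT INTO Wares_ex_info (wares_id, property_id, property_value)"
        " VALUES (?, ?, ?)",
        (wares_id, ex_pid, value),
    )


def get_properties(conn, wares_id):
    """Return {property name: value} for one product."""
    cur = conn.execute(
        "SELECT p.p_name, i.property_value"
        " FROM Wares_ex_info AS i"
        " JOIN Wares_ex_property AS p ON p.ex_pid = i.property_id"
        " WHERE i.wares_id = ?",
        (wares_id,),
    )
    return dict(cur.fetchall())


# The boss now wants "width": add a row instead of a Wares_Width table.
width_id = add_property(conn, "Product Width")
set_property(conn, 1, 2, "120 cm")
set_property(conn, 1, width_id, "3 cm")
props = get_properties(conn, 1)
print(props)
```

The trade-off of this design is that every value is stored as text, so type checks and numeric comparisons move into the application.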

3. Design of multi-user and permission management

When developing database management software, it is impossible not to consider multiple users and user permission settings. Although the large and medium-sized back-end database systems on the market all provide multi-user support and permissions that can be set down to a single table in a database, my personal suggestion is that a mature piece of database management software should still design its own user management, for two reasons:

1. The multi-user and permission settings provided by those large and medium-sized back-end database systems are aimed at the features common to all databases and cannot fully meet the needs of certain special cases.

2. Don't rely too heavily on special features of the back-end database system software; the various large and medium-sized back-end database systems are not fully compatible with each other. Otherwise, if you later need to switch database platforms or upgrade the back-end database version, the earlier architecture design may not be reusable.

Let's look at how to design a more flexible multi-user management module, in which the system administrator of the database management software can add new users, modify the permissions of existing users, and delete existing users at will. First, analyze the user requirements and list all the functions the software must implement. Then classify these functions according to how they relate, that is, group the functions that a certain type of user needs into one class. Finally, start building the tables:

Function table (Function_Table)
Name    Type      Constraint  Description
f_id    INT       unique      Function identifier, primary key
f_name  CHAR(20)  not null    Function name
f_desc  CHAR(50)  nullable    Function description

User group table (User_Group)
Name         Type       Constraint  Description
group_id     INT        unique      User group identifier, primary key
group_name   CHAR(20)   not null    User group name
group_power  CHAR(100)  nullable    Permission list of the user group; its content is a set of Function_Table f_id values

User table (User_Table)
Name       Type      Constraint  Description
user_id    INT       unique      User identifier, primary key
user_name  CHAR(20)  unique      User name
user_pwd   CHAR(20)  not null    User password
user_type  INT       not null    Identifier of the user group the user belongs to, associated with User_Group.group_id

With this user group architecture in place, adding a new user only requires specifying the user group the new user belongs to. When the system needs new functions or changes to existing function permissions, only the records in the function table and the user group table need to be edited, and the functions available to existing users change accordingly. Of course, this architecture moves the permission checks for the software's functions to the front end, which makes front-end development relatively more complex. But when the number of users is large (more than 10), or when the software is likely to be upgraded in the future, the price is worth paying.
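A front-end permission check against these tables might look like the sketch below. Several things here are assumptions rather than the author's design: group_power is stored as a comma-separated list of f_id values (the article only says it holds a set of them), the sample functions, groups and users are invented, SQLite via Python's sqlite3 module stands in for the real back end, and a real system should store password hashes instead of the placeholder text used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.executescript(
    """
    CREATE TABLE Function_Table (
        f_id INTEGER PRIMARY KEY, f_name TEXT NOT NULL, f_desc TEXT
    );
    CREATE TABLE User_Group (
        group_id INTEGER PRIMARY KEY, group_name TEXT NOT NULL, group_power TEXT
    );
    CREATE TABLE User_Table (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL UNIQUE,
        user_pwd  TEXT NOT NULL,
        user_type INTEGER NOT NULL REFERENCES User_Group(group_id)
    );
    -- hypothetical sample data
    INSERT INTO Function_Table VALUES (1, 'Add book', NULL);
    INSERT INTO Function_Table VALUES (2, 'Lend book', NULL);
    INSERT INTO Function_Table VALUES (3, 'Manage users', NULL);
    INSERT INTO User_Group VALUES (1, 'Administrators', '1,2,3');
    INSERT INTO User_Group VALUES (2, 'Clerks', '2');
    INSERT INTO User_Table VALUES (1, 'alice', 'placeholder', 1);
    INSERT INTO User_Table VALUES (2, 'bob', 'placeholder', 2);
    """
)


def allowed_functions(conn, user_id):
    """Return the set of f_id values granted to the user's group."""
    row = conn.execute(
        "SELECT g.group_power FROM User_Table AS u"
        " JOIN User_Group AS g ON g.group_id = u.user_type"
        " WHERE u.user_id = ?",
        (user_id,),
    ).fetchone()
    if row is None or not row[0]:
        return set()
    return {int(part) for part in row[0].split(",") if part.strip()}


def can_use(conn, user_id, f_id):
    """Front-end check performed before a function is offered to the user."""
    return f_id in allowed_functions(conn, user_id)


bob_before = can_use(conn, 2, 1)
# Changing one group row changes the rights of every user in that group.
conn.execute("UPDATE User_Group SET group_power = '1,2' WHERE group_id = 2")
bob_after = can_use(conn, 2, 1)
print(bob_before, bob_after)
```

Note that no row in User_Table was touched when the permissions changed, which is the point of routing rights through the group.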

4. A concise design for batch M:N relationships

When an M:N relationship comes up, we normally build 3 tables: M, N, and M:N. But sometimes an M:N relationship involves batch processing. For example, a library generally allows a user to borrow N books at once; if a batch query is required that lists all the books a user borrowed in one visit, how should it be designed? Let's first build the 3 obligatory tables:

Book table (Book_Table)
Name       Type       Constraint  Description
book_id    INT        unique      Book identifier, primary key
book_no    CHAR(20)   unique      Book number
book_name  CHAR(100)  not null    Book title
......

Borrower table (Renter_Table)
Name         Type      Constraint  Description
renter_id    INT       unique      User identifier, primary key
renter_name  CHAR(20)  not null    User name
......

Borrowing record table (Rent_log)
Name     Type  Constraint  Description
rent_id  INT   unique      Borrowing record identifier, primary key
......

To support batch queries of borrowing records, we can build another table to store the batch borrowing information, for example:

Batch borrowing table (Batch_rent)
Name        Type      Constraint  Description
batch_id    INT       unique      Batch borrowing identifier, primary key
batch_no    INT       not null    Batch borrowing number; records of the same batch share the same batch_no
rent_id     INT       not null    Borrowing record identifier, associated with Rent_log.rent_id
batch_date  DATETIME  not null    Batch borrowing time

Is this design good? Let's see what queries are needed to list all the books a user borrowed in one visit. First query the batch borrowing table (Batch_rent) and save the rent_id values of all matching records, then use those values as the condition for a query against the borrowing record table (Rent_log). Is there a way to improve on this? Below is a concise batch design that needs no new table; it only requires modifying the borrowing record table (Rent_log). The modified table is as follows:

Borrowing record table (Rent_log)
Name       Type      Constraint  Description
rent_id    INT       unique      Borrowing record identifier, primary key
......
batch_no   INT       not null    Batch borrowing number; records of the same batch share the same batch_no
rent_date  DATETIME  not null    Borrowing time
......

Here batch_no, shared by all records of one batch, is the rent_id of the first record inserted in that batch. For example, suppose the current maximum rent_id is 64, and a user then borrows 3 books at once: the batch_no of all 3 inserted records is 65. If another user then rents a set of discs, the rent_id of the inserted record is 68. With this design, querying the information of a batch borrowing takes only one standard T-SQL nested query. Of course, this design does not conform to 3NF, but compared with the standard 3NF design above, which is better? The answer goes without saying.
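The batch_no rule and the nested query can be tried out with the sketch below. It rests on assumptions: the columns r_id (borrower) and b_id (book) are hypothetical names for the columns elided in the table above, the functions borrow_batch and books_in_batch are invented for the example, and SQLite via Python's sqlite3 module is used instead of T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.execute(
    """CREATE TABLE Rent_log (
           rent_id   INTEGER PRIMARY KEY,
           r_id      INTEGER NOT NULL,   -- hypothetical: borrower identifier
           b_id      INTEGER NOT NULL,   -- hypothetical: book identifier
           batch_no  INTEGER NOT NULL,
           rent_date TEXT NOT NULL
       )"""
)

INSERT_SQL = (
    "INSERT INTO Rent_log (r_id, b_id, batch_no, rent_date) VALUES (?, ?, ?, ?)"
)


def borrow_batch(conn, renter_id, book_ids, rent_date):
    """Insert one batch; batch_no is the rent_id of its first record."""
    with conn:  # one transaction per batch
        cur = conn.execute(INSERT_SQL, (renter_id, book_ids[0], 0, rent_date))
        batch_no = cur.lastrowid
        conn.execute(
            "UPDATE Rent_log SET batch_no = ? WHERE rent_id = ?",
            (batch_no, batch_no),
        )
        conn.executemany(
            INSERT_SQL,
            [(renter_id, b, batch_no, rent_date) for b in book_ids[1:]],
        )
    return batch_no


def books_in_batch(conn, rent_id):
    """List every book borrowed in the same batch as rent_id (nested query)."""
    cur = conn.execute(
        "SELECT b_id FROM Rent_log"
        " WHERE batch_no = (SELECT batch_no FROM Rent_log WHERE rent_id = ?)"
        " ORDER BY rent_id",
        (rent_id,),
    )
    return [row[0] for row in cur]


# Reproduce the article's numbers: 64 earlier records, then a 3-book batch.
for i in range(64):
    borrow_batch(conn, 1, [1000 + i], "2024-01-01")
batch = borrow_batch(conn, 2, [1, 2, 3], "2024-01-02")
discs = borrow_batch(conn, 3, [9], "2024-01-02")
print(batch, discs, books_in_batch(conn, 66))
```

Inserting the first row and then reading back its generated key avoids guessing the next rent_id, which matters when several clients insert at the same time.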

5. Choosing redundant data

The "tree-structured data table" above kept one redundant field. The example here goes a step further and adds a redundant table. Here is the case: to solve the problem of employees' working meals, my former company made an arrangement with a nearby small restaurant. Meals were put on account every day, the cost was split equally among those who ate, the company settled the bill at the end of each month, and each person's meal charges were deducted from that month's salary. Of course, the people eating and their number were not fixed from day to day, and because different dishes were ordered each time, the cost of each meal differed too. For example, lunch on Monday for 5 people cost 40 yuan and dinner for 2 people cost 20 yuan; lunch on Tuesday for 6 people cost 36 yuan and dinner for 3 people cost 18 yuan. To make the monthly calculation of each person's meal charges easier, I wrote a rudimentary meal accounting program with 3 tables in the database:

Employee table (Clerk_Table)
Name        Type      Constraint  Description
clerk_id    INT       unique      Employee identifier, primary key
clerk_name  CHAR(10)  not null    Employee name

Per-meal summary table (Eatdata1)
Name         Type   Constraint  Description
totle_id     INT    unique      Per-meal summary identifier, primary key
......
...          ...    not null    Meal type, used to distinguish lunch from dinner
totle_price  MONEY  not null    Total cost of the meal
persons_num  INT    not null    Number of people who ate

Meal charge detail table (Eatdata2)
Name   Type   Constraint  Description
id     INT    unique      Detail record identifier, primary key
t_id   INT    not null    Per-meal summary identifier, associated with Eatdata1.totle_id
c_id   INT    not null    Employee identifier, associated with Clerk_Table.clerk_id
price  MONEY  not null    Per-person cost of the meal

The records in the meal charge detail table (Eatdata2) are simply each record of the per-meal summary table (Eatdata1) split out person by person, so the table is redundant through and through. Of course, some fields of the per-meal summary table (Eatdata1) could instead be merged into the detail table (Eatdata2), making the summary table (Eatdata1) the redundant one, but a detail table designed that way would contain more duplicated data, so the scheme above is the better of the two. It is precisely the redundant detail table (Eatdata2) that greatly simplifies programming when producing the monthly statistics: a query like the following is all it takes to compute each person's number of meals and total meal charges for the month:

SELECT clerk_name AS personname, COUNT(c_id) AS eattimes, SUM(price) AS ptprice FROM Eatdata2 JOIN Clerk_Table ON (c_id = clerk_id) JOIN Eatdata1 ON (totle_id = t_id) WHERE eat_date >= CONVERT(datetime, '" & the_date & "') AND eat_date ...
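A runnable version of this monthly statistic is sketched below, using the meal figures from the example. Note the assumptions: the source query is cut off after the second eat_date condition, so the upper bound used here (strictly before the first day of the next month) and the GROUP BY clause are my own completion, Eatdata1 is reduced to the columns the query needs, the dates and the employee names A to F are invented, and SQLite via Python's sqlite3 module stands in for the original back end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real back end
conn.executescript(
    """
    CREATE TABLE Clerk_Table (
        clerk_id INTEGER PRIMARY KEY, clerk_name TEXT NOT NULL
    );
    CREATE TABLE Eatdata1 (            -- reduced to the columns used below
        totle_id    INTEGER PRIMARY KEY,
        eat_date    TEXT NOT NULL,     -- ISO date text instead of DATETIME
        totle_price REAL NOT NULL,
        persons_num INTEGER NOT NULL
    );
    CREATE TABLE Eatdata2 (
        id    INTEGER PRIMARY KEY,
        t_id  INTEGER NOT NULL REFERENCES Eatdata1(totle_id),
        c_id  INTEGER NOT NULL REFERENCES Clerk_Table(clerk_id),
        price REAL NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO Clerk_Table VALUES (?, ?)",
    list(enumerate("ABCDEF", start=1)),  # hypothetical employees A..F
)


def record_meal(conn, eat_date, total, clerk_ids):
    """Store one meal plus its redundant per-person detail rows."""
    share = total / len(clerk_ids)  # cost is split equally
    with conn:
        cur = conn.execute(
            "INSERT INTO Eatdata1 (eat_date, totle_price, persons_num)"
            " VALUES (?, ?, ?)",
            (eat_date, total, len(clerk_ids)),
        )
        conn.executemany(
            "INSERT INTO Eatdata2 (t_id, c_id, price) VALUES (?, ?, ?)",
            [(cur.lastrowid, c, share) for c in clerk_ids],
        )


record_meal(conn, "2024-01-01", 40, [1, 2, 3, 4, 5])     # Monday lunch
record_meal(conn, "2024-01-01", 20, [1, 2])              # Monday dinner
record_meal(conn, "2024-01-02", 36, [1, 2, 3, 4, 5, 6])  # Tuesday lunch
record_meal(conn, "2024-01-02", 18, [4, 5, 6])           # Tuesday dinner


def monthly_report(conn, month_start, next_month_start):
    """Meals eaten and total charge per employee within one month."""
    cur = conn.execute(
        "SELECT clerk_name, COUNT(c_id), SUM(price)"
        " FROM Eatdata2"
        " JOIN Clerk_Table ON c_id = clerk_id"
        " JOIN Eatdata1 ON totle_id = t_id"
        " WHERE eat_date >= ? AND eat_date < ?"
        " GROUP BY c_id, clerk_name ORDER BY c_id",
        (month_start, next_month_start),
    )
    return cur.fetchall()


report = monthly_report(conn, "2024-01-01", "2024-02-01")
for name, times, total in report:
    print(name, times, total)
```

Passing the dates as query parameters also avoids the string concatenation seen in the original query, which is open to SQL injection if the_date comes from user input.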

Imagine how much more trouble it would be to compute each person's monthly meal charges without this redundant table, and how poor the program's efficiency would be. So when is it appropriate to add redundant data? I think there are 2 principles:

1. The user's overall requirements. When what the user cares about most is data obtained by processing the normalized records of the database with some algorithm, and that algorithm can directly use the built-in functions of the back-end database system, it is reasonable to add redundant fields, or even redundant tables, to store the processed data. Bear in mind that for querying, modifying or deleting large volumes of data, the back-end database system is far more efficient than code we write ourselves.

2. Reducing development complexity. In modern software development there are many ways to achieve the same function. Programmers need not master most of the development tools and platforms, but they do need to know which approach, combined with which tool, makes a program more concise and more efficient. The essence of redundant data is trading space for time, and since hardware is currently developing much faster than software, a moderate amount of redundancy is acceptable. Still, I want to stress once more: do not rely too heavily on platform and tool features to simplify development. If you get this balance wrong, you will come to grief in later maintenance and upgrades.

When reprinting, please credit the original address: https://www.9cbs.com/read-36578.html
