Three Normal Forms of Database Design: An Analysis with Application Examples
The normal forms of database design are the rules that a database design should satisfy. A database that follows them is concise and clearly structured, and it avoids insert (INSERT), delete (DELETE), and update (UPDATE) anomalies. A database that ignores them is a mess: it makes trouble for the programmers who work with it, it is unpleasant to look at, and it may store a large amount of redundant information that nobody needs.
Relational database theory currently defines six normal forms: first normal form (1NF), second normal form (2NF), third normal form (3NF), fourth normal form (4NF), fifth normal form (5NF), and sixth normal form (6NF). The normal form with the weakest requirements is 1NF. A design that satisfies 1NF and meets further requirements is in 2NF, and the remaining normal forms build on one another in the same way. Generally speaking, a database only needs to satisfy 3NF.
Are the normal forms hard to understand? Not really. University textbooks present them as a pile of mathematical formulas, so of course we do not understand them and cannot remember them. As a result, many of us do not design databases according to the normal forms at all.
In essence, the normal forms can be explained clearly in very simple words. This article gives a plain-language explanation of the normal forms and uses the database of a simple forum as an example of how to apply them in a real project.
Normal forms explained
First normal form (1NF): every field in a database table is a single, atomic attribute that cannot be divided further. Such an attribute is made up of a basic type: integer, real number, character, logical (Boolean), date, and so on.
For example, the following database table conforms to 1NF:
Field 1 | Field 2 | Field 3 | Field 4
A database table like the following does not conform to 1NF, because Field 3 is split into Field 3.1 and Field 3.2:
Field 1 | Field 2 | Field 3               | Field 4
        |         | Field 3.1 | Field 3.2 |
Obviously, in any current relational database management system (DBMS), nobody can build a database that violates 1NF in this way, because these systems do not allow a column of a table to be split into two or more sub-columns. It is therefore impossible to design a database that fails 1NF in an existing DBMS.
Second normal form (2NF): no non-key field in the table is partially functionally dependent on any candidate key. (A partial functional dependency exists when some of the fields in a composite key are enough to determine a non-key field.) In other words, every non-key field depends on the whole of every candidate key. Suppose the course selection relation is SelectCourse(student number, name, age, course name, grade, credits). Its key is the composite key (student number, course name), because the following dependency holds:
(student number, course name) → (name, age, grade, credits)
This table does not satisfy 2NF, because the following dependencies also hold:
(course name) → (credits)
(student number) → (name, age)
That is, part of the composite key is enough to determine some non-key fields.
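The dependencies above can be checked against sample rows. The following is a minimal sketch, not a definitive tool: the function name `holds`, the column names, and the sample data are all made up for illustration. Note that sample data can only refute a dependency; it can never prove that one holds in general.

```python
def holds(rows, lhs, rhs):
    """Return True if, in these rows, equal `lhs` values always come with equal `rhs` values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first time a key is seen its value is stored; later rows must agree with it.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical SelectCourse rows (illustrative data only).
rows = [
    {"student_no": 1, "name": "Ann", "age": 20, "course": "Math", "grade": 85, "credits": 4},
    {"student_no": 1, "name": "Ann", "age": 20, "course": "Physics", "grade": 78, "credits": 3},
    {"student_no": 2, "name": "Bob", "age": 21, "course": "Math", "grade": 90, "credits": 4},
]

# Partial dependencies: one half of the composite key is enough.
partial_course = holds(rows, ["course"], ["credits"])
partial_student = holds(rows, ["student_no"], ["name", "age"])

# Full dependency: grade needs both halves of the key.
grade_needs_full_key = (
    not holds(rows, ["student_no"], ["grade"])
    and not holds(rows, ["course"], ["grade"])
    and holds(rows, ["student_no", "course"], ["grade"])
)
```

In this sample, "credits" and "name, age" each follow from half of the key, while "grade" follows only from the whole key, which is the pattern that 2NF forbids for non-key fields.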
Because it does not satisfy 2NF, this course selection table has the following problems:
(1) Data redundancy:
If the same course is taken by n students, its "credits" value is repeated n-1 times; if the same student takes m courses, the student's name and age are repeated m-1 times.
(2) Update anomaly:
If the credits of a course are adjusted, the "credits" value must be updated in every row for that course; otherwise the same course will appear with different credits.
(3) Insert anomaly:
Suppose a new course is to be offered and nobody has taken it yet. Because there is no "student number" value to complete the key, the course name and credits cannot be recorded in the database.
(4) Delete anomaly:
Suppose a group of students has finished their courses, so their course selection records should be deleted from the table. The course name and credits are deleted along with them. Obviously, this in turn causes an insert anomaly.
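The four problems listed above can be reproduced with a small in-memory SQLite database. This is a sketch under assumptions: the column names and sample rows are invented for illustration, and the columns are declared NOT NULL explicitly because SQLite otherwise tolerates NULLs in a composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SelectCourse (
        student_no  INTEGER NOT NULL,
        name        TEXT    NOT NULL,
        age         INTEGER NOT NULL,
        course_name TEXT    NOT NULL,
        grade       INTEGER,
        credits     INTEGER NOT NULL,
        PRIMARY KEY (student_no, course_name)
    )
""")
conn.executemany(
    "INSERT INTO SelectCourse VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "Ann", 20, "Math", 85, 4),
     (2, "Bob", 21, "Math", 90, 4),
     (1, "Ann", 20, "Physics", 78, 3)],
)

# (1) Redundancy: the credits of Math are stored once per enrolled student.
math_copies = conn.execute(
    "SELECT COUNT(*) FROM SelectCourse WHERE course_name = 'Math'").fetchone()[0]

# (2) Update anomaly: updating only one row leaves Math with two credit values.
conn.execute(
    "UPDATE SelectCourse SET credits = 5 WHERE course_name = 'Math' AND student_no = 1")
distinct_credits = conn.execute(
    "SELECT COUNT(DISTINCT credits) FROM SelectCourse WHERE course_name = 'Math'").fetchone()[0]

# (3) Insert anomaly: a course nobody has taken has no student number for the key.
try:
    conn.execute("INSERT INTO SelectCourse (course_name, credits) VALUES ('Chemistry', 3)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# (4) Delete anomaly: removing the last Physics record also removes its credits.
conn.execute("DELETE FROM SelectCourse WHERE course_name = 'Physics'")
physics_rows_left = conn.execute(
    "SELECT COUNT(*) FROM SelectCourse WHERE course_name = 'Physics'").fetchone()[0]
```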
Split the course selection table SelectCourse into the following three tables:
Student: Student(student number, name, age);
Course: Course(course name, credits);
Course selection: SelectCourse(student number, course name, grade).
These tables satisfy 2NF and eliminate the data redundancy and the update, insert, and delete anomalies.
In addition, every table with a single-field key satisfies 2NF, because a composite key cannot exist in it.
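A minimal SQLite sketch of the three-table design follows. The column names and sample data are assumptions made for illustration; the table names follow the text. It shows that a credit change now touches one row, that a course can exist before anyone takes it, and that deleting selection records keeps the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE Student (
        student_no INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        age        INTEGER NOT NULL
    );
    CREATE TABLE Course (
        course_name TEXT PRIMARY KEY,
        credits     INTEGER NOT NULL
    );
    CREATE TABLE SelectCourse (
        student_no  INTEGER NOT NULL REFERENCES Student(student_no),
        course_name TEXT    NOT NULL REFERENCES Course(course_name),
        grade       INTEGER,
        PRIMARY KEY (student_no, course_name)
    );
""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", [(1, "Ann", 20), (2, "Bob", 21)])
conn.executemany("INSERT INTO Course VALUES (?, ?)", [("Math", 4), ("Physics", 3)])
conn.executemany("INSERT INTO SelectCourse VALUES (?, ?, ?)",
                 [(1, "Math", 85), (2, "Math", 90), (1, "Physics", 78)])

# Update: the credits of a course live in exactly one row.
rows_updated = conn.execute(
    "UPDATE Course SET credits = 5 WHERE course_name = 'Math'").rowcount

# Insert: a new course can be recorded before any student takes it.
conn.execute("INSERT INTO Course VALUES ('Chemistry', 3)")
course_count = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

# Delete: removing finished selection records keeps the course and its credits.
conn.execute("DELETE FROM SelectCourse WHERE course_name = 'Physics'")
physics_credits = conn.execute(
    "SELECT credits FROM Course WHERE course_name = 'Physics'").fetchone()[0]
```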
Third normal form (3NF): on top of 2NF, if no non-key field in the table is transitively functionally dependent on any candidate key, the table satisfies 3NF. A transitive functional dependency means that when the chain "A → B → C" holds, C is transitively dependent on A. A table in 3NF should therefore not contain a dependency of the following shape:
key field → non-key field x → non-key field y
Suppose the student relation is Student(student number, name, age, college, college location, college phone). Its key is the single field "student number", because the following dependency holds:
(student number) → (name, age, college, college location, college phone)
This table satisfies 2NF but not 3NF, because the following chain of dependencies holds:
(student number) → (college) → (college location, college phone)
That is, the non-key fields "college location" and "college phone" are transitively dependent on the key field "student number".
This table also suffers from data redundancy and from update, insert, and delete anomalies; readers can work through the analysis themselves.
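The transitive chain can also be found mechanically by computing an attribute closure, that is, the set of all fields reachable from a starting set by following the dependencies. The sketch below is illustrative only: the function name `closure` and the field names are assumptions, and the dependency from student number is written in its minimal form (name, age, college) so that the chain through "college" is visible.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency when its whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Dependencies of the Student relation, as (left side, right side) pairs.
fds = [
    (("student_no",), ("name", "age", "college")),
    (("college",), ("college_location", "college_phone")),
]

everything = closure({"student_no"}, fds)

# Fields that "college" determines on its own, although it is not a key.
via_college = closure({"college"}, fds) - {"college"}
college_is_key = closure({"college"}, fds) == everything
```

Here `via_college` contains the college location and phone while `college_is_key` is false: a non-key field determines other non-key fields, which is exactly the situation 3NF rules out.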
Split the student relation into the following two tables:
Student: (student number, name, age, college);
College: (college, location, phone).
These tables satisfy 3NF and eliminate the data redundancy and the update, insert, and delete anomalies.
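As a minimal sketch of this split (column names and sample data are assumptions), the following SQLite example stores each college's location and phone once and recovers the original view with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE College (
        college  TEXT PRIMARY KEY,
        location TEXT NOT NULL,
        phone    TEXT NOT NULL
    );
    CREATE TABLE Student (
        student_no INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        age        INTEGER NOT NULL,
        college    TEXT    NOT NULL REFERENCES College(college)
    );
""")
conn.execute("INSERT INTO College VALUES ('Science', 'Building A', '555-0100')")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [(1, "Ann", 20, "Science"), (2, "Bob", 21, "Science")])

# Changing the college phone is a single-row update, however many students there are.
rows_updated = conn.execute(
    "UPDATE College SET phone = '555-0199' WHERE college = 'Science'").rowcount

# A join rebuilds the original wide rows; every student sees the same new phone.
phones = conn.execute("""
    SELECT DISTINCT c.phone
    FROM Student AS s JOIN College AS c ON c.college = s.college
""").fetchall()
```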
Boyce-Codd normal form (BCNF): on top of 3NF, if no field in the table, key fields included, is transitively functionally dependent on any candidate key, the table satisfies BCNF.
Suppose the warehouse management relation is StorehouseManage(warehouse ID, storage item ID, admin ID, quantity), where an administrator works in only one warehouse and a warehouse can store many kinds of items. The following dependencies hold in this table:
(Warehouse ID, Storage Item ID) → (Admin ID, Quantity)
(Admin ID, Storage Item ID) → (Warehouse ID, Quantity)
Therefore (warehouse ID, storage item ID) and (admin ID, storage item ID) are both candidate keys of StorehouseManage, and the only non-key field in the table is quantity, so the table satisfies 3NF. However, the following dependencies also hold:
(Warehouse ID) → (Admin ID)
(Admin ID) → (Warehouse ID)
That is, one key field determines another key field, so the table does not satisfy BCNF. It suffers from the following anomalies:
(1) Delete anomaly:
When a warehouse is emptied, all of its "storage item ID" and "quantity" information is deleted, and the "warehouse ID" and "admin ID" information is deleted along with it.
(2) Insert anomaly:
When a warehouse does not store any items, an administrator cannot be assigned to it.
(3) Update anomaly:
If a warehouse gets a new administrator, the admin ID must be modified in every row of the table for that warehouse.
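The dependency analysis above can be sketched in code using the common textbook formulation of BCNF: the left side of every non-trivial dependency must be a superkey, meaning its closure covers every field of the table. This is an illustration under assumptions; the function names and field names are invented, and the formulation is the standard one rather than the wording used in this article.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]

attributes = {"warehouse_id", "item_id", "admin_id", "quantity"}
fds = [
    (("warehouse_id", "item_id"), ("admin_id", "quantity")),
    (("admin_id", "item_id"), ("warehouse_id", "quantity")),
    (("warehouse_id",), ("admin_id",)),
    (("admin_id",), ("warehouse_id",)),
]

violations = bcnf_violations(attributes, fds)
```

Only the two single-field dependencies between warehouse ID and admin ID are reported, which matches the "key field determines key field" case described above.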
Decompose the warehouse management relation into two relations:
Warehouse management: StorehouseManage(warehouse ID, admin ID);
Warehouse: Storehouse(warehouse ID, storage item ID, quantity).
These tables satisfy BCNF and eliminate the delete, insert, and update anomalies.
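A minimal SQLite sketch of this decomposition follows; the column names and sample data are assumptions. The UNIQUE constraint on the administrator expresses the rule that an administrator works in only one warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE StorehouseManage (
        warehouse_id INTEGER PRIMARY KEY,
        admin_id     INTEGER NOT NULL UNIQUE
    );
    CREATE TABLE Storehouse (
        warehouse_id INTEGER NOT NULL REFERENCES StorehouseManage(warehouse_id),
        item_id      INTEGER NOT NULL,
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (warehouse_id, item_id)
    );
""")

# Insert: an administrator can be assigned before the warehouse stores anything.
conn.execute("INSERT INTO StorehouseManage VALUES (1, 100)")
conn.executemany("INSERT INTO Storehouse VALUES (?, ?, ?)", [(1, 10, 5), (1, 11, 7)])

# Update: changing the administrator touches exactly one row.
rows_updated = conn.execute(
    "UPDATE StorehouseManage SET admin_id = 101 WHERE warehouse_id = 1").rowcount

# Delete: emptying the warehouse keeps the warehouse/administrator pairing.
conn.execute("DELETE FROM Storehouse WHERE warehouse_id = 1")
admin_after_empty = conn.execute(
    "SELECT admin_id FROM StorehouseManage WHERE warehouse_id = 1").fetchone()[0]
```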
Applying the normal forms
Let us work step by step through the database of a forum, which holds the following information:
(1) User: user name, email, homepage, phone, contact address
(2) Post: post title, post content, reply title, reply content
At first, we design the database as a single table:
User Name | Email | Homepage | Phone | Contact Address | Post Title | Post Content | Reply Title | Reply Content
This table satisfies 1NF, but no candidate key determines an entire row: the only key field, user name, does not fully determine the whole tuple. We need to add "post ID" and "reply ID" fields, changing the table to:
User Name | Email | Homepage | Phone | Contact Address | Post ID | Post Title | Post Content | Reply ID | Reply Title | Reply Content
The key (user name, post ID, reply ID) of this table determines the entire row:
(user name, post ID, reply ID) → (email, homepage, phone, contact address, post title, post content, reply title, reply content)
However, this design does not satisfy 2NF, because the following dependencies hold:
(user name) → (email, homepage, phone, contact address)
(post ID) → (post title, post content)
(reply ID) → (reply title, reply content)
That is, non-key fields are partially functionally dependent on the candidate key. Clearly, this design will lead to a large amount of data redundancy and to operation anomalies.
We decompose the table into the following tables, with the key of each given in brackets:
(1) User information: user name, email, homepage, phone, contact address [key: user name]
(2) Post information: post ID, title, content [key: post ID]
(3) Reply information: reply ID, title, content [key: reply ID]
(4) Posting: user name, post ID [key: user name, post ID]
(5) Reply: post ID, reply ID [key: post ID, reply ID]
This design satisfies 1NF, 2NF, 3NF, and BCNF, but is it the best design?
Not necessarily.
Notice that "user name" and "post ID" in item 4, "Posting", are in a 1:N relationship, so we can merge "Posting" into item 2, "Post information". Likewise, "post ID" and "reply ID" in item 5, "Reply", are in a 1:N relationship, so we can merge "Reply" into item 3, "Reply information". This reduces data redundancy. The new design is:
(1) User information: user name, email, homepage, phone, contact address [key: user name]
(2) Post information: user name, post ID, title, content [key: user name, post ID]
(3) Reply information: post ID, reply ID, title, content [key: post ID, reply ID]
Table 1 obviously satisfies all of the normal forms.
In table 2, the non-key fields "title" and "content" are partially functionally dependent on the key field "post ID", so the table does not satisfy 2NF, but this design does not cause data redundancy or operation anomalies.
In table 3, the non-key fields "title" and "content" are likewise partially functionally dependent on the key field "reply ID", so the table does not satisfy 2NF either, but as with table 2, this design does not cause data redundancy or operation anomalies.
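The merged three-table forum design can be sketched in SQLite as follows. The table names (UserInfo, PostInfo, ReplyInfo), the column names, and the sample rows are hypothetical; the composite keys follow the key choice discussed in the text rather than what a production schema would necessarily use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE UserInfo (
        user_name TEXT PRIMARY KEY,
        email TEXT, homepage TEXT, phone TEXT, address TEXT
    );
    CREATE TABLE PostInfo (
        user_name TEXT    NOT NULL REFERENCES UserInfo(user_name),
        post_id   INTEGER NOT NULL,
        title TEXT, content TEXT,
        PRIMARY KEY (user_name, post_id)
    );
    -- post_id is left without a REFERENCES clause here because it is not
    -- unique on its own in PostInfo under the composite key above.
    CREATE TABLE ReplyInfo (
        post_id  INTEGER NOT NULL,
        reply_id INTEGER NOT NULL,
        title TEXT, content TEXT,
        PRIMARY KEY (post_id, reply_id)
    );
""")
conn.executemany("INSERT INTO UserInfo (user_name, email) VALUES (?, ?)",
                 [("ann", "ann@example.com"), ("bob", "bob@example.com")])
conn.executemany("INSERT INTO PostInfo VALUES (?, ?, ?, ?)",
                 [("ann", 1, "Hello", "First post"),
                  ("ann", 2, "Question", "Second post"),
                  ("bob", 3, "Intro", "Third post")])
conn.executemany("INSERT INTO ReplyInfo VALUES (?, ?, ?, ?)",
                 [(1, 1, "Re: Hello", "Welcome"),
                  (1, 2, "Re: Hello", "Hi"),
                  (3, 3, "Re: Intro", "Nice")])

# Each post's title and content are stored once, even though the key is composite.
max_copies_per_post = conn.execute(
    "SELECT MAX(n) FROM (SELECT COUNT(*) AS n FROM PostInfo GROUP BY post_id)").fetchone()[0]

# Replies to one user's posts are found by joining on post_id.
replies_to_ann = conn.execute("""
    SELECT COUNT(*)
    FROM PostInfo AS p JOIN ReplyInfo AS r ON r.post_id = p.post_id
    WHERE p.user_name = 'ann'
""").fetchone()[0]
```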
So it is not always necessary to force a design to satisfy the normal forms. For a 1:N relationship, when the "1" side is merged into the "N" side, the "N" side no longer satisfies 2NF, but the resulting design is actually better.
For an M:N relationship, neither the M side nor the N side can be merged into the other; doing so would violate the normal forms and cause operation anomalies and data redundancy. For a 1:1 relationship, we can merge either side into the other; the resulting design violates the normal forms but does not cause operation anomalies or data redundancy.
Conclusion
A database design that satisfies the normal forms is clear and avoids data redundancy and operation anomalies. This does not mean that a design which violates a normal form must be wrong. In the special case where a table contains a 1:1 or 1:N relationship, a merge that breaks a normal form is in fact reasonable.
When we design a database, we should always keep the requirements of the normal forms in mind.