Database design methods, conventions, and techniques (recommended)
I. Database design process

Database technology is the most effective means of managing information resources. Database design means constructing, for a given application environment, the best database schema, building the database and its application system on it, and storing data effectively so that users' information requirements and processing requirements are met.

The design stages map onto the schema levels as follows. In the requirements analysis stage, the application requirements of all users (the requirements of the real world) are consolidated. In the conceptual design stage, they are turned into a conceptual model (an information-world model) that is independent of machine characteristics and of any particular DBMS product, described with an E-R diagram. In the logical design stage, the E-R diagram is converted into a data model supported by a specific database product, such as the relational model, which yields the logical schema of the database. Then, according to the users' processing requirements, the necessary views are built on top of the base tables to form the external schema. In the physical design stage, the physical storage arrangement and the indexes are designed, which yields the internal schema of the database.

1. Requirements analysis stage

Requirements are collected and analyzed. The results are the data requirements, described by a data dictionary, and the processing requirements, described by data flow diagrams.

The focus of requirements analysis is to investigate, collect, and analyze the users' information requirements, processing requirements, and security and integrity requirements for data management.

Method of requirements analysis: investigate the organization as a whole, investigate the business activities of each department, help users clarify their requirements for the new system, and determine the boundaries of the new system.

Common investigation methods: following the work on site, holding fact-finding meetings, inviting specialists to give briefings, interviewing, designing questionnaires for users to fill in, and consulting existing records.
The methods for analyzing and expressing user requirements fall into two groups: top-down and bottom-up. The top-down Structured Analysis (SA) method starts from the top-level system organization, analyzes the system by decomposing it layer by layer, and describes each layer with data flow diagrams and a data dictionary.

A data flow diagram expresses the relationship between data and processing. The data in the system is described with a data dictionary (DD). The data dictionary is a collection of descriptions of the various kinds of data; it is data about the data in the database, that is, metadata, not the data itself. A data dictionary usually has five parts: data items, data structures, data flows, data stores, and processes. At a minimum it should record the data type of every field and the primary keys of every table.

Data item description = {data item name, meaning, alias, data type, length, value range, meaning of values, logical relationship to other data items}
Data structure description = {data structure name, meaning, composition: {data items or data structures}}
Data flow description = {data flow name, description, source, destination, composition: {data structures}, average flow, peak flow}
Data store description = {data store name, description, number, incoming data flows, outgoing data flows, composition: {data structures}, data volume, access method}
Process description = {process name, description, input: {data flows}, output: {data flows}, processing: {brief description}}

2. Conceptual structure design stage

By consolidating the user requirements, a conceptual model is formed that is independent of any specific DBMS and can be represented by an E-R diagram.
The conceptual model is used for modeling the information world. It does not depend on the data model supported by any particular DBMS, and it can be converted into a specific data model supported by a DBMS on a given computer.

Features of the conceptual model:
(1) It has strong semantic expressiveness and can express the semantic knowledge of the application conveniently and directly.
(2) It should be simple, clear, and easy to understand; it is the language in which users and database designers communicate.

A common method for conceptual model design is IDEF1X, which applies semantic data modeling techniques to build the information model of a system.

The steps for creating an E-R model with the IDEF1X method are as follows.

2.1 Step 0 - Initialize the project
The task of this stage is to start from a statement of purpose and a statement of scope, determine the modeling goals, develop a modeling plan, organize the modeling team, collect source material, and establish constraints and conventions. Collecting source material is the focus of this stage. Through investigation and observation, the business processes, the inputs and outputs of the existing system, the various reports, and the raw data collected are organized into a basic data table.

2.2 Step 1 - Define entities
The members of an entity set share common characteristics and attributes. Most entities can be identified, directly or indirectly, from the collected source material (the basic data table). Potential entities can be spotted initially from the terms in the source-material name list, in particular terms that end in "code", such as customer code, agent code, and product code. These potential entities are collected into a preliminary entity list.

2.3 Step 2 - Define relationships
The IDEF1X model works with binary relationships: an n-ary relationship must be defined as n binary relationships.
Based on the actual business requirements and rules, use an entity-relationship matrix to identify the binary relationships between entities. Then, for each relationship, determine its cardinality, name, and description, and determine its type: identifying relationship, non-identifying relationship (mandatory or optional), non-specific relationship, or categorization relationship. If every instance of the child entity must be identified through its relationship with the parent entity, the relationship is identifying; otherwise it is non-identifying. In a non-identifying relationship, if every instance of the child entity is associated with a parent entity instance, the relationship is mandatory; otherwise it is optional. If the parent entity and the child entity represent the same real-world object, the relationship is a categorization relationship.

2.4 Step 3 - Define keys
Introduce associative (intersection) entities to eliminate the non-specific relationships produced in the previous stage. Then identify candidate key attributes, starting from the non-associative and independent entities, so that each instance of every entity can be uniquely identified, and choose the primary key from among the candidate keys. To confirm that the primary keys and relationships are valid, apply the no-null rule and the no-repeat rule: an attribute of an entity instance may not be null, and it may not hold more than one value at the same time. Find relationships that were determined incorrectly, decompose the entities further, and finally construct the key-based (KB) view of the IDEF1X model.

2.5 Step 4 - Define attributes
Extract a list of descriptive names from the source data table to develop an attribute list, and determine the owner of each attribute. Define the non-key attributes, and check them against the no-null and no-repeat rules.
In addition, check the full functional dependency rule and the no-transitive-dependency rule, so that every non-key attribute depends on the key, the whole key, and nothing but the key. The result is an improved IDEF1X model that is at least in third normal form: the fully attributed view.

2.6 Step 5 - Define other objects and rules
Define the data type, length, precision, nullability, default value, and constraint rules of each attribute. Define other object information such as triggers, stored procedures, views, roles, synonyms, and sequences.

3. Logical structure design stage

Convert the conceptual structure into a data model supported by a DBMS (for example, the relational model) and optimize it. Logical design should first choose the data model best suited to describing the conceptual structure and then choose the most suitable DBMS.

Converting an E-R diagram into a relational model means converting the entities, the attributes of entities, and the relationships between entities into relation schemas. The conversion generally follows these principles:
1) An entity type is converted into one relation schema.
The attributes of the entity become the attributes of the relation, and the key of the entity becomes the key of the relation.
2) An m:n relationship is converted into one relation schema. The keys of the entities connected by the relationship, together with the attributes of the relationship itself, become the attributes of the relation. The key of the relation is the combination of the entity keys.
3) A 1:n relationship can be converted into a separate relation schema or merged into the relation schema of the n-side entity. If it is converted into a separate relation schema, the keys of the entities connected by the relationship and the attributes of the relationship itself become the attributes of the relation, and the key of the relation is the key of the n-side entity.
4) A 1:1 relationship can be converted into a separate relation schema or merged into the relation schema of either side.
5) A multi-way relationship among three or more entities is converted into one relation schema. The keys of the entities connected by the relationship and the attributes of the relationship itself become the attributes of the relation. The key of the relation is the combination of the entity keys.
6) A relationship among entities of the same entity set (a self-relationship) can be handled according to the 1:1, 1:n, and m:n cases above.
7) Relation schemas with the same key can be merged.

To further improve the performance of the database application system, the structure of the data model should usually be adjusted as appropriate, guided by normalization theory. This is the optimization of the data model: determine the data dependencies; eliminate redundant relationships; determine which normal form each relation schema belongs to; determine whether schemas should be merged or decomposed. In general, relations are decomposed to third normal form (3NF), that is:
• Each value in the table is represented only once.
• Every row in the table is unique (has a unique key).
• Non-key information that depends on other keys should not be stored in the table.

4. Physical database design stage

Choose, for the logical data model, the physical structure (storage structure and access methods) best suited to the application environment. Based on the characteristics of the DBMS and the processing needs, arrange the physical storage and design the indexes, forming the internal schema of the database.

5. Database implementation stage

Using the data language provided by the DBMS (for example, SQL) and its host language (for example, C), build the database according to the results of logical and physical design, write and debug the application programs, load the data, and run the system on a trial basis. Database implementation mainly consists of the following work: defining the database structure with DDL, organizing and loading the data, writing and debugging the application programs, and trial operation of the database.

6. Database operation and maintenance stage

After trial operation, the database application system can go into production. During operation it must be continually evaluated, adjusted, and modified. This includes: database dump and recovery; control of database security and integrity; monitoring, analysis, and improvement of database performance; and database reorganization and restructuring.

Use of modeling tools

To speed up database design, many database design tools (CASE tools) are available, such as Rational's Rational Rose, CA's ERwin and BPwin, Sybase's PowerDesigner, and Oracle's Oracle Designer.

ERwin is mainly used to build the conceptual and physical models of a database. It describes entities, relationships, and entity attributes graphically. ERwin supports the IDEF1X method. Using the ERwin modeling tool to generate, change, and analyze IDEF1X models automatically not only yields a good model of business functions and data requirements, but also carries the IDEF1X model through to the physical design of the database.
A model drawn with ERwin has both a logical model and a physical model. In the logical model, the IDEF1X toolbox makes it easy to draw entities, relationships, and entity attributes graphically. In the physical model, ERwin defines the corresponding tables and columns and can convert them automatically to the appropriate types for various database management systems.

Designers can choose whichever database design and modeling tools suit their needs. For example, once requirements analysis is complete, a designer can draw the E-R diagram in ERwin, convert it into a relational data model, and generate the database structure; or draw data flow diagrams and generate the application.
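
The E-R to relational conversion rules from the logical design stage can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for illustration; the department, student, course, and enrollment tables and their columns are hypothetical and do not come from the text. Rule 1) maps each entity to a table, rule 3) merges a 1:n relationship into the n-side table as a foreign key, and rule 2) maps an m:n relationship to its own table whose key is the combination of the entity keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default

conn.executescript("""
-- Rule 1: each entity becomes a table; the entity key becomes the primary key.
CREATE TABLE department (dept_id   INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course     (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Rule 3: a 1:n relationship (one department, many students) is merged
-- into the n-side table as a foreign key column.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    dept_id    INTEGER NOT NULL REFERENCES department(dept_id)
);

-- Rule 2: an m:n relationship (students enroll in courses) becomes its own
-- table. Its key is the combination of both entity keys, and the
-- relationship's own attribute (grade) becomes a column.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    grade      TEXT,
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Physics')")
conn.execute("INSERT INTO course VALUES (10, 'Mechanics')")
conn.execute("INSERT INTO student VALUES (100, 'Ann', 1)")
conn.execute("INSERT INTO enrollment VALUES (100, 10, 'A')")
conn.commit()


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (student_id, course_id) pair twice: rejected by the combined key.
duplicate_ok = try_insert("INSERT INTO enrollment VALUES (100, 10, 'B')")
# Student 999 does not exist: rejected by the foreign key.
orphan_ok = try_insert("INSERT INTO enrollment VALUES (999, 10, 'C')")
```

Both `duplicate_ok` and `orphan_ok` end up `False`, which shows the combined key and the foreign keys doing the work that the conversion rules assign to them.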
II. Database design techniques

1. Before designing the database (requirements analysis stage)

1) Understand the customer's needs, and ask users how they expect requirements to change in the future. Have customers explain their requirements, and as development proceeds, keep checking with them to make sure those requirements are still what the development is aiming at.

2) Understand the business. Doing so saves a great deal of time in later development stages.

3) Pay attention to inputs and outputs. When defining the requirements for database tables and fields (inputs), first examine the existing or planned reports, queries, and views (outputs) to decide which tables and fields are needed to support those outputs. Example: if the customer needs a report that is sorted, grouped, and totaled by postal code, make sure there is a separate postal code field rather than folding the postal code into the address field.

4) Create a data dictionary and E-R diagrams. E-R diagrams and a data dictionary let anyone who understands databases see how to get data out of the database. The E-R diagram is useful for showing the relationships between tables, while the data dictionary documents the purpose of each field and any aliases it may have. This is essential for documenting SQL expressions.

5) Define standard object naming conventions. The naming of all database objects must follow a convention.

2. Table and field design (logical database design)

Table design principles

1) Standardization and normalization. Normalizing data helps eliminate data redundancy in the database. There are several normal forms, but third normal form (3NF) is generally considered the best balance of performance, extensibility, and data integrity. Put simply, the table design principle of a 3NF-compliant database is "One Fact in One Place": a table contains only its own basic attributes, and attributes that do not belong to it must be split out. Relationships between tables are connected through foreign keys.
Such a database has the following characteristic: it consists of a set of tables that store related data and are linked to one another by keys. Example: a 3NF database that stores customers and their orders might have two tables, Customer and Order. The Order table does not contain any information about the customer who placed the order; instead it stores a key value that points to the row in the Customer table holding that customer's information. In practice, for the sake of efficiency, it is sometimes necessary to leave a table unnormalized.

2) Be data-driven. With a data-driven design rather than hard coding, many policy changes and maintenance tasks become much easier, which greatly increases the flexibility and extensibility of the system. For example, if the user interface has to access external data sources (files, XML documents, other databases, and so on), store the corresponding connection and path information in a user interface support table. Likewise, if the user interface carries out workflow-style tasks (sending mail, printing letters, changing record status, and so on), the data that drives the workflow can also be kept in the database. Role and permission management can be data-driven as well. In fact, if a process is data-driven, considerable responsibility can be handed to the users, who then maintain their own workflows.

3) Allow for change. When designing the database, consider which data fields may change in the future. Surnames are an example (this refers to Western surnames, as when a woman takes her husband's surname on marriage). So when building a system that stores customer information, store the surname in a separate table with start-date and end-date fields, so that changes to this data item can be tracked.
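
The surname example in item 3) can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the table names customer and customer_surname, their columns, and the helper functions are hypothetical. Each surname is stored as a row with a start date and an end date, so a change closes the current row and opens a new one instead of overwriting the old value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT NOT NULL
);
-- Surname history: one row per period; end_date is NULL for the current name.
CREATE TABLE customer_surname (
    surname_id  INTEGER PRIMARY KEY,          -- system-generated key
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    surname     TEXT NOT NULL,
    start_date  TEXT NOT NULL,                -- ISO dates, e.g. '2020-01-31'
    end_date    TEXT
);
""")


def change_surname(customer_id, new_surname, on_date):
    """Close the current surname row (if any) and open a new one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE customer_surname SET end_date = ? "
            "WHERE customer_id = ? AND end_date IS NULL",
            (on_date, customer_id),
        )
        conn.execute(
            "INSERT INTO customer_surname (customer_id, surname, start_date) "
            "VALUES (?, ?, ?)",
            (customer_id, new_surname, on_date),
        )


def surname_on(customer_id, on_date):
    """Return the surname that was valid on the given date, or None."""
    row = conn.execute(
        "SELECT surname FROM customer_surname "
        "WHERE customer_id = ? AND start_date <= ? "
        "AND (end_date IS NULL OR end_date > ?)",
        (customer_id, on_date, on_date),
    ).fetchone()
    return row[0] if row else None


with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Jane')")
change_surname(1, "Smith", "2015-03-01")
change_surname(1, "Jones", "2021-06-15")

before = surname_on(1, "2018-01-01")  # 'Smith'
after = surname_on(1, "2022-01-01")   # 'Jones'
```

Storing dates as ISO-formatted text keeps the sketch portable, since such strings compare in chronological order; a real schema would use whatever date type the chosen DBMS provides.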
Field design principles

4) Three useful fields to add to every table:
• dRecordCreationDate: defaults to Now() under VB and to GETDATE() under SQL Server.
• RecordCreator: under SQL Server, defined as NOT NULL DEFAULT USER.
• nRecordVersion: the version tag of the record; it helps explain precisely why a record contains null or missing data.

5) Use multiple fields for addresses and phone numbers. A single line is not enough to describe a street address. Address_Line1, Address_Line2, and Address_Line3 provide more flexibility. Phone numbers and email addresses are best kept in their own tables, each with its own type and category tag.

6) Use role entities to define columns that belong to a category. When you need to define things that belong to a particular category or play a particular role, use a role entity to create a specific time-related relationship; this makes the design self-documenting. Example: use a Person entity and a Person_Type entity to describe people. When John Smith, Engineer is promoted to John Smith, Director and eventually climbs to John Smith, CIO, all you have to do is change the key value linking the Person and Person_Type tables and add a date/time field that records when the change took place. Your Person_Type table then contains all possible types of Person, such as Associate, Engineer, Director, CIO, or CEO. The alternative is to change the Person record itself to reflect the new title, but then there is no way to track when the person held each position.

7) Choose numeric and text types with room to spare. Be especially careful with the smallint and tinyint types in SQL. For example, if you want a monthly sales total and the total field is a smallint, the calculation fails as soon as the total exceeds $32,767. Text fields that hold IDs, such as customer IDs or order numbers, should be made larger than you expect to need. Suppose the customer ID is 10 digits long.
Then you should set the length of the table field to 12 or 13 characters. This extra space lets the database grow in the future without having to restructure the whole database.

8) Add a deletion flag field. Include a "deletion flag" field in the table so that rows can be marked as deleted. Do not delete individual rows directly in a relational database; it is better to use a data purge routine and to maintain index integrity carefully.

3. Choosing keys and indexes (logical database design)

Key selection principles:

1) Four principles of key design:
• Create foreign keys for associated fields.
• All keys must be unique.
• Avoid composite keys.
• A foreign key always references a unique key field.

2) Use system-generated primary keys. If you use system-generated keys as primary keys when designing the database, you effectively control the index integrity of the database. The database, rather than some manual mechanism, then controls access to every stored row. Using system-generated keys as primary keys has another advantage: with a consistent key structure, logical flaws are easy to find.

3) Do not use user-visible keys (do not make the primary key updatable). When deciding which field to use as the key of a table, be careful about fields that users will edit. In general, do not choose a user-editable field as a key.

4) Alternate keys can sometimes serve as primary keys. Using an alternate key as a further primary key gives you the ability to build powerful indexes.

Index usage principles:

Indexes are one of the most efficient ways to retrieve data from a database. 95% of database performance problems can be solved with indexing techniques.

1) Use a unique clustered index on the logical primary key, a unique nonclustered index on the system key (as used in stored procedures), and a nonclustered index on any foreign key column.
Consider how much space the database has, how the tables are accessed, and whether that access is mainly reading or writing.

2) Most databases automatically index the primary key fields they create, but do not forget to index foreign keys. Foreign keys are also used frequently, for example when running a query that shows a record from the master table together with all of its related records.

3) Do not index memo/note fields, and do not index large fields (those with many characters); doing so makes the index take up too much storage space.

4) Do not index frequently used small tables. Do not set any keys on small data tables, especially if they are subject to frequent inserts and deletes. Maintaining the index for those inserts and deletes may cost more time than scanning the table.

4. Data integrity design (logical database design)

1) Integrity implementation mechanisms:
Entity integrity: primary key.
Referential integrity:
  Deleting data from the parent table: cascade delete; restricted delete; set null.
  Inserting data into the parent table: restricted insert; recursive insert.
  Updating data in the parent table: cascade update; restricted update; set null.
  A DBMS can implement referential integrity in two ways: the foreign key mechanism (constraint rules) and the trigger mechanism.
User-defined integrity: NOT NULL; CHECK; triggers.

2) Enforce data integrity with constraints rather than business rules. Use the database system to enforce data integrity. This covers not only the integrity achieved through normalization but also the functional integrity of the data. You can also add triggers to ensure the data is correct as it is written. Do not rely on the business layer to guarantee data integrity; it cannot guarantee integrity between tables (foreign keys), so it cannot be relied on to enforce the other integrity rules either.

3) Enforce referential integrity. Eliminate harmful data before it enters the database. Activate the integrity features of the database system.
This keeps the data clean and forces developers to spend more time handling error conditions.

4) Use lookups to control data integrity. The best way to control data integrity is to limit the user's choices. Wherever possible, give users a clear list of values to choose from. This reduces typing mistakes and misinterpretation of codes and provides data consistency. Some common data is especially suited to lookups: country codes, status codes, and the like.

5) Use views. To provide another layer of abstraction between the database and the application code, you can create dedicated views for the application so that it does not have to access the data tables directly. Doing so also gives you more freedom when handling database changes.

5. Other design techniques

1) Avoid triggers. What a trigger does can usually be achieved in other ways, and triggers can get in the way when debugging a program. If you really do need triggers, concentrate on documenting them.

2) Use plain English (or any other language) rather than codes. When creating drop-down menus and lists, it is best to sort by the English name. If codes are needed, attach to each code the English text that users know.

3) Store commonly used information. It is very useful to have a table dedicated to general database information. Store in it the current version of the database, the date of the last check/repair (for Access), the name of the design document, and similar information. This gives you a simple mechanism for tracking the database, which is especially useful in non-client/server environments when a customer complains that the database does not do what they wanted and contacts you.

4) Include a versioning mechanism. Introduce a version control mechanism in the database to determine which version of the database is in use. Over time, users' needs always change, and eventually the database structure may have to be modified.
It is more convenient to store the version information directly in the database.

5) Write documentation. Document all shortcuts, naming conventions, restrictions, and functions. Use database tools that can annotate tables, columns, and triggers; this is very useful for development, support, and tracking changes. Document the database, either inside the database itself or in a separate document. That way, when you come back more than a year later to build version 2, the chance of making mistakes is greatly reduced.

6) Test, test, and test again. After building or revising the database, the data fields must be tested with data entered by users.
Most importantly, have users do the testing and make sure the chosen data types meet the business requirements. Testing must be completed before the new database goes into production.

7) Check the design. A common technique for checking the database design during development is to check it through the application prototype it supports. In other words, for every piece of data the prototype finally presents, check the data model and see how that data is retrieved.

III. Database naming conventions

1. Naming entities (tables)

1) Name tables with nouns or noun phrases, decide whether table names are plural or singular, and define a simple rule for table aliases. For example: if the table name is one word, the alias is the first 4 letters of that word; if the table name is two words, the first two letters of each word form a 4-letter alias; if the table name consists of 3 words, take one letter from each of the first two words and two letters from the last word, which again gives a 4-letter alias; and so on. For work tables, the table name can carry the prefix WORK_ followed by the name of the application that uses the table. During naming, abbreviate according to meaning. Note that because Oracle normalizes field names to all uppercase or all lowercase, words must be separated with underscores. Example: with the defined abbreviations Sales: Sal, Order: Ord, and Detail: Dtl, the sales order detail table is named Sal_Ord_Dtl.

2) If the name of a table or field consists of a single word, do not abbreviate it; use the complete word. Example: with the defined abbreviation Material: Ma, the material table is named Material, not Ma. The field, however, is named Ma_ID, not Material_ID.

3) All tables that store value lists take the prefix Z. The purpose is to have these value-list tables sort together at the end of the list of tables in the database.
4) All redundant tables (mainly summary tables) take the prefix X. Redundant tables are the fields or tables added to a non-normalized database in order to improve database efficiency.

5) An association table is named by joining the two base tables with an underscore and adding the prefix R, with the table names or their abbreviations listed in alphabetical order. Association tables store many-to-many relationships. If the association table name is longer than 10 letters, the original table names must be abbreviated. Unless there is some other reason, abbreviations are recommended in any case. Example: the table Object has a many-to-many relationship with itself, so the table that stores the relationship is named r_object; the tables depart and employee have a many-to-many relationship, so the association table is named r_dept_emp.

2. Naming attributes (columns)

1) Use meaningful column names. The columns of a table should follow a set of design rules for keys. Every table has an automatically generated ID as its primary key, and the logical primary key is defined as the first candidate key. If the key is a number generated automatically by the database, it is uniformly named ID; if it is a custom logical code, it is named using the abbreviation method. If the key is numeric, _no can be used as the suffix; if it is a character type, the _code suffix can be used. Column names should use standard prefixes and suffixes. Example: the sales order ID field is named Sal_ord_id; if there is also a number generated automatically by the database, it is named ID.

2) Add a type suffix to every attribute. If other suffixes are also needed, they go before the type suffix. Note: for fields whose data type is text, the type suffix TX may be omitted. For fields whose type is fairly obvious, the type suffix may also be omitted.

3) Use prefixes. Giving the column names of each table a uniform prefix greatly simplifies writing SQL expressions.
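
The table-alias rule and the association-table rule in this naming convention are mechanical enough to sketch in code. The following Python functions are illustrative only: the function names, the underscore-separated input format, and the abbreviation dictionary are assumptions, and table names of more than three words are left unhandled because the text only says the pattern continues.

```python
def table_alias(table_name):
    """Build a 4-letter alias from an underscore-separated table name.

    1 word:  first 4 letters.
    2 words: first 2 letters of each word.
    3 words: 1 letter from each of the first two words, 2 from the last.
    """
    words = [w for w in table_name.split("_") if w]
    if len(words) == 1:
        alias = words[0][:4]
    elif len(words) == 2:
        alias = words[0][:2] + words[1][:2]
    elif len(words) == 3:
        alias = words[0][:1] + words[1][:1] + words[2][:2]
    else:
        raise ValueError("alias rule is only spelled out for 1 to 3 words")
    return alias.lower()


def association_table_name(table_a, table_b, abbreviations):
    """Name an association table: prefix r, then the abbreviations of the
    two tables in alphabetical order, joined by underscores."""
    # A set collapses a self-relationship (same table twice) to one part.
    parts = sorted({abbreviations.get(t, t).lower() for t in (table_a, table_b)})
    return "_".join(["r"] + parts)


# Hypothetical abbreviation dictionary, modeled on the examples in the text.
ABBREVIATIONS = {"depart": "dept", "employee": "emp"}

alias_one = table_alias("customer")              # 'cust'
alias_two = table_alias("sales_order")           # 'saor'
alias_three = table_alias("sales_order_detail")  # 'sode'
dept_emp = association_table_name("employee", "depart", ABBREVIATIONS)  # 'r_dept_emp'
self_assoc = association_table_name("object", "object", ABBREVIATIONS)  # 'r_object'
```

Keeping the abbreviations in one dictionary mirrors the idea of a defined abbreviation list, so every table and column name is derived from the same source.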