I recently visited a bookstore looking for a good book on data modeling and found "Data Modeling: Tools and Techniques for Analysis and Design" by Steve Hoberman. After reading it, I feel the book truly lives up to the praise of its translator and the foreign reviewers: "This book is packed with tools and techniques for improving data models and designs, and it is also great fun: an unbeatable combination! Anyone doing data modeling should own Steve Hoberman's book on data modeling tools and techniques."
Although I was fairly confident in the data modeling knowledge I had already mastered, reading the book still proved very rewarding. In the spirit of sharing a good book, I have listed below a summary of the tools and techniques in each chapter, to give readers who do not yet have a copy a quick picture of the book's approach to data modeling. The tools and templates described in the book can be downloaded from the author's web site at www.wiley.com/compbooks/hoberman
Chapter 1: Using Anecdotes, Analogies, and Presentations to Clarify Data Modeling Concepts
In everyday conversation we tell and hear many stories, on a wide range of topics: things that happened to us over the weekend, or experiences from our work projects. These anecdotes strengthen our relationships with other people, entertain us, and even educate us. Sometimes, when a story ends, it leaves us with new information or a better understanding of something we had not grasped before. Anecdotes are extremely effective when explaining data modeling concepts, for the following reasons: they create a lasting image; they are engaging and entertaining; they strengthen relationships between people; they relieve pressure.
Successfully crafting and telling a data modeling anecdote takes three simple steps: 1) Define a topic. Make sure the anecdote you tell has a specific goal or theme, namely to explain one data modeling concept or term. 2) Select your story. There are many kinds of stories to choose from; pick a short one that is interesting and instructive, and whose key point is easy to grasp. 3) Practice your story. Once you have found the right story, rehearse it until you are confident you can convey your point fully within two minutes, and avoid a long, rambling telling.
A data modeling analogy compares two or more concepts to emphasize the similarities or differences between them. Analogies are a good technique for introducing unfamiliar or novel things, especially when introducing computer concepts to people outside the computing profession. Hoberman's most common data modeling analogies include (he gets a lot of mileage out of these ^_^): the subject area model is a high-level view; the data model is a blueprint; the enterprise model is a world map; standards are city planning; the metadata repository is a library; the data warehouse is the "heart".
Chapter 2: Metadata Bingo. Put simply, this chapter uses a Bingo-style game to motivate project team members to pin down the data model and validate the metadata. Metadata Bingo emphasizes "win-win": with luck, everyone wins by the end of the game.
Chapter 3: Ensuring High-Quality Definitions. This chapter discusses a tool called the "Definition Checklist", which captures the criteria for the highest level of definition quality.
Chapter 4: The Data Modeling Project Plan. This chapter focuses on four tools for identifying data modeling phases, tasks, tools, and timelines:
· Data Modeling Phase Tool: identifies the highest-level data modeling steps.
· Phase-Task-Tool: takes each phase from the Data Modeling Phase Tool and breaks it down into data modeling tasks.
· Priority Triangle: of high quality, shortest time, and lowest cost, you can pick at most two extremes; you can never have all three.
· Reliable estimating tools: the Subject Area Effort Tool determines, by application type, what percentage of the whole project each subject area accounts for; the Task Effort Tool takes each task identified in the Phase-Task-Tool and lists the percentage of the overall data modeling deliverable it represents. Combining these two tools lets you give the project manager a reasonably accurate estimate.
Chapter 5: Subject Area Analysis. This chapter explores five key tools that play a pivotal role in the subject area analysis phase of data modeling. They should be completed in the order below:
1) Subject Area Checklist: a complete list of the subject areas in the new application, along with the definition and synonyms (or aliases) of each.
2) Subject Area CRUD (Create, Read, Update, Delete) Matrix: captures the gaps and overlaps between the new application and existing applications at the subject area level, and determines the scope of the application.
3) In-the-Know Template: identifies the people and documentation required to complete the subject area deliverables for this new application.
4) Subject Area Family Tree: contains the source applications and several other key pieces of information for each subject area, clarifying where each subject area's data will come from.
5) Subject Area Grain Matrix: uses a spreadsheet format to record the reporting levels (grain) at which each measurement and fact subject area is needed.
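As a hypothetical illustration of the Subject Area CRUD Matrix described above, the matrix can be kept as a simple table of subject areas versus applications, with each cell holding the Create/Read/Update/Delete letters that apply. All application and subject area names below are invented:

```python
# Hypothetical Subject Area CRUD Matrix: rows are subject areas, columns
# are applications, and cells hold the CRUD letters that apply. Gaps and
# overlaps between applications then fall out of simple queries.
crud = {
    "Customer": {"OrderEntryApp": "CRU", "NewBillingApp": "R"},
    "Invoice":  {"NewBillingApp": "CRUD"},
    "Product":  {"OrderEntryApp": "R"},
}

def creators(subject):
    """Which applications create data for this subject area?"""
    return [app for app, ops in crud[subject].items() if "C" in ops]

def overlap(subject):
    """Is this subject area touched by more than one application?"""
    return len(crud[subject]) > 1

assert creators("Customer") == ["OrderEntryApp"]
assert overlap("Customer") and not overlap("Invoice")
```

Reading the matrix this way makes the scope questions concrete: no application creates "Product" data, so its source must lie outside the applications surveyed.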
Chapter 6: Subject Area Modeling. This chapter describes three powerful tools for modeling subject area information: the "Business Clean Slate" model, the "Application Clean Slate" model, and the "Early Reality Check" model.
Chapter 7: Logical Data Analysis. This chapter focuses on four logical data analysis tools, which should be used in the order below: 1) Data Element Family Tree: contains the complete list of the application's data elements, the source and transformation information for each data element, and several other critical pieces of data element metadata. 2) Data Element Grain Matrix: uses a spreadsheet to record the reporting levels (grain) of each measurement and fact. 3) Data Quality Capture Template: records the metadata of each data element alongside a sample of actual data. 4) Data Quality Validation Template: records, for each data element, the result of comparing its metadata against that actual data.
Chapter 8: The Normalization Hike and Denormalization Survival Guide (strongly recommended: the best technical writing on relational database normalization I have seen). Normalization is the process of eliminating redundancy and applying rules in order to better understand and express the dependencies and participation among data elements. Normalization has six levels, the highest being Fifth Normal Form (5NF). Most technical writing treats 3NF as the goal, but Steve Hoberman points to a higher one: 5NF. Graeme Simsion, in his book "Data Modeling Essentials", wrote: "The higher normal forms are often misunderstood and therefore ignored, or cited to support unsound model structures." Nevertheless, we need to understand these higher levels of normalization, because they expose additional normalization opportunities and help us further reduce redundant information and improve design flexibility. Although the last three normalization levels may apply in only a minority of cases, they still offer opportunities to improve flexibility and efficiency. Here are definitions of BCNF, 4NF, and 5NF (much easier to understand than the mathematical formulas in domestic textbooks):
BCNF = 3NF plus the following rule: every data element depends fully on the key, the whole key, and nothing but the key.
4NF = 3NF plus the following rule: when a primary key contains three or more foreign key data elements and there are no constraints among those keys, split the entity into two or more entities.
5NF = 4NF plus the following rule: when a primary key contains three or more foreign key data elements and there are constraints among them, decompose the entity into the many-to-many relationships required by all the constraints.
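The 4NF rule above can be made concrete with a small sketch. Assume (hypothetically) that an employee's skills and spoken languages are independent multi-valued facts: storing them in one entity keyed by all three columns forces a row for every combination, while the 4NF split stores each fact once.

```python
# Illustration of the 4NF rule: two independent multi-valued facts about
# one employee. All data here is invented for the example.
from itertools import product

skills = {"emp1": ["SQL", "Python", "UML"]}
languages = {"emp1": ["English", "French"]}

# Unnormalized: one row per (employee, skill, language) combination.
combined = [(e, s, l) for e in skills
            for s, l in product(skills[e], languages[e])]
assert len(combined) == 6  # 3 skills x 2 languages: redundant rows

# 4NF decomposition: one entity per independent fact.
emp_skill = [(e, s) for e in skills for s in skills[e]]
emp_language = [(e, l) for e in languages for l in languages[e]]
assert len(emp_skill) + len(emp_language) == 5  # no combinatorial blow-up
```

The gap widens quickly: with 10 skills and 5 languages, the combined form needs 50 rows where the decomposed form needs 15.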
Once we have climbed to the peak of 5NF, "denormalization" adds data redundancy back according to actual needs, simplifying development and improving query speed. Denormalization is the process by which, after defining a sound, fully normalized data structure, you selectively reintroduce duplicate data to meet special performance requirements. Steve Hoberman's "Denormalization Survival Guide" gives a quantitative scoring standard for adding redundancy appropriately: for each relationship you answer six questions, score each answer, and when the total is greater than or equal to 10, you denormalize that relationship.
The six questions of the "Denormalization Survival Guide":
1. What type of relationship is it? This question determines the kind of relationship being analyzed: what is the parent entity to the child entity? Hierarchy (20 points), peer (-10 points), definition (-20 points).
2. What is the participation ratio? This question determines how many instances of each entity participate in the relationship; in other words, for a given parent value, how many child values are there? The closer the parent-child relationship, the better the chance of denormalizing it. Up to a one-to-five ratio (20 points), up to a one-to-one-hundred ratio (-10 points), over one-to-one-hundred (-20 points).
3. How many data elements are in the parent entity? Fewer than 10 data elements (20 points), between 10 and 20 data elements (-10 points), more than 20 data elements (-20 points).
4. What is the usage rate? When users need information from the child entity, do they usually also need information from the parent? In other words, how tightly coupled or correlated are the two entities? Strongly associated (30 points), weakly or not associated (-30 points).
5. Is the parent entity a placeholder? Do we intend to add more data elements or relationships to the parent entity in the near future? If the answer is "no", denormalization is more feasible. No (20 points), yes (-20 points).
6. Is the rate of change similar? This question determines whether, over the same period, the two entities have similar insert and update frequencies. If one entity rarely changes while the other changes often, we strongly prefer to keep them normalized in their own tables.
Same (20 points), different (-20 points). How to use the "Denormalization Survival Guide": 1) prioritize the relationships in the model; 2) select a relationship; 3) answer the six questions for that relationship and total the scores; 4) if the total is greater than or equal to 10, denormalize the relationship; 5) return to step 2 until all relationships have been processed.
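The scoring procedure above is mechanical enough to sketch in a few lines. The point values come from the six questions; the option names are my own paraphrases of the answers:

```python
# A minimal sketch of the Denormalization Survival Guide scoring.
# Point values are from the six questions above; option names are mine.
SCORES = {
    "relationship_type":    {"hierarchy": 20, "peer": -10, "definition": -20},
    "participation_ratio":  {"up_to_1_5": 20, "up_to_1_100": -10, "over_1_100": -20},
    "parent_data_elements": {"under_10": 20, "10_to_20": -10, "over_20": -20},
    "usage_rate":           {"strong": 30, "weak": -30},
    "parent_is_placeholder": {"no": 20, "yes": -20},
    "rate_of_change":       {"same": 20, "different": -20},
}

def should_denormalize(answers):
    """Total the six answers; denormalize when the score reaches 10."""
    total = sum(SCORES[q][a] for q, a in answers.items())
    return total >= 10, total

# Example: a hierarchy with tight coupling and a small, stable parent.
decision, score = should_denormalize({
    "relationship_type": "hierarchy",        # +20
    "participation_ratio": "up_to_1_100",    # -10
    "parent_data_elements": "under_10",      # +20
    "usage_rate": "strong",                  # +30
    "parent_is_placeholder": "no",           # +20
    "rate_of_change": "different",           # -20
})
assert (decision, score) == (True, 60)
```

With a score of 60, well over the threshold of 10, this relationship is a clear candidate for folding the parent's data elements into the child table.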
Chapter 9: The Abstraction Safety Guide and Components. Readers who have seen my "Talking about Database Design Skills (Part 1)" may remember my second example: the design of the product information table for an online e-commerce platform. This chapter raises the method used in that example to the level of theory: using object-oriented design, extract the attributes common to all items, abstract them into a supertype, and add a table to record the details that differ between entities, thereby deriving subtypes from the supertype and achieving design flexibility. Abstraction is extremely useful in situations such as the following:
· The design must remain stable over the long term: the database design should not need modification.
· Requirements change frequently: the application's requirements keep changing, demanding business process restructuring or functional upgrades.
· Data warehouses: when a new classification type arrives from a source application, we need not change the warehouse design at all; we simply add a new row to the classification type entity.
· Data marts: similar to the data warehouse case.
Of course, abstraction greatly increases workload and development complexity, and people usually focus only on the short-term application and the costs in front of them, caring little about future costs. On this point I agree with the agile software development view: do not design ahead of need, but once requirements change, that is the moment when, as a programmer pursuing excellence, you should review the entire architecture and, in revising the design, produce a system architecture that can accommodate similar changes in the future.
"Abstraction Components" are small abstract model fragments that can be reused in many modeling situations (in any industry or organization, even in subject area modeling). After applying abstraction several times during modeling, you begin to see recurring abstract structures. These Abstraction Components serve the following purposes: they accelerate design; they accelerate development; they provide universal, useful structures.
Chapter 10: Data Model Beautification Tips. This chapter presents techniques for improving the visual appearance of logical and physical data models, so that our designs go beyond the immediate needs of the application. Five categories of beautification techniques are discussed:
· Logical data element sequence tips: recommended ways to order the data elements within each entity on the logical data model.
· Physical data element sequence tips: these focus on the best ordering of the data elements within each entity on the physical data model.
· Entity layout tips: these focus on the best placement of each entity in the data model.
· Relationship line tips: these focus on untangling overlapping relationship lines and relationship lines that cut through, rather than go around, unrelated entities.
· Attention-getting tips: these focus on highlighting certain data elements, entities, or relationships that we want to stand out.
Chapter 11: A Top-Ten List of Advice for Data Modelers: 1) Remember: flexibility, accuracy, and context. 2) Modeling is only part of your job. 3) Try other roles. 4) Learn the 95/5 rule: 95% of your time will be spent on 5% of the data elements. 5) Data modeling is never boring: if you have been doing data modeling for a while and often feel bored, then something needs to change. It may not be the data modeling field itself that is boring, but that your particular tasks, company, or industry are no longer exciting. Take a risk and try data modeling work on a different project or in a different industry! 6) Stay at the forefront of technology. 7) Try not to get emotionally attached to your models: modelers must understand that comments made during review are aimed at the model's design, not at the modeler. As the old saying goes: address the issue, not the person. 8) Let your creativity spread its wings: be creative when capturing data requirements and improving designs. Creativity may mean modifying some of the tools in this book; it may also mean inventing your own spreadsheets or other tools. 9) Theory alone is too expensive: throughout your design activities, keep this point in mind; the departments and organizations using the application need to see practical results. 10) Become a good storyteller: for a data modeler, storytelling is a very important part of the job. To help teach the team and influence the project manager, we need to tell stories and anecdotes.
Finally, I personally think Steve Hoberman's concept of "Abstraction Components" is very similar to "design patterns" in object-oriented design. That is, after a database expert has modeled many systems, the parts that recur across projects can be abstracted into specific model fragments; by refining those fragments in a new project, one can quickly construct a database architecture suited to that project. However, these modeling fragments have not yet been unified into a standard, and there are currently no books on the subject. I intend to summarize my own experience in this area here, but I know my level is limited and dare not show off in front of the masters. I only hope that the articles I publish later can serve as a brick cast out to attract jade, and that Chinese programmers can strive to produce the first "design patterns" in the field of data modeling.