Lessons learned about OR-mapping
Note: This article applies only to systems that are developed with OOA (object-oriented analysis) and OOD (object-oriented design) and that store their data in a relational database.
Starting from a misunderstanding of OR-mapping
While designing and developing a workflow management system and an information sharing platform, we kept exploring object-oriented analysis and design methods. Because object-oriented programs are closer to the way people think, designing with this approach can greatly improve programming productivity, shorten the software development cycle, and reduce the cost of software maintenance. Object-oriented technology has become a trend in software development.
Through the design and implementation of the workflow management system and the information sharing platform, our understanding of object-oriented programming has deepened, and we have gradually moved to layered, distributed programming based on object-oriented ideas. In this cycle of theory, practice, and back to theory, and while continually summarizing what we learned, I found that my understanding of OR-mapping had been wrong. First, an outline of our original software design process:
Starting from the user requirements, we defined the system functions and then did the detailed design. The detailed design phase mainly sorted out the business processes and, based on them, completed the data dictionary design. If we were using object-oriented development, we would then abstract the business entity objects, associate them with the tables and fields of the relational database, and so complete a mapping from tables to objects. This is how I first understood OR-mapping. I later realized that this understanding is wrong and that such a process is a flawed object-oriented development process.
The object-oriented analysis and design method emphasizes using object-oriented methods throughout analysis and design. Starting from the user requirements, the business is analyzed in object-oriented terms, and the analysis phase should produce an object-oriented analysis model. This is a conceptual model, independent of any database or programming language. No data dictionary is produced, only concept-level UML diagrams. In the design phase, the UML is refined into class diagrams and class interfaces. The database structure is not considered during this process. Only once the class diagrams and interfaces have been designed is the corresponding database structure generated from the objects, rather than the objects being derived from the database structure as described above. This process of generating the database structure from the objects is the OR-mapping (object-relational mapping) problem discussed here.
The following sections discuss it in detail, along with the approach to object-oriented database design and a few thoughts on object-oriented analysis and design.
Why do we need OR-mapping?
If we analyze and design with object-oriented ideas while storing data in a relational database, we face a problem that cannot be ignored. Object-oriented design and relational database design are very different: there is an "impedance mismatch" between the object model and the relational model. The object model is based on principles of software engineering, and object-oriented design theory covers encapsulation, association, aggregation, inheritance, and polymorphism, whereas the relational model is mainly concerned with data storage. This "impedance mismatch" is the main conflict between them. We want to implement the business processes through object-oriented design, and that can be done, but in the end we must also consider how to store the objects. If the storage medium is a relational database, the conflict appears as soon as objects are saved to it. This is the main reason we need OR-mapping, the bridge between objects and the relational database. OR-mapping mainly addresses the mapping of objects and their relationships to the database, and the persistence of objects.
OR-mapping is a product of object-oriented development and one of the problems a layered design must solve. What benefits does OR-mapping bring to program design? In a system with an object-oriented layered design, whatever the upper-layer program does ultimately results in database operations, yet the database is relational, not object-oriented. With a mapping between objects and relations in place, operations on objects in the upper layer are carried out as operations on tables, so the upper layer works as if there were no database at all and deals only with objects.
Principles of OR-mapping
The usual approach to O-R mapping is to map each class in the program to one or more database tables. However, this approach brings some problems. First, data entities have different granularity in the database and in the program. For "coarse-grained objects" that involve multiple tables, one entity class may reference several other entity classes, so modeling the granularity of objects is problematic. Second, interacting with the database involves conversion, and the problem grows if one object involves operations on multiple tables. Finally, when a query must return many objects, efficiency is relatively low because of the conversion involved.
The problems above are the ones we actually have to consider, and they lead to the principles of O-R mapping: what criteria we follow when mapping data, what principles guide us, and which factors must be considered during mapping. I discuss these below.
1. Performance:
When objects are mapped to tables, performance is the most important consideration. How objects are mapped to tables has a significant impact on the number of database accesses. Database access usually goes to disk or other external media, whose access times are on the order of milliseconds, while CPU cycle times are on the order of nanoseconds. It is therefore good practice to spend some CPU cycles and memory to compensate for slow I/O. Some of this the computer does by itself, such as caching and DMA. Some we must handle ourselves by reducing the number of database accesses, for example through lazy loading and dirty marking.
The core idea of lazy loading is to read data only when it is used. Take a person object with a department property, where persons and departments are stored in different tables: there is no need to initialize the department property when the object itself is initialized. Only when the property is needed is its value read from the database and assigned.
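The person and department example can be sketched as follows. This is only a minimal illustration, assuming SQLite as the relational database; the table names, the `Person` class, and the `department_queries` counter are hypothetical and exist only to make the delayed read visible.

```python
import sqlite3

# Hypothetical schema: persons and departments live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO department VALUES (10, 'Engineering');
    INSERT INTO person VALUES (1, 'Alice', 10);
""")


class Person:
    def __init__(self, conn, person_id):
        self._conn = conn
        row = conn.execute(
            "SELECT name, department_id FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        self.name, self._department_id = row
        self._department = None       # not loaded yet
        self.department_queries = 0   # counts reads of the department table

    @property
    def department(self):
        # Read the department row only on first access, then cache it.
        if self._department is None:
            self.department_queries += 1
            row = self._conn.execute(
                "SELECT name FROM department WHERE id = ?", (self._department_id,)
            ).fetchone()
            self._department = row[0]
        return self._department


alice = Person(conn, 1)
queries_before = alice.department_queries  # nothing read from department yet
dept = alice.department                    # first access triggers the read
dept_again = alice.department              # second access uses the cached value
```

Creating the object costs one query against the person table. The department table is touched once, and only if the property is actually used.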
The main purpose of dirty marking is to reduce I/O overhead between memory and disk. In the simplest scheme, data read from the database into memory is eventually written back in full, whether or not it was modified. This is the easiest and most common approach, but also the worst. Of 100K of data, perhaps only 1K was changed, yet the other 99K is written back as well. Instead, a modified flag can be kept for each row, or even for each field, and data is then written back selectively according to the state of the flag.
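A per-field modified flag might look like the sketch below. It assumes SQLite and a hypothetical `person` table; the `TrackedPerson` class and its `save` method are illustrative names, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, address TEXT)"
)
conn.execute("INSERT INTO person VALUES (1, 'Alice', '555-0100', '1 Main St')")


class TrackedPerson:
    FIELDS = ("name", "phone", "address")

    def __init__(self, conn, person_id):
        row = conn.execute(
            "SELECT name, phone, address FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        # Bypass the tracking in __setattr__ while loading.
        object.__setattr__(self, "_conn", conn)
        object.__setattr__(self, "_id", person_id)
        object.__setattr__(self, "_dirty", set())
        for field, value in zip(self.FIELDS, row):
            object.__setattr__(self, field, value)

    def __setattr__(self, key, value):
        # Mark a mapped field as modified only if its value really changes.
        if key in self.FIELDS and getattr(self, key) != value:
            self._dirty.add(key)
        object.__setattr__(self, key, value)

    def save(self):
        """Write back only the modified fields; return the columns written."""
        written = sorted(self._dirty)
        if written:
            # Column names come from the fixed FIELDS tuple, never from user input.
            assignments = ", ".join(f"{field} = ?" for field in written)
            values = [getattr(self, field) for field in written]
            self._conn.execute(
                f"UPDATE person SET {assignments} WHERE id = ?", values + [self._id]
            )
            self._dirty.clear()
        return written


p = TrackedPerson(conn, 1)
p.phone = "555-0199"
first_save = p.save()    # only the phone column is written
second_save = p.save()   # nothing changed, so no UPDATE is issued
```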
2. Space consumption:
Different mapping methods differ greatly in how much storage the data occupies. Some mappings waste no database space, while others need a large amount of useless storage, such as NULL fields. Spending more space can improve performance. Whether to sacrifice space for performance is a trade-off that has to be made case by case.
3. Maintainability and performance of the database
The maintainability and the performance of the database conflict. As the data model is optimized for performance, the cost of maintaining and changing the application rises as well. This too has to be weighed.
As mentioned above, OR-mapping does not only map objects to the database; it also maps the relationships between objects. To describe OR-mapping properly, we first describe the different relationships between objects, so that the mapping rules can then be given for each kind of relationship. Objects can be related in the following ways: inheritance, association, aggregation, and composition. To map these relationships effectively, the differences between them must be understood.
Inheritance: Inheritance, also known as generalization, is the most basic relationship between classes and is not described in detail here.
Association: An association is a connection between classes that lets one class know the attributes and methods of another. An example is the relationship between a warehouse class and a product class.
Aggregation: Aggregation is a kind of association, and a strong one. It is the relationship between a whole and its parts, for example between a car class and the engine and tire classes. The two classes in a plain association are at the same level, whereas in an aggregation one class represents the whole and the other a part.
Composition: Also called a compositional relationship, composition is a kind of association that is stronger than aggregation. It requires the object representing the whole to be responsible for the life cycle of the objects representing its parts, and a part cannot be shared: at any given moment, a part object can belong to only one whole.
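The four relationships can be told apart in code roughly as follows. This is a sketch only: the warehouse, product, car, engine, and tire classes follow the examples above, while `Vehicle`, `Order`, and `OrderLine` are hypothetical classes added to show inheritance and composition.

```python
class Vehicle:
    def __init__(self, name):
        self.name = name


class Engine:
    def __init__(self, power_kw):
        self.power_kw = power_kw


class Tire:
    def __init__(self, position):
        self.position = position


class Car(Vehicle):  # inheritance: a Car is a Vehicle
    def __init__(self, name, engine, tires):
        super().__init__(name)
        # Aggregation: whole and parts, but the parts are created outside
        # the car and can outlive it.
        self.engine = engine
        self.tires = list(tires)


class Product:
    def __init__(self, sku):
        self.sku = sku


class Warehouse:
    """Association: the warehouse knows products; both are at the same level."""

    def __init__(self):
        self.products = []

    def stock(self, product):
        self.products.append(product)


class OrderLine:
    def __init__(self, sku, quantity):
        self.sku = sku
        self.quantity = quantity


class Order:
    """Composition: the order creates its lines and controls their life cycle."""

    def __init__(self):
        self._lines = []

    def add_line(self, sku, quantity):
        line = OrderLine(sku, quantity)  # created by the whole, never shared
        self._lines.append(line)
        return line

    def line_count(self):
        return len(self._lines)


engine = Engine(90)
car = Car("compact", engine, [Tire(p) for p in ("FL", "FR", "RL", "RR")])
warehouse = Warehouse()
warehouse.stock(Product("P-1"))
order = Order()
order.add_line("P-1", 2)
```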
Having covered the relationships between classes, we now explain the principles of mapping objects to relations. To describe the mappings clearly, we use a pattern language. A pattern is a method for solving a certain type of problem. We describe the mappings as patterns for two reasons: to ease communication, and to record them as concrete, reusable solutions.
The pattern language describes solutions to three major categories of problems in mapping objects to tables. Depending on your needs, you can optimize a mapping for flexibility, ease of maintenance, low database space consumption, or performance. For the patterns that map objects to tables, see the following pattern language diagram:
Each pattern is explained in detail below:
Inheritance mapping patterns:
There are many ways to map an inheritance hierarchy to relational database tables. Each pattern below shows a single mapping method. In practice, different mapping methods can be mixed.
The following discussion does not cover multiple inheritance, which is rarely used in practice.
Pattern: One Inheritance Tree One Table
Method: Map the union of the attributes of all classes in the inheritance hierarchy to a single database table. In each record, fill the fields that do not apply to the object with NULL values.
Conclusion:
A. Space: With this mapping, storing an object's attributes wastes some space, and how much depends on the depth of the inheritance hierarchy. The deeper the hierarchy, and the more and the larger the attributes, the more space is wasted.
B. Maintenance cost: The mapping is direct and simple. As long as the inheritance hierarchy is not deep, schema changes are straightforward.
C. Query: Because the mapping is simple, queries are easy: only one table is accessed.
As shown below:
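A minimal sketch of One Inheritance Tree One Table, assuming SQLite and a hypothetical hierarchy in which `Employee` and `Customer` both inherit from `Person`. The `type` discriminator column is an added assumption so that a row can be turned back into the right class; the method above only requires the NULL-filled columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table holds the union of the attributes of Person, Employee and Customer.
conn.execute("""
    CREATE TABLE person (
        oid          INTEGER PRIMARY KEY,
        type         TEXT NOT NULL,   -- assumed discriminator column
        name         TEXT NOT NULL,   -- declared by Person
        salary       REAL,            -- Employee only, NULL otherwise
        credit_limit REAL             -- Customer only, NULL otherwise
    )
""")
conn.execute("INSERT INTO person VALUES (1, 'Employee', 'Alice', 5000.0, NULL)")
conn.execute("INSERT INTO person VALUES (2, 'Customer', 'Bob', NULL, 300.0)")

# Any object in the hierarchy is read with a single-table query.
rows = conn.execute(
    "SELECT type, name, salary, credit_limit FROM person ORDER BY oid"
).fetchall()
```

Every row carries one NULL per attribute that its class does not declare, which is the wasted space described under A.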
Pattern: One Class One Table
Method: Map the attributes of each class to a separate table. Insert a Synthetic OID into each table to link the rows of a subclass with the rows of its parent class.
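As a minimal sketch of this method, assuming SQLite and a hypothetical `Employee` class that inherits from `Person`, each class gets its own table and the shared OID links a subclass row to its parent row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each class stores only the attributes it declares itself.
    CREATE TABLE person (
        oid  INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        oid    INTEGER PRIMARY KEY REFERENCES person(oid),
        salary REAL NOT NULL
    );
    INSERT INTO person   VALUES (1, 'Alice');
    INSERT INTO employee VALUES (1, 5000.0);
""")

# Assembling one Employee instance joins the subclass row to the parent row.
employee = conn.execute(
    """
    SELECT p.oid, p.name, e.salary
    FROM employee AS e
    JOIN person AS p ON p.oid = e.oid
    WHERE e.oid = ?
    """,
    (1,),
).fetchone()
```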
Conclusion:
A. Abstract classes: Note that abstract classes also need to be mapped to their own tables.
B. Space: This mapping is close to optimal in space usage. The only overhead is the additional Synthetic OID that links the levels of the hierarchy.
C. Maintenance cost: Because the mapping is direct, it is easy to understand, and schema changes are direct and easy as well.
D. Query: Because of this mapping, obtaining the data of one object instance requires accessing several tables.
Pattern: One Inheritance Path One Table
Method: Map the attributes of each class to a separate table, and add to that table all the attributes of the class's parent classes.
Conclusion:
A. Abstract classes: Note that abstract classes are not mapped to tables.
B. Space: This mapping gives the best space usage. There are no redundant attributes, and not even an additional Synthetic OID is needed.
C. Maintenance cost: Adding or deleting an attribute of a parent class requires changing the table structure of all its subclasses.
D. Query: Queries are relatively easy.
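A minimal sketch of One Inheritance Path One Table, assuming SQLite and a hypothetical abstract `Person` class with the concrete subclasses `Employee` and `Customer`. The `email` attribute added at the end is likewise hypothetical and only illustrates the maintenance cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Person is assumed to be abstract, so it gets no table of its own.
    -- Each concrete class table repeats the inherited 'name' attribute.
    CREATE TABLE employee (
        oid INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL NOT NULL
    );
    CREATE TABLE customer (
        oid INTEGER PRIMARY KEY, name TEXT NOT NULL, credit_limit REAL NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Alice', 5000.0);
    INSERT INTO customer VALUES (1, 'Bob', 300.0);
""")

# One instance is read from exactly one table, without a join.
employee = conn.execute(
    "SELECT name, salary FROM employee WHERE oid = ?", (1,)
).fetchone()

# Adding an attribute to the parent class means altering every subclass table.
for table in ("employee", "customer"):
    conn.execute(f"ALTER TABLE {table} ADD COLUMN email TEXT")

employee_columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
customer_columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```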
Pattern: Objects in BLOBs
Method: The table has two fields: a synthetic OID and a variable-length BLOB that holds all the data of an object. The object's data is streamed into the BLOB.
Note:
Because this approach makes little use of the relational database and defeats the purpose of having one, it is not discussed further.
Association mapping patterns:
Associations between objects are mainly of two kinds: one-to-many and many-to-many. The mapping for each kind is described below.
Pattern: Foreign Key Association
Method: This pattern shows how to map a 1:M relationship between objects to relational database tables. Insert the OID of the owning object into the table of the dependent objects. This OID can be a database key or a Synthetic Object Identity.
Conclusion:
A. Space: Apart from the foreign key field required in the dependent objects' table, there are no redundant fields.
B. Maintenance cost: Low.
C. Query: Relatively simple.
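A minimal sketch of Foreign Key Association, assuming SQLite and reusing the warehouse and product example, with one warehouse holding many products. The column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse (
        oid  INTEGER PRIMARY KEY,
        city TEXT NOT NULL
    );
    -- The table on the "many" side carries the OID of its owner as a foreign key.
    CREATE TABLE product (
        oid           INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        warehouse_oid INTEGER NOT NULL REFERENCES warehouse(oid)
    );
    INSERT INTO warehouse VALUES (1, 'Hamburg');
    INSERT INTO product VALUES (10, 'bolt', 1);
    INSERT INTO product VALUES (11, 'nut', 1);
""")

# Navigating from the owner to its dependents is a single filtered query.
products_in_warehouse = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM product WHERE warehouse_oid = ? ORDER BY oid", (1,)
    )
]
```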
Pattern: Association Table
Method: This pattern shows how to map an N:M relationship between objects to relational database tables. Create a separate table that stores the object identifiers (foreign keys) of the two object types in the relationship. The mapping of the object types themselves to tables can be handled with any other appropriate pattern.
Conclusion:
A. Space: The new table takes up some storage, but this is a good way to solve the many-to-many problem.
B. Maintenance cost: Low.
C. Query: Relatively simple.
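A minimal sketch of Association Table, assuming SQLite and a hypothetical many-to-many relationship between products and suppliers. The helper functions `suppliers_of` and `products_of` are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (oid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE supplier (oid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The association table stores only pairs of object identifiers.
    CREATE TABLE product_supplier (
        product_oid  INTEGER NOT NULL REFERENCES product(oid),
        supplier_oid INTEGER NOT NULL REFERENCES supplier(oid),
        PRIMARY KEY (product_oid, supplier_oid)
    );
    INSERT INTO product  VALUES (1, 'bolt');
    INSERT INTO product  VALUES (2, 'nut');
    INSERT INTO supplier VALUES (1, 'Acme');
    INSERT INTO supplier VALUES (2, 'Globex');
    INSERT INTO product_supplier VALUES (1, 1);
    INSERT INTO product_supplier VALUES (1, 2);
    INSERT INTO product_supplier VALUES (2, 1);
""")


def suppliers_of(product_oid):
    """Follow the relationship from a product to its suppliers."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM supplier AS s
        JOIN product_supplier AS ps ON ps.supplier_oid = s.oid
        WHERE ps.product_oid = ?
        ORDER BY s.oid
        """,
        (product_oid,),
    )
    return [name for (name,) in rows]


def products_of(supplier_oid):
    """Follow the same relationship in the other direction."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM product AS p
        JOIN product_supplier AS ps ON ps.product_oid = p.oid
        WHERE ps.supplier_oid = ?
        ORDER BY p.oid
        """,
        (supplier_oid,),
    )
    return [name for (name,) in rows]
```

Neither the product table nor the supplier table needs a foreign key column of its own, so either side can be mapped with whichever other pattern suits it.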
Aggregation mapping patterns:
Pattern: Single Table Aggregation
Method: Place the attributes of the aggregated object in the same table as the attributes of the aggregating object.
Conclusion:
A. Performance: In terms of performance this solution is optimal, because a single table access retrieves the aggregating object together with all of its aggregated objects. On the other hand, because the fields of the aggregated objects widen each row, a read touches more database pages, which can waste I/O bandwidth.
B. Maintenance cost: If the aggregated object type is aggregated by more than one object type, maintainability suffers, because every change to the aggregated type must be repeated in the tables of all object types that aggregate it.
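A minimal sketch of Single Table Aggregation, assuming SQLite and a hypothetical `Customer` object that aggregates an `Address`. The class and column names are illustrative.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Customer:
    name: str
    address: Address


conn = sqlite3.connect(":memory:")
# The aggregated Address has no table of its own; its attributes become
# columns of the aggregating object's table.
conn.execute("""
    CREATE TABLE customer (
        oid            INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        address_street TEXT,
        address_city   TEXT
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Alice', '1 Main St', 'Hamburg')")


def load_customer(oid):
    # One database access returns the whole object including its address.
    name, street, city = conn.execute(
        "SELECT name, address_street, address_city FROM customer WHERE oid = ?",
        (oid,),
    ).fetchone()
    return Customer(name, Address(street, city))


customer = load_customer(1)
```

If a second object type also aggregated `Address`, its table would need the same two columns, and a change to `Address` would have to be made in both tables.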
Pattern: Foreign Key Aggregation
Method: Use a separate table for the aggregated type. Insert a synthetic object identity into that table, and use this identity as a foreign key in the table of the aggregating object to link to the aggregated object. The AggregatingObject is mapped to one table, while the AggregatedObject is mapped to another. The table of the aggregated object contains a synthetic object identity, SyntheticOID, which is referenced by the AggregatedObjectsOID foreign key field in the table of the aggregating object.
Conclusion:
A. Performance: Foreign Key Aggregation requires a join or at least two database accesses, while Single Table Aggregation requires only one database operation. If the aggregated object is rarely accessed, this performance is acceptable. If the aggregated object is always returned together with the aggregating object, the performance impact needs careful thought.
B. Maintainability: Factoring out object types such as a separate AddressType makes maintenance easier and the mapping more flexible.
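A minimal sketch of Foreign Key Aggregation, assuming SQLite and the same hypothetical customer and address example. The column names `synthetic_oid` and `aggregated_objects_oid` mirror the field names in the method description but are otherwise illustrative. Both access variants mentioned under A are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The aggregated type gets its own table with a synthetic OID.
    CREATE TABLE address (
        synthetic_oid INTEGER PRIMARY KEY,
        street        TEXT NOT NULL,
        city          TEXT NOT NULL
    );
    -- The aggregating object's table refers to it through a foreign key.
    CREATE TABLE customer (
        oid                    INTEGER PRIMARY KEY,
        name                   TEXT NOT NULL,
        aggregated_objects_oid INTEGER REFERENCES address(synthetic_oid)
    );
    INSERT INTO address  VALUES (100, '1 Main St', 'Hamburg');
    INSERT INTO customer VALUES (1, 'Alice', 100);
""")

# Variant 1: two separate database accesses.
name, address_oid = conn.execute(
    "SELECT name, aggregated_objects_oid FROM customer WHERE oid = ?", (1,)
).fetchone()
address = conn.execute(
    "SELECT street, city FROM address WHERE synthetic_oid = ?", (address_oid,)
).fetchone()

# Variant 2: a single access that joins the two tables.
joined = conn.execute(
    """
    SELECT c.name, a.street, a.city
    FROM customer AS c
    JOIN address AS a ON a.synthetic_oid = c.aggregated_objects_oid
    WHERE c.oid = ?
    """,
    (1,),
).fetchone()
```

The second access in variant 1 can be deferred until the address is actually needed, which fits the case where the aggregated object is rarely accessed.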
This completes the description of the mapping patterns. They are summarized below:
Comparison of the mapping patterns:
Object-oriented database design
Here we turn to object-oriented database design. After an object-oriented analysis and design, this step is closely tied to the object-relational mapping described above. In fact, it is the process that carries the analysis above through to implementation. The main aim here is to clarify how object-oriented database design differs from traditional design and why that matters.
There are two general database design methods: attribute-driven and entity-driven. The attribute-driven method starts from the attributes of the database application and groups them into attribute sets (entities) while preserving the functional dependencies between the attributes. The entity-driven method first identifies the entities that are meaningful to the application, then describes each entity by defining its attributes.
Usually we do attribute-driven design. After the OR-mapping described above, it becomes easy to carry out an entity-driven, object-oriented database design.
In a sense, it is the object-oriented character of the database design that finally establishes the object-oriented direction of the entire system. Its effects are summarized as follows:
1. The database structure is clear, which makes OOP easy to implement
Because the application module objects are fully mapped onto database objects, the logical model of the database can simulate real-world entity relationships naturally and directly. The real world as the user sees it, the external system functions abstracted by the developer, and the internal database (data structures) that supports those functions correspond to one another, so users, developers, and database maintainers can communicate in a consistent language. In particular, for programmers who do not understand the business, this design method, which encapsulates each object together with its corresponding data object, greatly reduces the difficulty of programming. They only need to know the data and the operations required, and most of the application can inherit from the entity-level superclasses abstracted by the designers.
2. Database objects are independent and easy to maintain
Apart from the correspondence between database table objects and application module objects, we did not design multiple inheritance of generalization relationships in the logical object model, so the resulting database structure is basically a tree hierarchy of parent and child table classes. There are very few complex relationships between tables. Such a structure follows the principle of localization: the effect of damaged table data is confined to a local scope, which makes repair easier and simplifies routine maintenance after the system goes into operation.
3. When requirements change, much of the program and database can be reused and few modifications are needed
Although normalization during the mapping may turn one application object into several table objects, most application objects and table objects correspond one to one. The several tables produced by normalization can be regarded as a single database object mapped from one application object. Therefore, when application requirements change, first, the modification need not touch the parts that are unchanged. Second, the change can largely be confined to adding or deleting program modules or adding new tables, with little need to modify the existing program code or table definitions. This greatly reduces the workload and the difficulty of the work.