0 Introduction
Over the past few years we have applied agile methods to database design. This article summarizes techniques that let a database evolve as the application develops, which is an important property for agile methods. Our approach relies on continuous integration, automated refactoring, and close cooperation between database administrators (DBAs) and application developers. These techniques are effective at every stage of application development.
1 Agile Methodology
In recent years a new style of software development has appeared: agile methodologies. They place new and significant demands on database design. Central to these demands is evolutionary design. On an agile project you must assume that the requirements of the system cannot be fixed in advance. A detailed design phase at the start of the project is therefore unrealistic. The design of the system has to evolve as the software changes. Agile methods, in particular Extreme Programming (XP), make this evolutionary design work through a set of practices. Applied to database design, agile methods make the design iterative.
Many people doubt whether agile methods can be used on systems with a large database component. We have nevertheless used many agile and XP techniques to handle evolution and iteration on projects built around large databases.
This article introduces some practices for applying agile methods to database design. We do not claim to have completely solved the problem of database evolution, but we want to offer some approaches that have worked.
2 Embracing Change
A significant feature of agile programming is its attitude toward change. The conventional view of the software process is to understand the requirements, freeze them, use them as the basis for the design, freeze the design, and then start building the system. This is the waterfall approach, a life cycle driven by an up-front plan.
This approach tries to minimize change through extensive up-front work. Once that work is complete, any change in requirements causes serious problems, which is why requirements changes are such a big issue for these processes.
Agile programming takes a different approach. It embraces change, even late in a project's development. Change is controlled, but the attitude is to allow as much change as possible. Part of the change comes from the instability of project requirements, and part from the need to support a business environment that changes under competitive pressure.
To do this, a different attitude to design is required. Design is not just a phase that is mostly completed before construction begins; it is a continuous process interleaved with coding, testing, and even release. This is the difference between planned design and evolutionary design. An important contribution of agile methods is to show how evolutionary design can be done in a controlled way. The absence of an up-front plan therefore does not mean the design descends into chaos. Agile methods provide the techniques needed to control evolutionary design.
An important feature of agile methods is iterative development, that is, running the complete software life cycle many times over the life of the project. An agile process goes through a complete life cycle in each iteration. Each iteration delivers a subset of the final product's requirements as tested, integrated code. Agile iterations are short, usually no longer than about two months, and we prefer shorter ones.
When using agile methods, the biggest question is how the database can evolve. Many people regard database design as something that must be planned up-front: changing the schema late in development tends to break the application software, and changing the schema after deployment leads to data migration problems.
Over the past three years we took part in a large project that used evolutionary design in practice. The project involved a team of about 100 people and more than 200 tables. The database evolved throughout a year and a half of initial development and kept evolving even after it was deployed to multiple customers. At first we ran one-month iterations; after a few months we switched to two-week iterations.
As we spread these practices to more parts of the project, we gained experience with more and more cases. We also drew on experience from other agile projects.
2.1 Limitations
Before describing the practices, we must point out that we have not solved every problem of evolutionary database design. In particular:
- We worked with a database for a single application, rather than an integration database that tries to integrate multiple databases.
- We did not attempt database upgrades under 24*7 availability.
We do not believe these problems are unsolvable, although many people assume they are. Solving them requires further work, and we have not done that work yet.
3 Practices
Our approach to evolutionary database design depends on a handful of important practices.
3.1 DBAs and developers work closely together
An important principle of agile methods is that people with different skills and backgrounds work closely together. Formal meetings and documents cannot provide the same level of communication, so people need to work side by side. Everybody on the project team needs to cooperate closely: system analysts, project managers, domain experts, developers, and DBAs.
Any task a developer works on may need the DBA's help. Developers and the DBA need to consider whether the task will cause a significant change to the database schema. If so, the developer should consult the DBA on how to make the change: the developer knows what new functionality is needed, and the DBA has a global view of the data.
For this close cooperation to work, the DBA must be approachable. It should be easy for a developer to drop by for a few minutes and ask questions. Make sure DBAs and developers sit close together so that they can communicate easily. It is equally important that application design meetings are open, so that the DBA can join in. In many environments we see people build barriers between DBAs and application developers. These barriers must come down for evolutionary database design to be possible.
3.2 Everybody on the project team has their own database instance
Evolutionary design recognizes that people learn by trying things out. Developers experiment with how to implement a feature and try a few alternatives before settling on a preferred solution. Database design works the same way. It is therefore important that every developer has their own database instance to experiment on, without affecting anyone else. That way everyone can test according to their own needs.
Many DBA experts regard multiple databases as trouble that is impractical to manage, but we found that managing many database instances is easy. What matters is having convenient tools that let you manipulate databases much as you would manipulate files.
3.3 Developers frequently integrate into a shared master database
Although developers can experiment in their own space, it is important to bring the different strands of work together regularly. Application development needs a shared master database where all work is collected. When developers start a task, they take a copy of the master database into their own workspace, work on it and modify it, and then integrate their changes back into the master. Our rule is that every developer should integrate once a day.
Suppose a developer starts a development task at 10 am, and part of the task is to change the database schema. If the change is simple, such as adding a column, he can decide how to make it himself. Using the data dictionary, the developer checks that the column he wants to add does not already exist in the database. If the change is more complex, talking it over with the DBA first makes the work simpler.
When he is ready to start, he takes a copy of the master database so that he can change both the schema and the code freely. Because he is using his own database instance, nobody else is affected. At some point, say around 3 pm, he knows exactly which database changes are needed, even though he has not finished his coding work. He then goes to the DBA and explains the change. This is when the DBA can raise issues the developer has not considered. Most of the time all is well and the DBA agrees to the change, which is made through one or more database refactorings. The DBA applies the change immediately, unless it is a destructive change. The developer can then continue his work and commit his code at any time, because the DBA has already applied the change to the master database.
This practice is a form of continuous integration, which is commonly applied to source code management. In effect, the database is treated as another piece of source code. The master database is kept under configuration management just as source code is. Whenever we have a successful build, the database is checked into the configuration management system together with the source code, so we have a complete and synchronized version history of both. For source code, much of the pain of integration is handled by the source code control system. For the database, a little more work is needed. Every database change has to be handled properly, as an automated database refactoring. In addition, the DBA needs to review every database change to make sure it fits the overall database schema. To keep this work smooth, big changes should not arrive as surprises at integration time, which is another reason the DBA must work closely with the developers.
We emphasize frequent small integrations because they are much easier than infrequent large ones. The complexity of integration grows with the size of what is being integrated. Many small changes are therefore easier to carry out in practice, even if this seems counter-intuitive at first.
3.4 A database consists of schema and test data
When we talk about a database here, we mean not just the database schema but also a fair amount of data. This data includes the standing data the application needs, such as the list of all provinces, as well as sample data such as a few sample customers.
The data serves two purposes:
1. Enabling testing
We use a large number of automated tests to help stabilize the development of the application. Such tests are common in agile methods. For these tests to work efficiently, it makes sense to run them against a database seeded with sample test data, which every test can assume is in place before it runs.
2. Testing database migrations
As well as helping to test the code, the sample data lets us test database migrations. After changing the database, we must make sure that every schema change also handles the sample data.
In most projects this sample data is fictional. On some projects, however, people use real data as the samples. In these cases the data is extracted from the earlier legacy systems by automated data migration scripts. Obviously not all the data can be migrated right away, because only a small part of the database is built in the early iterations. The idea is to develop the migration scripts iteratively, just as the application and the database are developed. This not only surfaces migration problems early, it also lets domain experts work with the system as it grows. Because they are familiar with the data, they can point out cases that may cause problems for the database and the application design. We therefore recommend introducing real data in the early iterations of a project.
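The second role of sample data, testing migrations, can be illustrated with a small sketch. It is not taken from the project described here: it uses Python's built-in sqlite3 module with an in-memory database, and the `customer` table, the sample names, and the `migrate_split_name` migration are all hypothetical.

```python
import sqlite3

# Invented sample data that travels with the schema.
SAMPLE_CUSTOMERS = [(1, "Ada Lovelace"), (2, "Grace Hopper")]

def build_sample_db():
    """Create an in-memory database holding both schema and sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customer (id, name) VALUES (?, ?)", SAMPLE_CUSTOMERS
    )
    return conn

def migrate_split_name(conn):
    """Hypothetical migration: add first/last name columns and fill them."""
    conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
    conn.execute("ALTER TABLE customer ADD COLUMN last_name TEXT")
    rows = conn.execute("SELECT id, name FROM customer").fetchall()
    for row_id, name in rows:
        first, _, last = name.partition(" ")
        conn.execute(
            "UPDATE customer SET first_name = ?, last_name = ? WHERE id = ?",
            (first, last, row_id),
        )

def test_migration_keeps_sample_data():
    """The migration must carry every sample row over to the new columns."""
    conn = build_sample_db()
    migrate_split_name(conn)
    rows = conn.execute(
        "SELECT first_name, last_name FROM customer ORDER BY id"
    ).fetchall()
    assert rows == [("Ada", "Lovelace"), ("Grace", "Hopper")]
    return rows

migrated = test_migration_keeps_sample_data()
```

Because the sample rows are created alongside the schema, the same check can run on every build, so a schema change that mishandles existing data fails early.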
3.5 All database changes are refactorings
Refactoring is the technique of applying small, controlled transformations to change an existing code base. Database refactoring provides the same kind of control over changes to the database.
What makes database refactoring different is that it must carry out three changes together:
- changing the database schema
- migrating the data
- changing the database access code
So when we describe a database refactoring, we must describe all three aspects of the change and make sure all three are complete before the next refactoring is applied.
We are still documenting the various database refactorings, so we cannot describe them in detail yet. A few points are worth making, however. Like code refactorings, database refactorings are very small. The idea of chaining together a sequence of small changes is much the same for databases as it is for code. The three-part nature of each change makes it even more important to keep the changes small.
Many database refactorings, such as adding a column, can be done without updating all the code that accesses the system. Code that uses the new schema without knowing about the change simply leaves the new column unused. Many other changes do not have this property. We call these destructive changes; setting an existing nullable column to non-null is one example. Destructive changes need more care, and how much care depends on how destructive the change is. A mildly destructive example is setting an existing nullable column to non-null, in which case you can usually just go ahead and do it.
The refactoring takes care of any null data already in the database. The developer updates the database mapping code, so the change does not break anyone else's code. If something is broken by accident, the developer finds out during the build and the tests. Splitting a heavily used table is a more complex destructive change. In this case everyone should know the change is coming so that they can prepare for it. Such a change should also be scheduled for a safer time.
The important point is to choose a process that suits the kind of change you are making.
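The nullable-to-non-null example can be sketched as follows. This is an illustration under assumptions, not the project's actual script: it uses Python's sqlite3 module, a hypothetical `account` table, and a hypothetical default value. SQLite cannot add a NOT NULL constraint to an existing column, so the sketch rebuilds the table; other database systems offer a direct ALTER statement instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO account (id, status) VALUES (?, ?)",
    [(1, "active"), (2, None)],
)

def make_status_not_null(conn, default="unknown"):
    """Hypothetical refactoring: make account.status a non-null column."""
    # 1. Data migration: give existing NULL rows a value first.
    conn.execute(
        "UPDATE account SET status = ? WHERE status IS NULL", (default,)
    )
    # 2. Schema change: rebuild the table with the stricter column.
    conn.execute(
        "CREATE TABLE account_new (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO account_new (id, status) SELECT id, status FROM account"
    )
    conn.execute("DROP TABLE account")
    conn.execute("ALTER TABLE account_new RENAME TO account")

make_status_not_null(conn)
rows = conn.execute("SELECT id, status FROM account ORDER BY id").fetchall()

# After the refactoring the database itself rejects NULL values.
try:
    conn.execute("INSERT INTO account (id, status) VALUES (3, NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```

The third part of the refactoring, updating the access code so that it never writes a null status, is outside this sketch.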
3.6 Automate the refactorings
In the code world, many languages now have tools that automate refactoring. This kind of automation is just as important for databases, both for schema changes and for data migration. Every database refactoring is therefore automated by writing it as SQL DDL (for the schema change) and DML (for the data migration). These changes are never applied by hand; they are applied automatically by running the SQL scripts.
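As a minimal sketch of one refactoring written as a script, the following uses Python's sqlite3 module to run a DDL statement and a DML statement together. The `person` table and the `full_name` column are hypothetical, chosen only to show the two kinds of statement side by side.

```python
import sqlite3

# Hypothetical refactoring "introduce full_name", written once as a script.
ADD_FULL_NAME = """
-- DDL: the schema change
ALTER TABLE person ADD COLUMN full_name TEXT;
-- DML: the data migration for rows that already exist
UPDATE person SET full_name = first_name || ' ' || last_name;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO person (first_name, last_name) VALUES ('Ada', 'Lovelace')")

# The script, not a person typing at a console, applies the change.
conn.executescript(ADD_FULL_NAME)
full_names = [row[0] for row in conn.execute("SELECT full_name FROM person")]
```

Keeping the schema change and its data migration in one script means the same file can later be replayed against any other database instance.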
Once the scripts are written, we keep the script files, which together form a complete change log of every alteration made to the database by database refactorings. We can bring any database instance up to date with the latest master by running the change log entries recorded since the master was copied to create that older instance.
This ability to sequence automated changes is essential both for continuous integration and for migrating production databases.
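One possible way to sequence a change log is sketched below, again with Python's sqlite3 module. The `schema_version` table, the `CHANGE_LOG` list, and the `apply_pending` function are assumptions made for illustration; the project's own tooling is not described in that detail.

```python
import sqlite3

# Hypothetical ordered change log: (version, SQL script) pairs.
CHANGE_LOG = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT;"),
    (3, "UPDATE customer SET email = '' WHERE email IS NULL;"),
]

def current_version(conn):
    """Return the highest change already applied, or 0 for a fresh database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def apply_pending(conn, change_log):
    """Run every script newer than the recorded version, in order."""
    applied = []
    version = current_version(conn)
    for number, script in sorted(change_log):
        if number > version:
            conn.executescript(script)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            conn.commit()
            applied.append(number)
    return applied

master = sqlite3.connect(":memory:")
developer = sqlite3.connect(":memory:")
apply_pending(developer, CHANGE_LOG[:1])  # a copy taken when master was at version 1

master_applied = apply_pending(master, CHANGE_LOG)
developer_applied = apply_pending(developer, CHANGE_LOG)
```

Here the stale developer copy receives only changes 2 and 3, while the empty master receives all three, and both end at the same version. The same function could serve for a production upgrade, where the pending entries are everything recorded since the previous release.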
For production databases we do not apply changes during the regular iteration cycle. Instead we build a complete change log of the database refactorings made between one release and the next. This is undoubtedly a large change, and we have to apply it offline. It is wise to test the migration before applying it for real. So far this technique has worked quite well. By breaking large changes down into small ones, we can make big changes to production data without getting into much trouble. To borrow a phrase from military strategy, this is "breaking up the whole into parts".
In addition to automating forward changes, you can also consider automating reverse changes. If you do, you can step a database back to an earlier state. We did not do this on our project because we had no need for it, but it remains an important principle.
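Since reverse changes were not used on the project, the following is purely an illustration of the idea: each refactoring carries a forward script and a script that undoes it. The `address` table and the dictionary layout are hypothetical, and the sketch uses Python's sqlite3 module.

```python
import sqlite3

# Hypothetical refactoring with a forward script and the script that undoes it.
SPLIT_OUT_ADDRESS = {
    "forward": """
        CREATE TABLE address (person_id INTEGER, city TEXT);
        INSERT INTO address (person_id, city) SELECT id, city FROM person;
    """,
    "reverse": "DROP TABLE address;",
}

def table_names(conn):
    """List the tables currently present in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO person (city) VALUES ('London')")

conn.executescript(SPLIT_OUT_ADDRESS["forward"])
after_forward = table_names(conn)

conn.executescript(SPLIT_OUT_ADDRESS["reverse"])
after_reverse = table_names(conn)
```

This reverse script is simple because the forward step only copied data. A forward step that deletes or overwrites data would need a reverse script that restores it, which is considerably harder.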
3.7 Automatically update all developers' databases
It is all very well for people to make changes and update the master database, but how do they find out that the master has changed? In a conventional continuous integration environment, developers update from the master before committing their changes. That way they can resolve any problems on their own machines before committing to the shared master.
Rather than wait for that, we update the developers' databases every time the master changes. Whenever the master database changes, we automatically update the databases of all project members. The same refactoring scripts that update the master also update the members' databases automatically. Some people might expect that updating developers' databases without their knowledge would cause many problems, but in practice we found none. Of course, this only works while people are connected to the network. Developers who have been working offline must resynchronize with the master as soon as possible.
3.8 Clearly separate all database access code
To understand the consequences of a database refactoring, you need to understand how the application uses the database. That is hard to do if SQL statements are scattered throughout the code. A clearly defined database access layer is therefore important: it shows how the database is used and where.
A clear database access layer has several benefits. It reduces the number of places where developers need SQL knowledge to manipulate the database, which makes development easier for those who are less familiar with SQL. For the DBA, it provides one clear body of code that shows exactly how the database is used. This helps with preparing indexes, optimizing the database, and tuning SQL statements.
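A database access layer can take many forms. The sketch below shows one simple form, a gateway class that holds all SQL for a single table, using Python's sqlite3 module. The `CustomerGateway` class and the `customer` table are hypothetical and are not the layer used on the project.

```python
import sqlite3

class CustomerGateway:
    """Hypothetical access layer: the only place with SQL for customers."""

    def __init__(self, conn):
        self._conn = conn

    def add(self, name):
        """Insert a customer and return the generated id."""
        cursor = self._conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,)
        )
        return cursor.lastrowid

    def find_name(self, customer_id):
        """Return the customer's name, or None if no such customer exists."""
        row = self._conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

gateway = CustomerGateway(conn)
new_id = gateway.add("Ada")
found = gateway.find_name(new_id)
```

Application code calls `add` and `find_name` and never sees SQL, so a refactoring of the `customer` table requires changes in this one class only, and the DBA can read the class to see every query that touches the table.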
4 Variations
Like any set of practices, these must be adapted to your particular circumstances. No project stays constant, so we must respond to change.
4.1 Keeping multiple database lineages
A simple project may need only a single master database. Complex projects need several varieties of the database, which we call database lineages. If the database has to branch before going into production, we create a new lineage. Lineages are similar to branches in source code, and each may need its own set of test data. When a developer takes a copy from the master, they must register which lineage they are modifying. When the DBA applies an update to a lineage of the master, the update propagates to every developer database registered for that lineage.
4.2 You do not need a full-time DBA
All of this sounds like a lot of work, but it does not require a lot of people. On our largest project we had 30 developers and a team of about 100 people in total (including QA, analysts, and managers), with more than 100 copies of the various database lineages spread across people's workstations. All of this required only one full-time DBA, with two developers helping part-time.
Smaller projects do not even need a full-time DBA. When we used these techniques on a smaller project of about 12 people, we found that it did not need one. Instead we relied on two developers with an interest in databases to carry out the DBA tasks.
The credit for this goes to automation. If you automate every task, you can do more work with fewer people.
5 Tools to Help
Evolutionary database design involves a great deal of repetitive work, and a few simple tools can take care of most of it.
One of the most valuable pieces of automation is a simple set of scripts for common database tasks. The automated tasks include:
- Bring a user's database up to date with the current master.
- Create a new user.
- Copy a database schema.
- Move a database, which combines copying and deleting.
- Drop a user.
- Export a user, so that team members can distribute offline backups of a database.
- Import a user, so that team members who hold a database backup can import it and create a new schema.
- Export a baseline, that is, back up the master database. This is a special case of exporting a user.
- Create a difference report comparing any number of schemas.
- Compare a schema against the master, so that developers can check their local copy against the master database.
- List all users.
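As an illustration of the schema comparison task in this list, the sketch below reads the schema of two SQLite databases and reports the differences. It is an assumption about how such a tool could look, not the project's script; `schema_of` and `diff_schemas` are hypothetical names, and only table and column names are compared.

```python
import sqlite3

def schema_of(conn):
    """Return {table: {column: declared type}} for a SQLite database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {col[1]: col[2] for col in columns}
    return schema

def diff_schemas(local, master):
    """List how a local schema differs from the master schema."""
    report = []
    for table in sorted(set(local) | set(master)):
        if table not in master:
            report.append(f"extra table: {table}")
        elif table not in local:
            report.append(f"missing table: {table}")
        else:
            for column in sorted(set(local[table]) ^ set(master[table])):
                kind = "extra" if column in local[table] else "missing"
                report.append(f"{kind} column: {table}.{column}")
    return report

master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
master.execute("CREATE TABLE orders (id INTEGER)")

local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE customer (id INTEGER, name TEXT, email TEXT)")

report = diff_schemas(schema_of(local), schema_of(master))
```

In this usage the local copy has an extra `email` column and lacks the `orders` table, and the report lists both differences in table order.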
Analysts and QA staff often need to look at the test data and change it. We therefore built an Excel application with VBA scripts that pulls data from the database into an Excel file, lets the user edit the file, and writes the data back to the database after editing. Other tools can also be used to browse and edit the contents of a database, but we chose Excel because so many people are familiar with it.
Everybody on the project team should have easy access to the details of the database design, so that they can find out which tables are available and how they are used. We built an HTML-based tool that uses servlets to query the database metadata. Before adding a column, a developer can first look up what already exists in the database. We do our modeling in ERwin and extract the data from ERwin into our metadata tables.
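The metadata lookup can be approximated in a few lines. The sketch below is not the servlet tool described above; it is a hypothetical helper, `find_columns`, that searches SQLite's own catalog for columns whose names contain a given fragment, so that a developer can check for an existing column before adding a new one.

```python
import sqlite3

def find_columns(conn, fragment):
    """Return sorted (table, column) pairs whose column name contains fragment."""
    matches = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows carry the column name at index 1.
        for col in conn.execute(f'PRAGMA table_info("{table}")'):
            if fragment.lower() in col[1].lower():
                matches.append((table, col[1]))
    return sorted(matches)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, email_address TEXT)")
conn.execute("CREATE TABLE supplier (id INTEGER, contact_email TEXT)")
conn.execute("CREATE TABLE product (id INTEGER, title TEXT)")

email_columns = find_columns(conn, "email")
```

A developer about to add an email column would see that two tables already hold one, which is a good prompt to talk to the DBA first.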
6 Conclusion
This is, of course, not the last word on database design, nor does it cover all of evolutionary database design. Integration databases, 24*7 deployment, and other problems remain unsolved. Evolutionary database design needs further research.