Quality Assurance Techniques in Collaborative Development: Parallel Version Control, Daily Builds, and Release Engineering
Original: Situ Okan South, October 29, 2003
Abstract
Statement of the problem
Parallel version control: an effective guarantee for multi-person development
Code commit and synchronization
A communication link during coding: commit mail
Daily testing: the daily build
Effective version control: code branches and version tags
Feature freeze and code freeze
Release engineering
Summary
References
About the author

Abstract

Taking CVS as an example, this article introduces some techniques for using version control during the coding stage of a software engineering project. The last part also introduces release engineering, the final stage of a software project.

Statement of the problem

Coding is an important part of software engineering, and this part of the work directly affects the quality of the software product. Efficient multi-person development depends on team spirit, on the designers' grasp of the overall software architecture, on good parallel version control, and on institutionalized daily builds and final-stage release engineering.

In June this year I was fortunate to observe the daily build and release engineering practices at a company that develops security software. Their thorough, in-depth use of version control left a deep impression on me. Here I would like to share what I learned. Some of it comes from what I saw at that company, and the rest from my own software engineering projects.

There is no doubt that the design phase is the most valuable part of a software engineering project. A good design makes implementation more efficient and greatly improves productivity, and a good coding standard is an important cornerstone of collaborative development. For reasons of space, this article does not cover those two topics. Instead I will focus on experience with the coding and testing stages. This experience is aimed at development teams that have excellent designers, programmers, and testers, yet suffer through integration and final testing and repeatedly slip their release dates.

Parallel version control: an effective guarantee for multi-person collaboration

Consider a small development team with four programmers (hereinafter the "TJRP development group"). Tom, Jason, Robert, and Pat are each responsible for one of four modules. Following the traditional development model, the work passes through four stages: coding, integration, testing, and release. If the initial design is correct, and all four developers are guru-level programmers who work together seamlessly, this model works well.

Unfortunately, Pat had only just started working and did not fully understand the design documents, while Robert had proposed some amendments to the design. Worse, these problems were not exposed in time, so Pat's code and Robert's code diverged in their view of the design. When the project reached integration, Pat and Robert had a fierce dispute. After the argument went nowhere, the project manager finally sat all four people down to resolve the problem, and the integration stage ended up taking twice the planned time.

But the bad luck did not end there. During testing, Jason found that the behavior of his code had changed, and was surprised to discover that someone else had modified it. After restoring a copy of the working version, he found that some of that person's changes had in fact been necessary. Merging the code and retesting made the test stage take a full three times the planned time.

Poor Tom was even less lucky. As the primary code reviewer, he had to read all of the code. The quarrel between Robert and Pat led to a large number of code changes, which he had to review again, and the new problems from Jason's merge meant he had to help Jason as well.

The final result was that development cost 2.4 times the estimate and the release was far behind schedule. I am not joking: this is a true story that happened at that security software company.

Their technical manager told me that after they adopted a standardized development process and a parallel version control system, they felt development had reached a new level. A parallel version control system does not itself produce any code, but using one greatly improves development efficiency.

Version control is not a complicated concept. For most participants in development, a version control system helps keep a record of the development process. Because it stores files as they existed at different times, release engineers and code reviewers can easily narrow down where a problem was introduced, and programmers can collaborate more effectively. In general, a source code version control system provides the following basic functions:

- Save the different versions of any source file.
- Record who made each change and why.
- When two users change the same file, merge the changes where possible and give a warning when they cannot be merged.
- Show the differences between versions, fetch the latest version of the entire source tree, and allow rolling back to any saved version.
- Create code branches, which ease release and later maintenance (discussed further below); new code can be merged into these branches.
- Distinguish different versions of the source code, which eases later review.
- Access control: prevent unauthorized modification and viewing.

We know that technology is not the solution to every problem, but no one denies that large-scale mechanized production is more efficient than the handicraft workshop. Once applied, technology greatly improves our work and life. The functions above effectively solve most of the problems the TJRP development group faced. For example:

- Because code can be modified concurrently and each developer receives the others' changes, Pat and Robert could have communicated effectively much earlier.
- Every change must state its reason, and the committer is recorded.
- Testing and integration can start as early as possible, so that incompatibilities between modules are not first exposed at the final stage, where they delay the release.
- Changes by different developers are merged promptly, avoiding conflicts caused by inconsistency.
- Code review can be done on a particular code branch, so that development can continue while stable code is delivered to users.

There are further benefits as well, such as:

- Daily builds and tests let the project manager keep a better grasp of project progress.
- Who did what, and who did it well, is clearly reflected in the version control system.
- Access control avoids wholesale breakage caused by developers who do not understand the whole code base.
- A large amount of development experience is preserved in the version control system, which is an asset to the development team.

These improvements all reflect one important idea, namely:

- Communicate promptly to prevent problems from arising.
- Find problems early and solve them early.
- Make rewards and penalties clear to motivate developers.

Below, using a very common version control system, CVS [1], we introduce some basic principles for using a parallel version control system.

Code commit and synchronization: update and commit

Whenever we start a new change, we first fetch a fresh copy from the repository (done with update). After modifying the code locally and debugging it roughly, we commit the code back to the repository as soon as possible (done with commit). The basic update and commit flow is shown below:

Figure 1. CVS update and commit

These two operations cover about 80% of daily development work. In most cases this part of the work is quite simple, unless two developers modify the same file at the same time, for example when both have modified the same version of the same file. In that case a conflict arises:

Figure 2. Conflict in parallel development with CVS

Developer B may work faster, or his change may be simpler, so he commits first. When A and B fetched the code from the repository, the latest version was 1.1, so the version control system names B's version 1.2. Soon afterwards A wants to commit, but the version control system rejects his commit, because his change is based on repository version 1.1 and the latest version is now 1.2. CVS provides automatic merging, which combines A's changes with the latest code. If the two developers did not happen to modify the same lines, CVS "smartly" merges the two sets of changes. But what if both have modified the same line? CVS tells the developer who commits later and asks him to resolve the problem.
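
The merge rule just described can be illustrated with a minimal Python sketch. It is not how CVS is implemented: it assumes, for simplicity, that neither developer adds or removes lines, so the base version and both modified versions can be compared line by line. The function name `merge_lines`, the marker labels, and the sample file contents are all hypothetical.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge of equal-length line lists.

    base   -- the common revision both developers started from (e.g. 1.1)
    mine   -- the later committer's working copy
    theirs -- the revision already in the repository (e.g. 1.2)
    Returns (merged_lines, had_conflict).
    Assumption: lines are only changed in place, never inserted or deleted.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line changes")
    merged, had_conflict = [], False
    for b, m, t in zip(base, mine, theirs):
        if m == t:        # both sides agree (untouched, or changed identically)
            merged.append(m)
        elif m == b:      # only the repository side changed this line
            merged.append(t)
        elif t == b:      # only the working copy changed this line
            merged.append(m)
        else:             # both changed the same line differently: conflict
            had_conflict = True
            merged += ["<<<<<<< mine", m, "=======", t, ">>>>>>> theirs"]
    return merged, had_conflict


base = ["a = 1", "b = 2", "c = 3"]       # revision 1.1
dev_b = ["a = 10", "b = 2", "c = 3"]     # B changes line 1 and commits first (1.2)
dev_a = ["a = 1", "b = 2", "c = 30"]     # A changes line 3: merges cleanly
dev_a2 = ["a = 11", "b = 2", "c = 3"]    # A also changes line 1: conflict

clean, clean_conflict = merge_lines(base, dev_a, dev_b)
marked, marked_conflict = merge_lines(base, dev_a2, dev_b)
```

In the first case the two changes touch different lines and are combined silently; in the second, the conflicting line is wrapped in markers and left for developer A, the later committer, to resolve by hand.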

The conflicting regions are marked with <<< and >>> to make them easy to fix. Simply put, when a conflict occurs, the convention is that the later committer resolves it. Of course, he can choose to ignore the conflict, but such operations are recorded. Moreover, statistics show that two people changing the same line of code in different ways rarely happens in actual development. The modification process then continues as shown below:

Figure 3. Developer A resolves the conflict and commits

The problem of several developers modifying the same file at the same time can basically be solved in this way. To avoid the situation in the first place, you should, as far as the design allows, keep the code you are working on separate from other people's. Below is a very basic CVS update/commit convention:

Simple CVS operating conventions

- Update before modifying a file. The newer your working copy, the less work there is if a conflict occurs.
- Commit promptly. The smaller the difference between the local code and the repository, the easier the merge, and the sooner others can get the new version.
- Commit different functional units separately. On the one hand, this gets changes committed as early as possible and makes merging easier for others. On the other hand, because CVS can roll back to previous versions, if one functional change turns out to be wrong it is easy to back out just that change rather than the whole modification.
- Commit all code involved in one function in a single commit. Do not split the changes for one function across several commits, because this makes later tracking troublesome.
- Commit after debugging. This reduces the problems, and even conflicts, that others run into by synchronizing intermediate results.
- Write the commit log. CVS saves a commit log in which you can describe what the code does and what was changed. A clear commit log helps other developers understand a change without reading the code carefully, which greatly improves development efficiency. These logs are also very valuable to the developer and to the whole team.

Update and commit account for more than 80% of daily CVS operations. From the introduction above we can see that with just these two simple features, CVS can greatly improve the development process and make a software project more controllable. Generally speaking:

- All developers use the same central repository, which eliminates inconsistencies caused by copying files around.
- When a conflict occurs, the developer who commits later has to resolve it. He can find out who introduced the conflicting change, and can either resolve the conflict himself or discuss it with that developer.
- Testing can run throughout coding. Newly introduced problems can be tracked, located quickly, and resolved more effectively.
- Because there is a central repository, code reviewers and daily builders can promptly learn whether there is a problem, which helps the project manager keep development on schedule.
- It helps developers form rigorous work habits: the rules require them to commit correct code as far as possible, and every change must have a commit log.
- It helps establish a fairer assessment of work quality. CVS records each person's actual workload, including work done to fix problems, and the quality of their code. Managers can then offer better opportunities and pay to excellent developers, which helps raise team morale and developer enthusiasm.
- It promotes communication between developers. Although CVS cannot replace communication, the commit log and the ability to obtain the differences between any two versions help developers understand one another's thinking and improve together.
- It lowers the entry threshold for developers. Because CVS offers many convenient ways to cooperate, it shortens the break-in period needed for collaborative development. And because developers at different levels communicate through it, coding shifts from skill-intensive work toward proficiency-based work. This means that senior developers can spend more time on what they do best, while newcomers can quickly join daily development, which raises productivity and lowers development cost.

A communication link during coding: commit mail

CVS is an open tool that is easy to customize. The CVS server is usually set up on a UNIX host (we recommend FreeBSD), and CVS can perform additional functions through a scripting language such as Perl. Commit mail is the extension of the commit log into the mail system. Here is a typical commit mail, from the FreeBSD development team:

    phk         2003/10/21 23:32:20 PDT

      FreeBSD src repository

      Modified files:
        sys/geom             geom_io.c
      Log:
      Forgotten commit:
      If a provider has zero sectorsize, it is an indication of lack of media.

      Tripped up:     peter

      Revision  Changes    Path
      1.50      +3 -6      src/sys/geom/geom_io.c

This commit mail shows the developer (phk), the commit time (23:32:20 Pacific Daylight Time on October 21, 2003), the repository name (FreeBSD src repository), the modified file (geom_io.c in sys/geom), and the commit log. At the end, the mail also gives the new revision (1.50), the size of the change (+3 -6), and the full path of the file.

Implementing these features is not complicated. In fact, you only need to download FreeBSD's set of customized CVS repository scripts (no more than 40KB) and make a few adjustments to use these features directly (we will publish this material shortly). Making these customizations requires little more than basic knowledge of Perl and C/C++, which I do not think is a high bar for software developers.

Commit mail can be sent to all developers through a mailing list. Many large software companies and open source groups coordinate development this way.

Daily testing: keep up the daily build

In traditional software engineering, testing happens after integration. The rationale is that testing depends on code that is consistent, or that can at least be compiled and started, and this condition cannot be met before integration. With a version control system such as CVS, however, integration becomes an everyday activity during development. The code is consistent at almost every moment, and in many cases it is even usable, which creates very favorable conditions for testing. Many large development teams use one or more machines, called a Tinderbox, to perform daily builds and tests. A simple daily build process is as follows:

1. The test engineer or code reviewer checks out a snapshot of the code from the repository.
2. The code is compiled on the Tinderbox.
3. If any compilation error occurs, the test engineer or code reviewer rolls the code back to the last point that compiled successfully, and has the developers who committed after that point fix the problem.
4. The test engineer tests the code.

The first two steps are carried out automatically by scripts, with no human intervention. The compilation errors in step 3 do occur occasionally in software development (they may come from conflict merges, but because developers test locally they are not frequent). By convention, the code reviewer tracks and handles these errors and hands them to the relevant developers to fix.

The test engineer can deliver the compiled version to a test group, or even to users, for testing, and may release "snapshot" versions of the software. In fact, in many big companies daily builds are entirely routine. In an Internet Explorer version number such as 6.0.2600, the build number 2600 means that the code has gone through 2,600 daily builds (of course there were many changes along the way, and it cannot be ruled out that 2600 was chosen deliberately, but in any case they have done a considerable amount of daily build work).

In the later stages of development, as the schedule tightens, the daily build is likely to evolve into a continuous build, in which each commit triggers a build and every problem gets immediate feedback.

To support daily or continuous builds, the ideal approach is to drive the build with a makefile. On UNIX systems the make tool is usually pmake (BSD make) or gmake (GNU make); for Visual C++ it is NMAKE. Large development groups typically use a set of scripts to run all the make operations, while for small and medium-sized projects it is acceptable to build manually in an IDE such as VC++.
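
The four-step process above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the scripts any particular team uses: the checkout, compile, test, and notification steps are passed in as plain functions so that the control flow can be shown without a real repository, and the names `daily_build`, `BuildReport`, `compile_ok`, `tests_ok`, and `notify` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class BuildReport:
    good_snapshot: Optional[str]                      # newest snapshot that compiled
    broken: List[str] = field(default_factory=list)   # snapshots that failed to compile
    tests_passed: bool = False


def daily_build(snapshots: List[str],
                compile_ok: Callable[[str], bool],
                tests_ok: Callable[[str], bool],
                notify: Callable[[str], None]) -> BuildReport:
    """Walk snapshots from newest to oldest until one compiles, then test it."""
    report = BuildReport(good_snapshot=None)
    for snap in snapshots:                 # assumed to be ordered newest first
        if compile_ok(snap):               # steps 1-2: check out and compile
            report.good_snapshot = snap
            report.tests_passed = tests_ok(snap)   # step 4: run the tests
            return report
        report.broken.append(snap)         # step 3: fall back to an older point
        notify(snap)                       # and tell the committers involved
    return report


# Hypothetical run: today's snapshot does not compile, yesterday's does.
notified: List[str] = []
report = daily_build(
    ["2003-10-29", "2003-10-28", "2003-10-27"],
    compile_ok=lambda snap: snap != "2003-10-29",
    tests_ok=lambda snap: True,
    notify=notified.append,
)
```

In a real setup the callables would wrap the actual tools, for example a CVS checkout followed by make, and `notify` would send mail to the committers. Keeping them as parameters also makes it easy to switch from a nightly run to a build triggered by each commit.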

Basic daily build conventions are as follows:

Basic daily build and test considerations

- Avoid taking a snapshot while many developers are committing. Intermediate results extracted at such a time are likely to be broken and force the daily build to start over. Test engineers and code reviewers typically start the daily build after most developers have gone home, which is why the daily build is sometimes called a nightly build.
- Tag the versions that compile correctly in the daily build. This reduces the trouble in the next day's daily build.
- Promptly notify those who caused a problem, that is, the committers of the code involved. Some companies even require employees to keep their mobile phones and pagers on to make sure they can be reached.
- As a useful supplement to commit mail, many project teams create mailing lists to pass along related information. The daily test report is usually sent to everyone on the development team.
- Recording the problems that appear also benefits testing considerably: these problems are retested later to make sure they do not appear in the final release.

The daily build is indispensable. As an important means of daily testing, it effectively helps managers understand project progress, helps developers find problems early, and promotes communication within the team.

Effective version control: version tags and code branches

Suppose the "latest versions" of three files are 1.5, 1.3, and 1.4, but the latest version is not necessarily the one we need. For this case the version control system provides a very important mechanism: the version tag. For example, if we have confirmed that the combination of revisions 1.4, 1.3, and 1.4 of the three files runs correctly, we put the tag tag_1 on those revisions of the three files, as shown below:

Figure 4. The tag_1 tag is placed on different revisions of three files

Note that a tag can be moved.
This means that if a problem is found with the tagged version, the tag can be moved to a different revision. In practice, however, tags are often used together with another very important version control mechanism: the code branch. Before discussing the importance of tags further, let us look at code branches:

Figure 5. A more complex situation: a file in a project under development, for which releases 2.0 and 2.1 have been completed

In the figure, BP stands for branch point. The code branch is a key concept in version control. When development reaches a certain stage, a version can be delivered while the main developers put their energy into developing the newest version. Problems found in the first delivered version (2.0) are corrected, and new features introduced, in the RELENG_2 branch, and the company then decides to release version 2.1. After that, problems in 2.x continue to be corrected in RELENG_2, and some security updates are merged into the 2.1 release branch (RELENG_2_1).

Figure 5 shows version branches on a single file. The source code of a real software project consists of a large number of files. Although branches are, strictly speaking, per file, labeling the files with the same branch name expresses a collection of specific versions of a set of files.

CVS version numbering has a significant defect: the branch point of a large number of files (that is, the original version number from which a branch starts) is hard to specify in CVS. CVS supports extracting files by branch, by time, or by a particular version, but version numbers are not uniform across files; in large projects especially, some files have been changed many times and have very "high" version numbers. To work around this defect, in practice we combine version tags with branches: after creating a new branch, we add a version tag to the revisions of the files at the branch point.

For example, for version 2.0 of the software, two branches are created, RELENG_2 (2.x) and RELENG_2_0 (2.0), and the revisions of the files at that point are tagged RELENG_2_0_0_BP. Later, when comparing versions, we can use RELENG_2_0_0_BP to refer to that version. After the different branches have accumulated many changes, this tag greatly reduces the code reviewers' workload.

Note that code branches are not limited to versioning. Several different software products based on the same code can also be developed on branches, and in the end that code can be merged back together.

You may have noticed the list of versions on the far left: 1.1, 1.2, 1.3, 1.4, 1.5, 1.6. In CVS this sequence is called the main branch. Although it is not required, by convention the main branch is usually the active development branch. People keep introducing the latest features on this branch. Inevitably this may also introduce problems, which are subsequently fixed on the main branch. After a period of "settling," code can enter another line of code called the stable branch. This development model is often called the "multi-head" model. It is very common in open source development, for example the odd and even version numbers of Linux, and FreeBSD's -STABLE and -CURRENT [2]. It is also quite common in commercial software development, especially at large companies.

Multi-head development is especially important for large software, because large software is likely to contain a considerable number of modules, and with version control, problems can easily be tracked across the whole development team.
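
The role of tags and branch-point tags described in this section can be modeled with a small Python sketch. It is a toy model rather than CVS itself: a tag is simply a name for one chosen revision of each file. The class `TagTable`, the helper `changed_since`, and the file names are hypothetical; the revision numbers 1.4, 1.3, 1.4 and the tag names come from the examples in the text.

```python
class TagTable:
    """Toy model of symbolic tags: each tag maps file names to revision numbers."""

    def __init__(self):
        self._tags = {}

    def tag(self, name, revisions):
        """Attach tag `name` to one revision of each file; refuse to overwrite."""
        if name in self._tags:
            raise ValueError(f"tag {name} already exists; use move() to change it")
        self._tags[name] = dict(revisions)

    def move(self, name, filename, revision):
        """Move an existing tag to a different revision of one file."""
        self._tags[name][filename] = revision

    def resolve(self, name):
        """Return a copy of the file -> revision mapping that a tag stands for."""
        return dict(self._tags[name])


def changed_since(tags, branch_point, current_revisions):
    """List the files whose current revision differs from the branch-point tag."""
    base = tags.resolve(branch_point)
    return sorted(f for f, rev in current_revisions.items() if base.get(f) != rev)


tags = TagTable()
# A known-good combination of revisions (file names are made up).
tags.tag("tag_1", {"main.c": "1.4", "util.c": "1.3", "net.c": "1.4"})
# Branch-point tag placed when the release branch is created.
tags.tag("RELENG_2_0_0_BP", {"main.c": "1.5", "util.c": "1.3", "net.c": "1.4"})

# Later, only util.c has received a fix on the branch.
branch_now = {"main.c": "1.5", "util.c": "1.3.2.1", "net.c": "1.4"}
touched = changed_since(tags, "RELENG_2_0_0_BP", branch_now)
```

Because the branch-point tag names the exact starting revision of every file, a reviewer can ask what has changed on a branch without knowing each file's individual version number, which is the workload reduction the text describes.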

In a multi-head development environment, a developer can take part in developing or maintaining a branch after only a rough familiarity with it. This means that even if the maintainer of a branch suddenly leaves, others need not worry that the code of a different branch will be hard to understand. In other words, the requirements on new maintainers are lower, so development and maintenance can proceed in an orderly way. As far as I know, FreeBSD's software development process benefits greatly from the multi-head development model. The development model used by FreeBSD is briefly introduced below:

Case: the FreeBSD development model

FreeBSD has two main development branches, 4-STABLE and 5-CURRENT, as well as a number of security branches. 4-STABLE (the RELENG_4 branch) is the development of the FreeBSD 4.x series and focuses on system stability and performance. 5-CURRENT (the HEAD branch) represents the development of the FreeBSD 5.x series and focuses on bringing in the latest operating system features, new design ideas, and so on. In addition there are the security branches, which cover FreeBSD 2-STABLE, 3-STABLE, 4.6-RELEASE, 4.7-RELEASE, 4.8-RELEASE, and the upcoming 4.9-RELEASE. These branches take no new features at all; only security updates can be added to them.

The security branch is a very important concept in FreeBSD. In FreeBSD development, these branches are handled essentially by one role only, the FreeBSD Security Officer, which consists of a small number of developers (currently only two). Many users do not care whether the operating system has new features; they are unwilling to try new versions of software because their existing systems work very well. These users run a security branch of FreeBSD because it provides the necessary security updates while the behavior of the operating system does not change, regardless of whether a change would improve performance, provide some dazzling feature, or even support new hardware (the user's system is already in place, after all).

The CURRENT branch is the opposite extreme. Any new feature may be introduced into CURRENT once it passes the committer's own testing (large changes require the approval of the Core Team, but there are not many of these). Although CURRENT compiles correctly most of the time, introducing new features inevitably brings some problems, such as hardware compatibility.

Between these two extremes lies a middle route, the STABLE branch. Code committed to CURRENT is usually given an MFC (Merge From -CURRENT) time. After this time has passed, if no one has raised an objection to the code, it is introduced into STABLE. In this way, almost all of the code in STABLE has been tested for a fairly long time. For most users, STABLE is a good choice. In general, code in FreeBSD goes through the following history:

1. The code is introduced into the CURRENT branch.
2. The developers concerned get feedback from users; if it is confirmed that there is no problem, the code is introduced into the STABLE branch.
3. Most end users run their computers on STABLE branch code.

As we can see, this development model takes care of the interests of both developers and users. On the one hand, active development is not criticized for disrupting a large number of ordinary users. On the other hand, code from the development branch reaches STABLE after a period of time, so end users get the new operating system features. In fact, this development model has proven quite successful. Because quite a few people test new CURRENT and STABLE code during development, FreeBSD's development has stayed in good shape over the last few years.

Release engineering basics: feature freeze and code freeze

Many readers who have worked on large projects will have experienced, or at least heard of, the concepts of feature freeze and code freeze.

Feature freeze is an agreement among developers that, during this phase, no new features may be added. It usually begins when a release branch is about to be cut from a development branch. Features need to be frozen because adding new features is likely to introduce new problems, which places a heavy burden on code review and can even make the final delivery fail.

Of course, for small software projects that have an explicitly defined feature list (for example, under the traditional waterfall model), feature freeze has little significance, because in those projects the detailed design is complete before any code is written, and the code is written to what the detailed design clearly defines.

In real projects, however, the detailed design tends to contain two different kinds of requirements: "features that must be implemented" and "features we hope to implement." Before delivery, at feature freeze, every "feature we hope to implement" is classified as either "to be implemented" or "not to be implemented." Note that in this approach the decision on some features is postponed until close to delivery. At that point some are marked as to be implemented and others as not to be implemented, which turns the project into the situation we are familiar with, in which the detailed design document clearly defines every feature in the software. The result is that the software project has greater flexibility. Functional designs derived from customer requirements should obviously be listed as "features that must be implemented," while those that the development team believes can improve the software's overall extensibility, scalability, or other properties should be listed as "features we hope to implement." After the feature freeze, the whole team concentrates on making the features "to be implemented" more stable (even though some of them may not yet be formally implemented), which produces higher-quality software.

In my personal experience, feature freeze should happen when about two thirds of the planned coding time has been used. At that point the project manager should call the developers to a meeting to discuss the feature freeze, and after the freeze no developer should give further thought to the features listed as "not to be implemented."

Code freeze is a concept similar to feature freeze. At this stage, only fixes for errors in the frozen code branch are allowed; no other changes are permitted. In practice, during this period only the release engineers (usually one or more code reviewers with rich development experience) have the authority to review and approve code, and no change may be committed to the repository without a release engineer's approval. The code freeze should generally not last too long. For medium-sized projects it usually lasts one to two weeks; for large projects it may last a month or even longer. During this stage the release engineers are mainly responsible for code review, while the test engineers are responsible for promptly reporting problems exposed in concentrated testing and contacting the relevant developers to solve them.

Technically, code freeze can be implemented by changing the CVS configuration. A better way, however, is to make sure, as a matter of team rules, that the freeze is respected.

Release engineering: wrapping up the coding stage

Release engineering is, in a sense, directly related to the users' interests. The preceding sections covered a number of techniques for improving development efficiency through version control; here I turn to release engineering. The practices mentioned earlier, tagging the code appropriately during daily builds and adding version tags at the branch point when creating a version branch, are very important for release engineering. In a software project, release engineering is an important phase that combines code review with concentrated testing.

General release procedure

1. At a specific point, the release engineers and the other main project managers decide to announce the code freeze and the start of release engineering.
2. The release engineers concentrate on reviewing the code, and the test engineers organize large-scale concentrated testing.
3. During release engineering, the release engineers issue release candidates, and the test engineers install and test them.
4. The test engineers feed the problems back to the release engineers, who classify and organize them and have the responsible developers resolve them.
5. Finally, one release candidate is designated as the release version. With the code frozen, the software is delivered to the customer.

Some readers may have noticed that most of release engineering has in fact already been built into the development process described above: throughout the coding phase, code review, problem feedback, and testing go on continuously. In the release engineering phase, however, code review and testing move to a more central position. At this stage the main task of development is finding and eliminating errors, rather than pushing the software's functionality to a new level.

Summary: building on version control

We have introduced some ways of improving the coding and testing phases of software engineering by introducing a version control system. These methods come from my own experience and from what I have learned about actual projects in other ways. The article mainly discusses CVS (Concurrent Versions System), because CVS is easy to obtain (it is open source software) and fairly mature. In addition, since the article does not go deeply into the details of CVS itself, readers can easily apply this experience to other version control systems, such as BitKeeper, Perforce, ClearCase, and so on.

After introducing version control and daily builds, project managers and developers can clearly feel the following improvements: