By Martin Fowler & Matthew Foemmel. Translated by Transparent.
The copyright of the English original is owned by Martin Fowler.
Original link: http://martinfowler.com/articles/continuousIntegration.html

Martin Fowler
Chief Scientist, ThoughtWorks

Translator's note: On January 23, 2002, we had the honor of hearing Mr. Martin Fowler in an online discussion hosted by UMLChina. During the discussion, Martin Fowler recommended this article, "Continuous Integration", to all Chinese software developers. On first reading I felt its weight, and Lin Xing of AgileChina also praised it: "The idea is very good. A master is a master." So I spent a week translating the article for readers. Since this is Mr. Fowler's gift to all Chinese software developers, I would not dare keep it to myself. Anyone may reprint this article anywhere, but please keep it intact when you do, including the title, copyright statement, original link, and translator's note. In short, please make no changes or additions. If you can also send me an e-mail when you reprint it, I will be all the happier. Now, please enjoy this wonderful article.

An important part of any software development process is getting reliable builds of the software. Despite the importance of the build, we are still often caught out by build failures. In this article we discuss the process that Matt (Matthew Foemmel) put in place on a major project at ThoughtWorks, a process that is increasingly valued across our company. It emphasizes a fully automated, reproducible build, including automated tests, that runs many times a day. It allows developers to integrate every day, thereby reducing integration problems.

ThoughtWorks has open-sourced CruiseControl, a tool for automating continuous integration.
We also offer consulting on CruiseControl, Ant, and continuous integration. If you need more information, please contact Josh Mackenzie (Jmackenz@thoughtworks.com).

This article covers the following topics:

- The benefits of continuous integration
- The more frequent the integration, the better
- What is a successful build?
- Single source point
- Automated build scripts
- Self-testing code
- The master build
- Checking in
- Summary

In the field of software development there are many so-called "best practices" that are often talked about but seem to be rarely put into practice. One of the most basic and most valuable of them is a fully automated build and test process that lets the development team build and test their software many times a day. The "daily build" is also an idea people often discuss: McConnell lists the daily build as a best practice in his book "Rapid Development", and it is a well-known feature of Microsoft's development approach. However, we agree with the XP community that a daily build is only the minimum. A fully automated process that lets you build several times a day is entirely worth the effort.
Here we use the term "continuous integration", which comes from one of the practices of XP (Extreme Programming). We believe, though, that the practice has been around for a long time and is used by many people who would never consider XP. We have been using XP as a touchstone for our software development process, and it has had a profound influence on our terminology and practices. Even so, you can use continuous integration without using any other part of XP. In fact, we believe continuous integration is an essential part of any competent software development activity.

Making an automated daily build work requires the following:

- Keep all source code in a single place where everyone can get the latest sources (and previous versions).
- Automate the build process so that anyone can build the system from the sources with a single command.
- Automate the testing so that anyone can run a full suite of system tests with a single command.
- Make sure everyone can get the latest, best executable.

All of this has to be backed by discipline. We have found that introducing this discipline to a project takes considerable energy. We have also found that once it is established, keeping it running does not take much effort.

The benefits of continuous integration

One of the hardest things to convey about continuous integration is that it fundamentally changes the whole pattern of development. If you have never worked in an environment that practices it, that pattern is hard to picture. In fact, most people experience this atmosphere when working alone, because then they only have to integrate with themselves. For many people, "team development" brings to mind problems that seem to come with the territory of software engineering. Continuous integration reduces those problems in exchange for a certain amount of discipline.
The most fundamental benefit of continuous integration is that it does away with bug-hunting sessions, the sessions developers used to need because one person's work had stepped on someone else's code without either of them realizing what had happened. Such bugs are the hardest to track down, because the problem lies not in one person's area but in the interaction between two people's work. The problem gets worse with time. Typically, a bug that surfaces in the integration phase was introduced weeks or even months earlier. As a result, developers spend a great deal of time and effort during integration hunting for the root of these bugs.

With continuous integration, the vast majority of such bugs show up on the same day they are introduced. And because not much changes in a day, the location of the error can be found quickly. If you cannot find the bug, you can keep the offending code out of the product. So in the worst case you simply do not add the feature that brought the bug with it. (Of course, you may want the new feature more than you hate the bug, but at least you have the choice.)

Continuous integration still cannot guarantee that you catch every integration bug. The technique depends on testing, and as we all know, testing cannot prove that all errors have been found. The key point is that continuous integration catches enough bugs, early enough, to repay its cost.

So continuous integration reduces the time spent chasing bugs in the integration phase and thereby improves productivity. Although we do not know of anyone who has studied the practice scientifically, as a practical approach it is clearly quite effective.
Continuous integration can drastically cut the time spent in "integration hell"; in fact, it can turn hell into a piece of cake.

The more frequent the integration, the better

At the heart of continuous integration is a basic point: integrating often is better than integrating rarely. To those who practice continuous integration this is natural, but to those who have never practiced it, it contradicts intuition. If you integrate infrequently (less than once a day), integration is painful and costs a great deal of time and effort. We often hear people say, "On a large project you can't do a daily build." That is actually a foolish view, and many projects do practice continuous integration. On a project of about 50,000 lines of code, we integrate more than twenty times a day. Microsoft still does daily builds on projects with tens of millions of lines of code.

Continuous integration is feasible because the effort of integration is proportional to the square of the interval between two integrations. Although we have no measured data, we estimate that integrating once a week takes not 5 times the effort of integrating every day but roughly 25 times. So if integration is painful for you, that probably means you should integrate more often. Done properly, more frequent integration should reduce the pain and save you a lot of time.

The key to continuous integration is automation. Most of integration can be done automatically: getting the sources, compiling, linking, and testing can all be automated. At the end you should get one simple piece of information telling you whether the build succeeded: yes or no. If it succeeded, the integration is done; if it failed, you should be able to undo the last change and go back to the previous successful build. Nowhere in the build process should you need to think.
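The idea of an automated sequence that ends in a single yes or no can be sketched as follows. This is only an illustrative Python sketch, not the build tooling the article describes: the names run_build, BuildResult and the placeholder step actions are all assumptions made for the example. A real build would replace the placeholders with calls to the version control client, the compiler and the test runner.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
# All names in this sketch are hypothetical.
Step = Tuple[str, Callable[[], bool]]


@dataclass
class BuildResult:
    succeeded: bool    # the single yes/no answer
    failed_step: str   # empty when the build succeeded


def run_build(steps: List[Step]) -> BuildResult:
    """Run each step in order and stop at the first failure."""
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # a crashing step counts as a failed build
        if not ok:
            return BuildResult(False, name)
    return BuildResult(True, "")


if __name__ == "__main__":
    # Placeholder actions standing in for the real tools.
    steps = [
        ("get sources", lambda: True),
        ("compile", lambda: True),
        ("link", lambda: True),
        ("test", lambda: True),
    ]
    result = run_build(steps)
    print("yes" if result.succeeded else "no: " + result.failed_step)
```

Stopping at the first failing step keeps the answer unambiguous: later steps never run against a broken earlier stage, and the name of the failed step points to where to look.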
With an automated process like this, you can build as often as you like. The only limit is the time the build itself takes. (Translator's note: compared with the time needed to hunt bugs, this time is negligible.)

What is a successful build?

One important thing to decide is what counts as a successful build. It looks simple, but such simple things sometimes turn into a mess, so it deserves attention. Martin Fowler once reviewed a project. He asked whether the project did a daily build and got a definite yes. Fortunately Ron Jeffries was also there, and he asked a further question: "What do you do about build errors?" The answer was: "We send an e-mail." In fact, the project had not had a successful build for months. That is not a daily build; it is only a daily attempt at a build.

We are quite confident in the following standard for a "successful build":

- All the latest source code is checked out of the configuration management system.
- Every file is compiled from scratch.
- The resulting object files (Java class files in our case) are linked into executable files.
- The system is started and the system test suite (about 150 test classes in our case) is run.
- If every step completes without error and without human intervention, and every test passes, we have a successful build.

Most people think "compile + link = build".
At the very least, we believe a build should also include starting the application and running some simple tests against it (McConnell calls this a "smoke test": switch the software on and see whether it "smokes"). A more thorough set of tests greatly increases the value of continuous integration, so we prefer to run more thorough tests.

Single source point

To integrate daily, every developer needs to be able to get all the latest source code easily. In the past, if we wanted to integrate, we had to go around the whole development center asking each programmer for new code, copy that code, and then work out where it should go. Nothing is worse than that.

The standard is simple: anyone should be able to take a clean machine, connect it to the LAN, and with a single command get every source file and start building the system.

The simplest solution is to use a configuration management (source control) system as the source of all code. Configuration management systems usually work over a network and come with tools that let developers obtain the sources easily. They also provide version management, so you can easily find earlier versions of a file. Cost is not an issue either: CVS is an excellent open-source configuration management tool.

All source files should be kept in the configuration management system. "All" here often covers more than people think: it includes build scripts, properties files, database schema DDL, install scripts, and anything else needed to build on a clean machine. We have often seen the code kept under control while some other vital file cannot be found.

Try to make sure everything is stored in a single source tree in the configuration management system. Sometimes people use separate projects in the configuration management system for different components.
The trouble with this is that people have to remember which versions of which components work with which versions of the others. In some cases you do have to separate the sources, but this happens much less often than you might think. You can build multiple components from a single source tree, and such issues can be handled by the build scripts rather than by changing the storage structure.

Automated build scripts

If you are writing a small program with only a dozen or so files, building the application may take just one command: javac *.java. Larger projects need more: your files may be spread over many directories, and you need to make sure the object code ends up in the right place; besides compilation there may be a link step; you may generate code from other files, which has to happen before compiling; and the tests need to run automatically.

A large build often takes a while, and if you have made only a small change you will not want to repeat all of these steps. A good build tool works out automatically what needs to be rebuilt. The common approach is to compare the dates of the source file and the object file and recompile only when the source file's modification date is later. Dependencies between files then need some care: if one object file changes, the object files that depend on it must also be rebuilt. The compiler may handle this for you, or it may not.

Depending on your needs, you can choose different kinds of build: you can build the system with or without test code, or with different sets of tests, and some components can be built on their own. A build script should let you choose different build targets for different situations.

Once you go beyond typing a single command, it is usually a script that takes on this load. You might use shell scripts or a more sophisticated scripting language such as Perl or Python.
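As a rough illustration of the date comparison described above, here is a minimal Python sketch. The function names needs_rebuild and stale_targets are assumptions made for this example and do not come from any build tool. The sketch covers only the timestamp check; it does not track dependencies between object files, which a real build tool also has to handle.

```python
import os


def needs_rebuild(source_path: str, target_path: str) -> bool:
    """Return True when the target is missing or older than its source."""
    if not os.path.exists(target_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(target_path)


def stale_targets(pairs):
    """Keep only the (source, target) pairs that must be rebuilt."""
    return [(src, tgt) for src, tgt in pairs if needs_rebuild(src, tgt)]
```

A caller would pass pairs such as ("src/Order.java", "classes/Order.class") and recompile only the sources that come back from stale_targets.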
Before long, though, you will find that an environment designed specifically for building is useful, such as the make tools on UNIX.

In our Java development we soon found that we needed a more sophisticated solution. Matt spent a considerable amount of time developing a build tool for enterprise Java development called Jinx. Recently, however, we have switched to the open-source build tool Ant (http://jakarta.apache.org/ant/index.html). Ant's design is very similar to that of Jinx, and it also supports compiling Java files and packaging them into JARs. It is also easy to write extensions to Ant, which lets us carry out more tasks during the build.

Many people use an IDE, and most IDEs include some form of build management. However, the project files involved are tied to a specific IDE, are often fragile, and only work inside the IDE. IDE users can set up their own project files and use them for their individual development, but our master build is built with Ant and runs on a server using Ant.

Self-testing code

Getting a program to compile is far from enough. Although the compiler of a strongly typed language can point out many problems, a program that compiles successfully may still contain many errors. To help track these down, we put great emphasis on automated testing, another practice advocated by XP.

XP divides tests into two categories: unit tests and acceptance tests (also called functional tests). Unit tests are written by developers and usually test a single class or a small group of classes. Acceptance tests are usually written by the customer or an outside test group, with the developers' help, and test the whole system end to end. We use both kinds of test and automate them as far as we can.

As part of the build, we run a set of tests called the "BVT" (Build Verification Tests).
All tests in the BVT must pass before we can declare a successful build. All the XP-style unit tests belong to the BVT. Since this article is about the build process, "tests" here mostly means the BVT. Remember that there is a second line of testing besides the BVT (translator's note: this refers to functional tests), so do not confuse the BVT with the overall testing and QA effort. In fact, our QA group never sees code that has not passed the BVT, because they only test successful builds.

The basic principle is that developers write the corresponding tests as they write the code. When a task is finished, they check in not only the production code but also the tests for that code. This is close to XP's "test first" style of programming: you should not write any code until you have written the corresponding test and seen it fail. So if you want to add a new feature to the system, you first write a test that can only pass once the feature is implemented, and then your job is to make that test pass.

We write these tests in Java, the same language we develop in, so writing tests is not much different from writing code. We use JUnit (http://www.junit.org/) as the framework for organizing and writing tests. JUnit is a simple framework that lets us write tests quickly, organize them into suites, and run the suites interactively or in batch mode. (JUnit is the Java member of the xUnit family, which includes test frameworks for almost every language.)

While writing software, developers usually run a subset of the unit tests after each compile. This actually makes developers more productive, because the unit tests help find logic errors in the code. You then do not need to debug to find the error; you only need to look at the code changed since the tests last ran. That change should be small, so the bug is easy to find.

Not everyone follows XP's "test first" style strictly, but the benefit of writing the tests at the same time as the code is clear. The tests make each individual more productive, and the BVT built from them catches more errors in the system. Because the BVT runs several times a day, any problem it detects is relatively easy to fix. The reason is simple: only a small amount of code has changed, so we can look for the bug within that range. Hunting for an error in the changed code is of course more effective than stepping through the whole system.

Of course, you cannot expect tests to find every problem. As people often say, tests cannot prove that a system has no errors. But perfection is not the only thing we are after. Imperfect tests that are run frequently are much better than "perfect tests" that never get written.

A related issue is that developers write tests for their own code. We often hear that developers should not test their own code because they easily overlook mistakes in their own work. Although this is true, self-testing requires tests to be turned around into the code base quickly. The value of that fast turnaround exceeds the value of independent testers. So we still build the BVT from tests written by developers, but we also have acceptance tests that are written independently.

Another important part of self-testing is that it improves the quality of the tests through feedback, a key value in XP. The feedback here comes from bugs that escape the BVT.
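To make the habit of writing tests alongside the code more concrete, here is a small sketch. The article's own tests are written in Java with JUnit; this example uses Python's standard unittest module, another member of the xUnit family, purely for illustration. The function order_total and the test names are hypothetical and are not taken from the project described here.

```python
import unittest


def order_total(prices, discount_percent=0):
    """Hypothetical production code: sum the prices, then apply a discount."""
    subtotal = sum(prices)
    return subtotal * (100 - discount_percent) / 100


class OrderTotalTest(unittest.TestCase):
    """Tests that would be checked in together with order_total."""

    def test_sums_prices(self):
        self.assertEqual(order_total([10, 20, 30]), 60)

    def test_applies_discount(self):
        self.assertEqual(order_total([100, 100], discount_percent=25), 150)


def bvt_suite():
    """Collect the tests into one suite that can be run in batch mode."""
    return unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTest)


if __name__ == "__main__":
    result = unittest.TextTestRunner().run(bvt_suite())
    # A non-zero exit status lets a build script treat failures as "no".
    raise SystemExit(0 if result.wasSuccessful() else 1)
```

In the test-first style, test_applies_discount would be written before the discount logic existed and seen to fail; the task is finished when it passes.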
The rule of self-testing is: you may not fix a bug until you have added a corresponding test to the BVT. That way, every time you fix a bug you also add a test that makes sure the BVT will not let the same mistake through again. This test should also prompt you to think of other tests and to write more of them to strengthen the BVT.

The master build

Build automation is very useful for individual developers, but where it really shines is in producing a master build for the whole team. We have found that a master build process brings the whole team together and lets them find integration problems early.

The first step is to choose a machine to run the master build. We chose a computer called "trebuchet" (we play a lot of "Age of Empires"), a four-CPU server that is well suited to being dedicated to builds. (A full build takes a long time, so this horsepower is needed.)

The build process runs in a Java class that is always running. When there is no build to do, the build process waits in a loop, checking the code repository every few minutes. If nobody has checked in any code since the last build, the process keeps waiting. If there is new code in the repository, it starts a build.

The first stage of the build is to check out the code from the repository. StarTeam provides us with a very good Java API, so hooking into the repository was easy. The daemon looks at the repository to see whether anyone has checked in code during the last five minutes. If so, the daemon waits before checking out (so as not to check out while someone is in the middle of checking in). The daemon checks out all the code into a directory on trebuchet. When the checkout is complete, the daemon invokes the Ant script in that directory.
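The polling logic described above can be sketched roughly as follows. The daemon in the article is a Java class that talks to StarTeam through its Java API; this Python sketch is only an illustration, and should_build, run_daemon, get_last_checkin and the other parameter names are hypothetical. The repository query, the build, the clock and the sleep call are passed in as functions so that the decision logic can be exercised without a real repository.

```python
import time
from typing import Callable, Optional

QUIET_PERIOD_SECONDS = 5 * 60  # the five-minute window from the text


def should_build(last_checkin: Optional[float],
                 last_build: Optional[float],
                 now: float,
                 quiet_period: float = QUIET_PERIOD_SECONDS) -> bool:
    """Decide whether the daemon may start a checkout and build now."""
    if last_checkin is None:
        return False  # nothing has been checked in yet
    if last_build is not None and last_checkin <= last_build:
        return False  # nothing new since the last build
    # Someone may still be in the middle of a check-in, so wait until
    # the repository has been quiet for the whole period.
    return now - last_checkin >= quiet_period


def run_daemon(get_last_checkin: Callable[[], Optional[float]],
               build: Callable[[], None],
               clock: Callable[[], float] = time.time,
               sleep: Callable[[float], None] = time.sleep,
               poll_seconds: float = 60.0,
               max_polls: Optional[int] = None) -> int:
    """Poll the repository, build when it is safe, return the build count."""
    last_build = None
    builds = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if should_build(get_last_checkin(), last_build, clock()):
            last_build = clock()  # record when this checkout started
            build()
            builds += 1
        else:
            sleep(poll_seconds)
    return builds
```

Recording the time at which the checkout starts, rather than the time the build ends, means that code checked in while a build is running still counts as new on the next pass through the loop.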
Ant then takes over the whole build and does a full build of all the source code. The Ant script handles the entire compilation and packages the resulting class files into six JAR files, which are deployed to the EJB server.

When Ant has finished compiling and deploying, the build daemon starts the EJB server with the new JARs and runs the BVT test suite. If all the tests run successfully, we have a successful build. The build daemon then goes back to StarTeam and labels all the checked-out source code with a build number. Next, the daemon checks whether anyone checked in code during the build. If so, it starts another build; if not, the daemon returns to its loop and waits for the next check-in.

At the end of the build, the build daemon sends an e-mail reporting the build status to every developer who checked in code for that build. We generally consider it bad form to leave after checking in code before that e-mail has arrived.

The daemon writes all of its steps to a log file in XML format. A servlet running on trebuchet lets anyone inspect the log to see the state of the build (see Figure 1).

The screen shows whether a build is currently running and when it started. The history of all builds, successful and failed, is listed on the left. Clicking on an entry shows the details of that build: whether it compiled, the results of the tests, what changes were made, and so on. We have found that many developers look at this page often, because it shows them the direction the project is taking and how things change as people keep checking in code. We may sometimes put other project news on this page, but only in moderation.

It is important that developers can simulate the master build on their own machines.
That way, if an integration error occurs, developers can investigate and debug it on their own machine without tying up the master build process. Developers can also run the build locally before checking in, which reduces the chance of the master build failing.

There is an important question here: should the master build be a clean build (starting from the sources alone) or an incremental build? An incremental build is much faster, but it also increases the risk of introducing errors because some parts are not recompiled. It also carries the risk that a build cannot be recreated. Our build is fairly quick (about 15 minutes for 200,000 lines of code), so we are happy to do a clean build every time. Some teams, however, prefer incremental builds most of the time, with a regular clean build (at least once a day) in case those odd problems crop up.

Figure 1: The servlet running on trebuchet

Checking in

Using an automated build means that developers should follow a certain rhythm as they develop software, and the most important part of it is that they integrate regularly. We have seen organizations that do a daily build but whose developers do not check in frequently. If developers only check in every few weeks, what is the point of a daily build? The principle we follow is that every developer checks in at least once a day.

Before starting a new task, developers should first synchronize with the configuration management system, that is, update the source code on their local machine. Writing code on top of out-of-date sources only leads to trouble and confusion. The developer then works on the task, updating files as needed.