So you want to be agile, do you? You want to work in small increments and continuously deliver business functionality. You want to embrace change, even if that means taking on new requirements late in the game. But wait, won’t that be dangerous? It doesn’t have to be if you’ve got a solid Continuous Integration (CI) infrastructure in place.
Agile Development and Friction
Before digging deeply into Continuous Integration, let me talk first about how agile developers want to work in this brave new world, and some of the potential problems along the way. Agile development means working iteratively and incrementally to build a system one feature at a time. The development team slices each project into iterations, but don’t think that this just means doing a series of mini-waterfalls, because in agile development the team does all the traditional phases of development and testing simultaneously. The goal of any given iteration is to complete a finite number of business requirements. Completion in this case means “done, done, done.” In other words, the development team has coded the feature, it passes all its acceptance tests, it’s signed off by quality assurance, and the customer has reviewed and approved the feature.
Think about this for a minute. You want to take a number of business requirements and drive them to a deployable state within a short time frame. To finish, you need to put the new code in front of testers and the actual customers. To do that, you need to pay attention to how your team uses testers during each iteration. You can’t just let testers sit idle for the first three quarters of the iteration, and then dump all the code onto them at one time. You’ll increase the project’s overall throughput if you can push some completed features to the testers as soon as possible to get them engaged early in the iteration.
The point of all this discussion is to emphasize that you need to stage code very frequently in the face of an evolving design and architecture, moving small bits of completed features from development to testers to demo servers frequently within the course of each iteration. You need to make change safe and risk free by building in lots of feedback loops that spot and diagnose problems. I like to think of agile development as a car engine that runs at high RPM. Friction that you might not feel in a sequential development cycle becomes painfully apparent in a more iterative process. Fortunately, you can use some software engineering practices that can enable iterative processes and lubricate the friction. Chief among these practices is Continuous Integration.
What Is Continuous Integration?
Roughly put, Continuous Integration is the practice of doing integrated builds very frequently, usually upon each check-in to source control. CI is first and foremost a development practice, but most teams will use a CI server tool such as CruiseControl.Net to execute their CI process. The tools are helpful, and maybe even essential, but the practice and the mindset are the important take-aways from this article. Forgetting for now the specific choice of CI software, the basic CI workflow looks something like this:
- Check in code to source control. (You are using source control, aren’t you?)
- The Continuous Integration software on the build server sees that there are updates to source control since the last build.
- The CI server retrieves the latest version of the code into a working folder on the build server.
- The CI server then runs an automated build script (more on this later).
- For a successful build, the CI server will create a tag in source control to version the build.
- Finally, the CI server will broadcast the success or failure of the automated build. This might be to a monitoring client on each development workstation, through instant messenger, or through email.
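The workflow above can be sketched as a single polling cycle. This is only an illustration, not how CruiseControl.Net or any other CI server is implemented: the function name `run_ci_cycle`, the `BuildResult` type, the `build-<revision>` tag format, and the injected callables are all assumptions made so the sketch stays independent of any particular source control system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BuildResult:
    revision: int
    succeeded: bool
    tag: Optional[str] = None


def run_ci_cycle(
    last_built: int,
    latest_revision: Callable[[], int],
    checkout: Callable[[int], None],
    run_build: Callable[[], bool],
    create_tag: Callable[[str], None],
    notify: Callable[[str], None],
) -> Optional[BuildResult]:
    """Run one polling cycle; return None when there is nothing new to build."""
    revision = latest_revision()
    if revision <= last_built:
        return None  # no check-ins since the last build

    checkout(revision)       # fresh copy into the working folder
    succeeded = run_build()  # the automated build script

    tag = None
    if succeeded:
        tag = f"build-{revision}"  # hypothetical tag naming scheme
        create_tag(tag)            # only successful builds get a tag

    status = "SUCCESS" if succeeded else "FAILURE"
    notify(f"Build for revision {revision}: {status}")  # broadcast either way
    return BuildResult(revision, succeeded, tag)
```

A real server would call something like this on a timer or from a commit hook, passing in functions that talk to the actual repository and notification channels.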
The build script must be fully automated and run without human intervention (you do want to go fast repeatedly, right?). The script needs to know pertinent folder locations and the resources that your system needs to run. Most build scripts will compile the latest version of the code, set up the development environment (registry settings, putting components in the GAC, setting up IIS virtual directories, you name it), and run the battery of unit tests in the code. Running the automated build should give you a quick indication whether there is anything wrong with the code. CI is all about getting feedback about the state of your code.
More advanced builds will include automated database builds, static code analysis, integration or acceptance tests, and automatic generation of deployment packages. Again, CI is all about moving quickly and safely by automating repeated deployment activities and building feedback loops around your code to spot problems quickly.
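To make the idea of a fully automated build concrete, here is a minimal sketch of a build expressed as an ordered list of named steps that stops at the first failure. The article's own builds use NAnt and MSBuild; the Python below, including the `run_build_steps` name and the step names in the usage note, is a hypothetical stand-in that only shows the control flow.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_build_steps(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; stop at the first failure and report what ran."""
    log: List[str] = []
    for name, action in steps:
        try:
            ok = action()
        except Exception as exc:  # a crashing step counts as a failed step
            log.append(f"{name}: ERROR ({exc})")
            return False, log
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            return False, log  # later steps never run against a bad build
    return True, log
```

Called with steps such as `("compile", ...)`, `("unit tests", ...)`, and `("package", ...)`, a failing unit test run means the packaging step is never attempted, and the log shows exactly where the build stopped.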
Editor’s Note: This article was first published in the May/June 2008 issue of CoDe Magazine, and is reprinted here by permission.
It Works on My Machine!
How many times has this happened to you in your career? You grab the latest code out of source control in the morning and immediately find out that it doesn’t work on your machine. You turn to the guy to your left and ask what’s going on. He shrugs and says, “I don’t know dude. It works on my machine!”
That’s friction. Your development team loses productivity anytime a developer or a tester can’t work on the code because of some kind of environmental problem. CI beats that friction by finding problems upon check-in instead of letting other developers or testers check out non-functioning code.
CI’s tenets include always having an automated, repeatable build script and a single repository for the system. The end goal is to be able to walk over to a fresh machine, download all the code and dependencies from source control, run the build, and be completely ready to work.
To pull this off, you first need to put as many of the third-party components that your system relies upon in source control with your code. That adds the benefit of being able to tie a version of your code to its dependencies. Don’t depend upon GAC installations of components because the next developers or testers might not have that software installed on their machines. The build script does need to be portable to other machines, so be careful with absolute paths and assumptions about the machine that the script is running on. The “portability” issue is largely enforced by running the build independently on a clean build server.
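One way to avoid absolute paths is to have the build locate the project root at run time and resolve everything else relative to it. The sketch below assumes a marker file named `project.build` at the root and a `lib` folder for checked-in third-party components; both names are hypothetical.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "project.build") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    current = start.resolve()
    for candidate in [current, *current.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker} found at or above {start}")


def third_party_dir(root: Path) -> Path:
    """Third-party components live in source control next to the code."""
    return root / "lib"
```

Because nothing here depends on a drive letter or a user profile folder, the same script should behave the same way on a developer workstation and on a clean build server.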
Your build script should completely lay down the development environment. Besides compiling code and running unit tests, the build should make environment changes such as setting up virtual directories, installing Windows services, registering COM components, and anything else it takes to run the code. The build script should include enough integration, smoke, and environment testing to ensure that the development environment is ready to function.
Several years ago I worked on a project where we struggled with the security architecture. Frankly, the project had a lot of churn, and each change required a different IIS configuration on developer workstations to function. I checked in some code one day that required a different IIS setting, and it took me the better part of the next day to get the other developers running again. If I’d instead just added the IIS configuration setup to the NAnt build script, and tested that configuration as part of the CI build, I could have saved my team a lot of lost productivity.
Essentially, you want to make the build script the authoritative description of the system architecture. If the build script really can configure the environment, it becomes a very useful form of documentation that does not get out of synch with the code. Think about how many times in your career you’ve started a new project with a legacy code base, only to spend the first couple days just trying to make the code run on your workstation. Documentation is always helpful, but you can simplify life for the next guy by using CI with an automated build.
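If the build script is the authoritative description of the environment, it can also report where a machine has drifted from that description. The following is a minimal sketch under that assumption; the function name `environment_drift` and the setting names in the usage note are invented for illustration and do not correspond to real IIS or NAnt settings.

```python
from typing import Dict, List


def environment_drift(desired: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """List every declared setting that is missing or different on this machine."""
    problems: List[str] = []
    for key in sorted(desired):
        if key not in actual:
            problems.append(f"{key}: missing (expected {desired[key]!r})")
        elif actual[key] != desired[key]:
            problems.append(f"{key}: found {actual[key]!r}, expected {desired[key]!r}")
    return problems
```

For example, with `desired = {"auth_mode": "windows", "vdir": "/app"}` and `actual = {"auth_mode": "anonymous"}`, the function reports both the wrong authentication mode and the missing virtual directory. An empty list means the machine matches what the build declares.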
In the course of a development project you move code around frequently. You send new changes and bug fixes to testers. You set up demonstrations for the business to get feedback. If you’re relying on manual processes to migrate code you might have a lot of waste in your development process. Consider these scenarios and some suggested solutions with Continuous Integration:
- It takes too long to deploy new versions of the code. I’ve been in environments that took the better part of a day to make a new deployment to the testing server. It might take days to fix a small handful of bugs. Making an automated push to the testing server part of your Continuous Integration infrastructure can help. The proverbial five-minute bug fix (that, frustratingly, often took a full day) could be cycled in a matter of minutes. When you can cycle code much faster, it opens up new opportunities to work in a more incremental fashion. Besides, copying files around is beneath the dignity of a human being. Make the computer do the grunt work for you.
- Someone made a mistake in the code deployment to testing. If your team deploys your code manually, you will sooner or later make a mistake. I’ve wasted too much time in my career trying to fix bugs that were really caused by a botched testing deployment. If you fully automate the build and include some level of automated environment testing, you can stop this problem in its tracks. You’ll know when the test environment is nonfunctional before the testers file bugs.
- Be sure you have the right build. One source of false-alarm bugs is a tester working against an older version of the code that doesn’t have the bug fix or new feature being tested. This problem can be exacerbated in a very iterative environment because there are so many more code drops to the testers. There’s a relatively easy answer: version your code. All of the CI server tools include some way to get the current build number. When you as a developer report a bug fixed or a feature complete, always include the first build number in which the new code appears. Give testers and business folks a quick way to determine the current build number in the system. By communicating and tracking the version number, you can drastically cut down on version mismatch problems between the developers and the rest of the team.
The bottom line is that everything that can change repeatedly needs to be automated.
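The build-number advice in the last scenario reduces to a simple comparison. This sketch assumes build numbers are integers that only increase along a single line of development; the function names and the bug identifier format are hypothetical.

```python
def contains_fix(deployed_build: int, fixed_in_build: int) -> bool:
    """A fix is present in the first build that includes it and in every later one."""
    return deployed_build >= fixed_in_build


def triage_report(bug_id: str, deployed_build: int, fixed_in_build: int) -> str:
    """Decide whether a reopened bug is real or just a version mismatch."""
    if contains_fix(deployed_build, fixed_in_build):
        return f"{bug_id}: build {deployed_build} includes the fix, so investigate"
    return (
        f"{bug_id}: build {deployed_build} predates the fix "
        f"(first in build {fixed_in_build}), so retest on a newer build"
    )
```

With branches or parallel release lines the comparison would need more than one number, which is outside what this sketch covers.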
Continuous Integration Is an Attitude
Something to keep in mind is that CI is as much a mindset as it is a practice or set of tools. Having a CI server isn’t terribly useful unless the developers religiously follow the “check-in dance.” When you reach a stopping point in your code and you’re ready to check in changes to your source control server, you need to consistently follow a series of steps like this:
- Refresh your version of the code with the latest changes from the source control server.
- Merge any conflicts between the latest version and your code.
- Run the developer build locally on your own machine to ensure that your code compiles and passes all unit tests. Depending on how long the build takes, you may also include integration tests, automated acceptance tests, and some sort of static code analysis or test coverage. Fix any problems that the build detects before checking in.
- Check in your code.
- Monitor the build on the server before you continue with other coding. Do not change your local copy of the code until the build succeeds or fails.
- If the build breaks, just stop and fix it. Now. Tell the rest of the team that you’re working on the build. No “Broken Windows!” The minute that you stop taking build failures seriously is the point that CI stops being a useful practice to your team.
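The check-in dance above can be sketched as a sequence of gates, where the commit only happens once the merged code builds locally. The function name and the injected callables are assumptions for illustration; in practice each would wrap your source control client and build script.

```python
from typing import Callable, List


def check_in_dance(
    update: Callable[[], None],
    has_conflicts: Callable[[], bool],
    local_build: Callable[[], bool],
    commit: Callable[[], None],
    server_build: Callable[[], bool],
) -> List[str]:
    """Walk the check-in steps in order and record how far we got."""
    done: List[str] = []
    update()  # refresh from source control first
    done.append("updated")
    if has_conflicts():
        done.append("stopped: resolve merge conflicts, then start over")
        return done
    if not local_build():
        done.append("stopped: fix the local build before checking in")
        return done
    done.append("local build passed")
    commit()  # only reached when the merged code builds locally
    done.append("checked in")
    if server_build():
        done.append("server build passed")
    else:
        done.append("server build broken: fix it now and tell the team")
    return done
```

The ordering is the point of the sketch: nothing reaches the shared repository until it has been merged with everyone else's changes and built on the developer's own machine.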
The exact steps you take will vary depending on the project and the tooling that you use for source control and build automation. I use a mostly open source stack: Subversion for source control, CruiseControl.Net for Continuous Integration, and NAnt plus MSBuild for automated builds. Some newer CI tools such as JetBrains’ TeamCity allow you to combine the local build and check-in steps above and simply reverse code changes if the server build fails. For more complete out-of-the-box solutions, be sure to check out CI Factory. It is possible to do Continuous Integration with VSTS, but I’ll leave that up to you and Google.
Some other suggestions for using Continuous Integration are:
- Check in as often as you can. Try to work in smaller steps to reach stopping points more often. It’s all about the feedback cycle. You want to find problems being introduced into the codebase faster, and granular check-ins give you much better traceability from a detected problem to the code that caused the problem. Working and integrating in smaller steps and shorter timeframes can do wonders to reduce the pain of merging code changes. Put another way, stale code is the typical cause of merge conflict difficulties. The frequency of check-ins is even more important for bigger teams with more parallel work streams.
- Update local code frequently. If you’re forced to work for more than a couple hours in between check-ins, be sure to frequently update your local copy of the code with the latest changes from the source control server.
- Don’t check code into a broken build. If you see that the CI build is broken by someone else’s check-in, you certainly don’t want to check code in, because it impairs the team’s ability to diagnose build failures. Don’t check code out from a broken build, either. If the build is “red,” there’s something wrong with the trunk code and it could conceivably leave you unable to continue working. Let your team member get the build fixed first before you check in or grab the latest copy.
- Try not to leave the build broken overnight. Doing that makes it hard for the people coming in early in the morning to work. Try to get code checked in before you go home at night with plenty of time to spare. The build “knows” when you’re in a hurry to go home. Checking code in at 4:55 fails magically, every single time.
- Know the build server configuration. Every member of the development team must be able to run and diagnose the build themselves. I strongly suggest that at least two members of the team be familiar with the build server configuration.
Of all of the practices originating from Extreme Programming, I think CI delivers the highest ratio of reward to effort. Continuous Integration is a foundational component that enables any kind of rapidly iterative development process, and can add value to any standard development process. It provides a fast feedback mechanism for spotting code base problems, and reduces the mechanical cost of moving code along the development manufacturing line. A solid commitment to CI can give you more control and traceability over your code without resorting to inefficient manual processes. All told, CI is a great way to reduce little sources of friction in a development project. See the “Further Reading” sidebar for some useful resources.