PMD Squashes Code Bugs

ode reviews are widely recognized as one of the most effective ways to detect bugs, performance issues, and coding errors of all sorts. And by promoting discussion and knowledge sharing, code reviews also have the rare ability to improve both the quality of your code and the skill of your developers.A code review involves presenting and discussing the code for a project. The aim is to spot potential problems and check that the software architecture and design rules are being applied correctly. Reviewers generally use a checklist of common errors and applicable coding standards and best practices to ensure that they examine the code thoroughly.

The following are the different types of code review, which vary mostly in degree of formality:

  1. A code walkthrough simply involves developers meeting to review code.
  2. Code reading is a little more formal, with reviewers independently examining a designated block of source code.
  3. Code inspection, the most formal of all code reviewing techniques, was introduced by Michel Fagan in 1976. Code inspections follow a rigorous formal process where trained participants are assigned predefined roles, such as moderator, scribe, author, and reviewer.

Code reviews can also be done individually, either in preparation for a peer review or just as part of good programming practices. Indeed, the Software Engineering Institute recommends personal code reviews as a best practice of its Personal Software Process. This involves simply reading through your code and using a review checklist to look for errors.

The efficiency of code reviews has been well studied and well documented. IBM reported detecting 80-90 percent of defects by using formal code inspection techniques, as well as overall schedule savings of 10-30 percent (Fagan, M.E., Advances in Software Inspections, July 1986, IEEE Transactions on Software Engineering, Vol. SE-12, No. 7, Page 744-751). A code review done by an experienced developer can help raise and fix many different types of potential problems, such as:

  • Inefficient and/or poorly performing code
  • Security issues
  • Incorrect business logic
  • Incorrect use of project or industry best practices and standards
  • Just about anything else!

In addition to the obvious benefits of reducing defects and improving code quality, code reviews have another thing going for them. They are (or should be) an excellent way to share knowledge and discuss code design and architecture in a constructive, applied manner. They ensure that all developers involved in a project are familiar with applicable industry best practices and that they have a good handle on the specific architectural and design techniques used.

However, despite their obvious advantages, code reviews are among the least used of all the recommended industry best practices. They often have a bad reputation among developers, who may fear criticism of their code and who generally dislike “wasting time” in too many meetings. Indeed, code reviews can be long and tedious, which means they are rarely done with any real assiduity. Human nature being what it is, code reviews will often tend to focus on surface issues, such as conformity to basic coding conventions, and neglect the harder aspects, such as potential errors, performance issues, and understanding and verifying the design and business logic behind the code.

Automating the Review Process

A great help in implementing code reviews in development is automation. Software tools can help you automate a great deal of the review process, leaving you to concentrate on the more interesting aspects that only a human reviewer can check: faulty business logic, adherence to project design and architecture guidelines, possible optimizations, and so forth.

A good tool for this job is PMD, a static code analyzer capable of automatically detecting a wide range of potential defects and unsafe or non-optimized code. While other tools, such as Checkstyle, can verify that coding conventions and standards are respected, PMD focuses on preemptive defect detection.

Using PMD with Eclipse

PMD is designed to integrate well into a developer’s work environment. Plug-ins exist for the principal IDEs, and they are the most productive and convenient way for a developer to use PMD. Plug-ins allow almost real-time code verification?they raise issues whenever you save the source code (see Figure 1).

Click to enlarge
Figure 1. The PMD Plug-In in Eclipse

Installing PMD
Click to enlarge
Figure 2. Setting Up a New Remote Site

The easiest way to install and use PMD under Eclipse is to use the remote update site. From Eclipse, take the following steps:

  1. Open the Help->Software Updates->Find and Install menu.
  2. Click Next, and choose New Remote Site.
  3. Now enter the URL of the remote site (http://pmd.sf.net/eclipse), and an appropriate name such as “PMD” (see Figure 2).
  4. Click to enlarge
    Figure 3. Installing from the Remote Site
  5. Now you should have a brand new remote site in your “Update Sites to visit” window (see Figure 3). Make sure you have the PMD site checked in the “Sites to include in search” window, and click Finish. Then just go through the installation screens to install the plug-in.
Click to enlarge
Figure 4. Activating PMD in a Project

The plug-in should now be correctly installed. Restart Eclipse if you are asked to. Once that’s done, you can set up your projects to use PMD.

Configuring PMD in a Project

You now need to activate PMD for your project. Open the project properties window (Project->Properties). You will now have a PMD entry (see Figure 4). This window allows you to configure PMD in detail for your particular project. For now, just check the “Enable PMD” box.

PMD in Action

Click to enlarge
Figure 5. Displaying PMD Rule Violations

PMD is quite flexible. You can use it in two basic ways:

  1. Wait for PMD to automatically analyze a file each time it is saved or added to the project
  2. Manually execute it against selected files, folders, or projects

PMD displays rule violations in the “Tasks” view, among TODOs and other task entries, or in the special PMD view, which gives a more tailored display and more specific filtering options (see Sidebar 1. PMD Rules to see a listing of all main PMD rule sets). The “Violations Outline” panel gives another, more summarized view of PMD rule violations (see Figure 5).

PMD violations have the following five levels of severity (see Figure 6):

Click to enlarge
Figure 6. PMD Rule Violations Are Nicely Detailed in the “PMD” View
  1. Error (high)
  2. Error
  3. Warning (high)
  4. Warning
  5. Information

This is mainly to help prioritize fixes, and the level of each rule can be easily configured. Violations are (by default) ordered by level of severity, and it is easy to filter out certain priority levels. You can also chose to display only the violations of the current project or of the selected file.

Rules aren’t meant to be applied blindly. If you’re not sure why a particular violation is being detected, or think it doesn’t apply in your case, you can refresh your memory of the rule by displaying the details (“Show details” in the contextual menu) (see Figure 7). And, as you will see shortly, if a rule is not justified in certain circumstances, there are ways to deactivate it.

Detecting Cut-and-Pasted Code
Click to enlarge
Figure 7. Displaying the Details of a Particular Rule

Cutting and pasting code between classes is a bad habit. Areas of cut-and-pasted code increase maintenance costs unnecessarily, and indicate in the very least a good candidate for refactoring. In many cases, they are high-risk zones for potential errors.

PMD comes with a useful tool for detecting cut-and-pasted code called CPD (Cut and Paste Detector). You can run it from the contextual menu on the project, using the “PMD -> Find Suspect Cut and Paste” menu option.

Unfortunately, at the time of writing, the results of this tool were not integrated into the IDE. The tool generates a text file called cpd-report.txt in the /report directory, which contains copy-and-paste suspects, as shown here:

=====================================================================Found a 18 line (56 tokens) duplication in the following files: Starting at line 69 of 
/home/taronga/Documents/articles/HotelWorld/src/main/java/com/wakaleo/tutorials/hotelworld/model/HotelModel.javaStarting at line 82 of
/home/taronga/Documents/articles/HotelWorld/src/main/java/com/wakaleo/tutorials/hotelworld/model/HotelModel.java List hotelsFound = findHotelsByLanguage(language); Hotel hotel = null; for(int i = 0; i < hotels.length; i++) { hotel = (Hotel) hotels[i]; if (hotel.getCity().equalsIgnoreCase(city)) { hotelsFound.add(hotel); } } return hotelsFound; } /** * Find hotels where a given language is spoken. * @param language * @return */ public List findHotelsByLanguage(Language language) {

You can customize the minimum size of a copy-and-paste zone suspect in the workbench preferences under PMD->CPD Preferences. Just adjust the "Minimum tile size" field, and specify the minimum number of lines.

Report Generation from Within Eclipse

Many people are very attached to hard-copy outputs. If you are among them, you may appreciate the ability to generate PMD rule violation reports in CSV, HTML, TXT, and XML formats. Just go to the contextual menu on the project, and select "PMD -> Generate Reports". The reports will be generated in the /report directory of the current project. Figure 8 shows an example of an HTML report.

Click to enlarge
Figure 8. Generating a PMD Report from Eclipse

All rules have exceptions. You will have occasions when PMD gets it wrong, and you have a legitimate reason for not respecting one of the PMD rules. For example, consider the following code:

    /** Countries : USA */    public static final Country USA = new Country("us","United States");

Suppose that your company standards impose a minimum of four letters for variable names. In this case, PMD will incorrectly generate an error. To get around this, you can mark a violation as "Reviewed", which basically tells PMD that you've seen the issue and that it's fine by you. Click on the error and open the contextual menu, then select "Mark as reviewed". PMD will insert a special comment similar to the following:

    /** Countries : USA */    // @PMD:REVIEWED:ShortVariable: by taronga on 4/13/06 7:25 AM    public static final Country USA = new Country("us","United States");

As long as you don't remove it, PMD will now ignore this violation for this particular case.

Another way of doing this while writing the code is to use the "NOPMD" marker, as follows:

    // These are x and y coordinates, so short variable names are OK    int x = 0; // NOPMD    int y = 0; // NOPMD

The marker deactivates the ShortVariable rule for the variables.

A third technique, particularly useful for generated classes or legacy code, is to use the PMD SuppressWarnings annotation. In the following class, all PMD warnings are suppressed:

    @SuppressWarnings("")    public class Country {        ...    }

You may just want to suppress certain rules for a given class. In the following generated class, for example, private variables are prefixed with an underscore, which is not in line with PMD's rules concerning JavaBeans. To get around this, you just suppress a specific PMD rule:

    @SuppressWarnings("BeanMembersShouldSerialize")    public class Country {        private String _code;        ...        public String getCode(){             return _code;         }    }

Configuring PMD Rules

Click to enlarge
Figure 9. Configuring PMD in Eclipse

When you introduce coding standards and best practices into an organization, it is important to tailor the rules to your exact needs. This should be a team effort?get everyone who's going to be applying the rules involved. Each PMD rule has a detailed description and examples, available both on the Web site and visible in the configuration screens (see Sidebar 1. PMD Rules to see a listing of all main PMD rule sets). Review each rule and come to a joint decision on whether and when that rule should be applied in your organization.

The most convenient place to configure the PMD rule set is from within Eclipse, in the PMD entry of the "Windows->Preferences_>PMD->Rules configuration" window (see Figure 9). This window contains a list of all the available PMD rules. From this list, you can go through the rules, adjust rule priorities, modify any of the other rule-specific properties, and also remove any rules you don't need.

Click to enlarge
Figure 10. Building a Rule Set from Scratch

You can also build a rule set from scratch: just delete all the current rules ("Clear All") and then import selected individual rule sets one by one (See Figure 10).

When you're happy with your new customized rule set, you can export it in the form of an XML file ("Export Rule Set"). Other team members can now clear their existing rule set and import the new rule set into their environments.

You can also activate or deactivate individual rules for a project in the project properties window (See Figure 11).

And if you do anything really silly, you can always get back to the default rule set using the "Restore Defaults" button.

Click to enlarge
Figure 11. Project-Specific PMD Configuration

Using PMD on Legacy Code

If you are using PMD for the first time on an existing code base, it is good practice to create a minimal rule set containing the most important and potentially dangerous issues. For example, an initial rule set might contain the Unused Code rules. Once these are cleaned up, you might add the Basic rules to find any empty catch blocks, and so on.

Fixing all the relatively minor coding standards issues in a body of legacy code is tedious and time-consuming, and the relative return on investment is much less than with fresh code, where standards can be easily verified in real-time from within the IDE. It is important to know when to stop and get on with more productive work.

PMD in the Build Process

If you want to introduce PMD into your organization, integrating PMD into each developer's work environment is a good place to start. Nevertheless, you should also integrate PMD checks into your nightly builds. You can then post the generated report in the public place, or, if not, at least display it on the project Web site.

You can run PMD from the command line or by using an Ant task. It is also well integrated into Maven, where the PMD reports fit seamlessly into the Maven-generated project Web site (see Figure 12). Issues are directly linked from the PMD report to the HTML version of the source code.

Click to enlarge
Figure 12. Example of a PMD Report

Individual and Peer Code Reviews with PMD

Tools like PMD and Checkstyle can considerably reduce the time and effort involved in personal and peer code reviews. Virtually all coding convention rules, and a large number of best practices, can be automatically verified using these tools. If Checkstyle or PMD don't raise any issues for a class, the reviewer can concentrate on reading the flow of the code, understanding and validating the business logic, and working with a short checklist of project and/or company-specific design and architecture guidelines. In practice, this is a highly effective way of reducing defects and increasing code quality and reliability.

Despite its merits, even individual code reviews require quite a bit of discipline to do consistently, especially across a large team. Group training and individual mentoring can help get people up to speed. Consider giving your team an internal training course on code reviews, the best practices and guidelines that your organization has selected, and code quality in general. It is well worth the investment!

It is also vital to get management buy-in and support. Make sure management fully and visibly understands and supports quality initiatives such as code reviews and best practices.

Increase Code Quality

Static code analyzers like PMD can greatly contribute to reducing defects and improving code quality. They do not replace human code reviews, as there will always be errors that only a human can detect. Nor do they replace a disciplined QA process, as tools alone are of little use without a well-understood way of applying them. But when automatic code analysis techniques are combined with manual code reviews, the result is increased code quality, reduced defects, lower maintenance costs, and better-trained developers.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: