At this point, I am very proud of my Scanner class and show it to my fellow programmers. They are somewhat impressed, but they (being accomplished software developers themselves) suggest all sorts of extensions to the class. Can you add a parameter to specify the minimum and maximum size of the file? How about a pair of parameters to specify the date range when the file was created? Ditto last written? Ditto last accessed? Can you add parameters to specify more than one extension? How about the ability to specify a regular expression for the file name test? And on it goes. I could, of course, write code for each of these possibilities, but it is clear that this would be a never-ending task. Each time that I got one possibility working, someone would suggest yet another possibility. My nice simple Scanner class would be "junked up" in no time.
One of the goals of writing a class is to get it "done." That is, I want to write the code, write the unit tests, write the documentation, create the deployment logic, and so on, and then put all of this "on the shelf" where I can re-use it. The principle that I'm interested in here is called the "open-closed" principle. I want the Scanner class to be open for extension but closed for modification. In other words, I want to create a Scanner class that implements a certain amount of functionality (that is done) that I can extend by specifying parameters or through one of the techniques that I'll cover in this article. I do not want to have to keep going back to working (and tested) code to make modifications. Achieving this state of near-nirvana is difficult but not impossible. One way to achieve this state of ultimate tranquility is to focus on the relevant essentials and ignore everything else.
If the Scanner class squints its eyes (to abstract out the irrelevant details), it becomes obvious that the only thing that is relevant to the Scanner (for all of the above possible extensions) is whether the individual file "is of interest or not." The Scanner class really does not care about how this determination is made or even where it is made. All that you have to do is to define a "predicate" function that accepts information about the file and returns a Boolean "true" or "false." You could make this an overridable function and create derived child classes that provide the specific implementation of the predicate function, or you could create a delegate method signature for the predicate function and pass in an appropriate delegate to handle the relevance test. I'll show you how to use this second approach. You can see the next "Zen koan" to mark the journey in Listing 3.
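To make the delegate approach concrete, here is a minimal sketch in C# (assumed here because the article relies on .NET's FileInfo and delegates). It is not Listing 3; the names FilePredicate, PredicateDemo, and CountRelevant are hypothetical and exist only for illustration. The point is that the consuming code asks "is this file of interest?" without knowing how the answer is computed.

```csharp
using System;
using System.IO;

// Hypothetical delegate signature: receives information about a file
// and returns true when the file is of interest.
public delegate bool FilePredicate(FileInfo file);

public static class PredicateDemo
{
    // The caller supplies the relevance test; this method only invokes it.
    public static int CountRelevant(FileInfo[] files, FilePredicate isOfInterest)
    {
        int count = 0;
        foreach (FileInfo file in files)
        {
            if (isOfInterest(file))
            {
                count++;
            }
        }
        return count;
    }
}
```

A caller could pass any test that matches the signature, for example `file => file.Extension.Equals(".txt", StringComparison.OrdinalIgnoreCase)`, and swap it for a size or date test later without touching CountRelevant.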
The Scanner class walks the file hierarchy, creating a FileInfo class instance for each file that it finds. It then passes each FileInfo instance to the predicate function. The predicate function determines whether the file is of interest. You'll see a number of significant changes in this version of the program. Because I added a number of different delegates, I decided to create the GUI component in Figure 1.
Figure 1. Scanner Class GUI: This simple GUI lets users enter an initial directory and a log file, and specify file extension, size, and date filters.
This GUI presents three different options (along with modifying parameters) for the predicate function delegate. Because the GUI provides a way to display the results immediately, I changed the capture logic from writing to a specified log file to raising an event that the caller can capture. In this example, the GUI displays the data in a scrollable textbox. Finally, I added the delegate logic to test the relevance of the file.
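The three filter options could be expressed as interchangeable predicates along the following lines. This is a hedged sketch, not the code behind Figure 1: the FileFilters class and its method names are hypothetical, the extension is assumed to include its leading dot, and the date filter is assumed to test the last-write time (creation or last-access time would work the same way). It uses the built-in Predicate<FileInfo> delegate type rather than a custom one.

```csharp
using System;
using System.IO;

public static class FileFilters
{
    // Matches on extension, ignoring case. Assumes a leading dot, e.g. ".txt".
    public static Predicate<FileInfo> ByExtension(string extension) =>
        file => string.Equals(file.Extension, extension, StringComparison.OrdinalIgnoreCase);

    // Matches files whose size in bytes falls within [minBytes, maxBytes].
    public static Predicate<FileInfo> BySize(long minBytes, long maxBytes) =>
        file => file.Length >= minBytes && file.Length <= maxBytes;

    // Matches files last written within [from, to].
    public static Predicate<FileInfo> ByLastWrite(DateTime from, DateTime to) =>
        file => file.LastWriteTime >= from && file.LastWriteTime <= to;

    // Combines several filters; a file is relevant only if every filter accepts it.
    public static Predicate<FileInfo> All(params Predicate<FileInfo>[] filters) =>
        file => Array.TrueForAll(filters, filter => filter(file));
}
```

Because each filter has the same shape, a new request (a regular expression on the file name, say) becomes one more small function rather than another parameter on the Scanner class.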
Revisiting the function list, at this point, the software does the following:
- Accepts a path to the initial directory (loss of control to an external mechanism).
- Accepts an extension to test for (loss of control to an external mechanism).
- Navigates the file hierarchy.
- Invokes a delegate to test for relevancy (loss of control to an external mechanism).
- Raises an event for each relevant file (loss of control to an external mechanism).
Two interesting things have happened here. First, I injected the logic for the relevancy test into the Scanner class. Second, I removed the logic from the Scanner class that handles each relevant file; the code in Listing 3 raises an event with the pathname of each relevant file, but has no idea of what will be done, if anything, with that value. The value could be written to the console, written to a log file, displayed on a GUI control, and so on. The Scanner class has now been pared to its essence: navigate the file hierarchy, test each file for relevancy, and raise an event for the files that pass the relevancy test. Everything else has been removed. Paradoxically, the class does less but has become considerably more useful. However, I'm still passing in a file extension, which is no longer required. This is typical of programs that evolve: it is a vestigial parameter that has not quite withered away.
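The pared-down shape described above might look roughly like this. It is a sketch under stated assumptions rather than a reproduction of Listing 3: the member names (FileFound, FileFoundEventArgs, Scan) are hypothetical, the vestigial extension parameter is deliberately left out, and error handling for unreadable directories is omitted.

```csharp
using System;
using System.IO;

// Hypothetical event payload: carries the pathname of one relevant file.
public sealed class FileFoundEventArgs : EventArgs
{
    public FileFoundEventArgs(string fullPath) { FullPath = fullPath; }
    public string FullPath { get; }
}

public sealed class Scanner
{
    private readonly Predicate<FileInfo> isRelevant;

    // The relevancy test is injected; the Scanner never looks inside it.
    public Scanner(Predicate<FileInfo> isRelevant)
    {
        this.isRelevant = isRelevant ?? throw new ArgumentNullException(nameof(isRelevant));
    }

    // Raised once per relevant file; subscribers decide what to do with it.
    public event EventHandler<FileFoundEventArgs> FileFound;

    // Walks the hierarchy under rootDirectory, including all subdirectories.
    // Note: this may throw UnauthorizedAccessException on protected folders.
    public void Scan(string rootDirectory)
    {
        var root = new DirectoryInfo(rootDirectory);
        foreach (FileInfo file in root.EnumerateFiles("*", SearchOption.AllDirectories))
        {
            if (isRelevant(file))
            {
                FileFound?.Invoke(this, new FileFoundEventArgs(file.FullName));
            }
        }
    }
}
```

A console caller could subscribe with `scanner.FileFound += (sender, e) => Console.WriteLine(e.FullPath);`, while a GUI would append the same value to a textbox. Neither choice requires a change to the Scanner class.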
May You Live in Interesting Times
Now assume some time has passed—in fact, a lot of time has passed. While the vision and implementation of the Scanner class might have seemed close to the sought-after state of perfection, the Scanner class and its supporting software have continued to evolve. (I won't continue the Zen references from this point forward because now we're going to deal with the more tangible realities of developing software in a business environment.) In the time since the enhancements you saw previously in this article, your development team has modified the Scanner class to respond to requests for:
- The ability to capture and process the data in the relevant files.
- The ability to perform rollup processing after all of the individual files have been processed.
- The ability to output the captured and processed data in various ways.
Vicki Vice President found out about this software and has requested that you build a revenue reporting system for the company's ancient Rocks-a-Lot order entry system. This system has been in place for almost 30 years, and is the division's main system. As each order comes in, the Rocks-a-Lot system creates a file with the details of the order and saves the file within the directory structure, using a naming scheme where the directory path specifies the company and reporting period. The order fulfillment system scans the directory structure and ships the product. At this point, your inner software developer is screaming that there are an infinite number of opportunities for improving this system. Be patient, Grasshopper. The careers of several vice presidents have crashed upon the shores of the Rocks-a-Lot system in futile efforts to upgrade or replace the system. It has become the dreaded "third rail" of IT systems within the company. Nobody, not even the ultra-capable Miss Vicki, wants to take on this particular monster. The only safe thing to do is to build out from the original system.