Is Documentation Holding Open Source Back?

Doing a better job at documentation can turn technical excellence into universal acceptance.

How to Write Documentation that Will Lead to New Users
If our intent is to encourage people to use these tools and to contribute their time to developing them, we need to spend more energy on introductory documentation. Engineers who browse SourceForge are looking either for solutions to problems or for interesting tools. In either case, the Web site needs to convey the important details in a short form so that readers can gauge their level of interest. Obscurity in documentation benefits no one; it annoys potential users and non-users alike.

There are five key categories that you need to think about when introducing potential users to a project:

1. The top five (or so) problems the tool was meant to solve
What a tool does is only half the story. Tools exist for a reason: to solve problems. Getting a nail into two pieces of wood is a problem. The solution is a hammer. It's difficult to explain a hammer without mentioning wood because the solution is directly related to the problem. Technology is just like that.

It's so easy to get caught up in the how of technology that we often forget about the why. The why will tell your reader whether your tool will fix their problem directly, as a side effect, or not at all.

Using XSLT as an example, we can phrase a problem statement this way:

"Procedural code that transforms XML into HTML is difficult to write and maintain. XSLT allows you to specify transformations of XML data into XML output formats (including HTML) in a simple, declarative manner."

From this statement an engineer should be able to understand why XSLT was developed and how it does its job. Understanding both the why and the how lets engineers know what falls inside and outside the scope of the tool. An engineer who understands the scope of a tool can judge whether it fits their project and whether they want to contribute to its development.

2. The standard usage patterns used to solve the top five problems
For each of the problem statements we should also be able to create a usage pattern that matches. This usage pattern explains at a reasonable level of detail the mechanics of the tool.

Using the first XSLT statement as a starting point, we can phrase a usage pattern this way:

"The XSLT transform engine takes as input an XML stream that contains the data in addition to an XSLT style sheet that contains the transformation rules. The engine applies the style sheet to the data and creates an output stream. In this case the output stream is HTML."

With this in mind, an engineer can understand the proper and intended use of the tool. Successful tools are always developed with a use case in mind. Taking the simple case of a hammer and a nail, the proper use of a hammer is to hit the nail with the striking face of the hammer. You could use the side of a hammer to drive nails, but you would not get the benefits that the design of the hammer provides. The same is true of software engineering tools: if you use them the right way, you get the benefit of all of the design and testing work. If you don't, the results can be unexpected.
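The usage pattern above (data plus transformation rules in, HTML out) can be sketched in a few lines. This is only an illustration of the flow, not a real XSLT processor: the Python standard library does not include one, so the hypothetical `RULES` table below stands in for the style sheet, and the element names and the sample document are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical rule table standing in for an XSLT style sheet:
# each source element name maps to an HTML output template.
RULES = {
    "catalog": "<ul>{children}</ul>",
    "book": "<li>{children}</li>",
    "title": "<b>{text}</b>",
    "author": " by {text}",
}


def transform(node, rules):
    """Apply the matching rule to a node, recursing into its children."""
    children = "".join(transform(child, rules) for child in node)
    text = escape((node.text or "").strip())
    # Elements without a rule simply pass their children through.
    template = rules.get(node.tag, "{children}")
    return template.format(children=children, text=text)


def xml_to_html(xml_source, rules=RULES):
    """Take XML data and a rule table as input and return an HTML string."""
    return transform(ET.fromstring(xml_source), rules)


# Invented sample data for the illustration.
SAMPLE = (
    "<catalog><book>"
    "<title>Example Title</title><author>A. Writer</author>"
    "</book></catalog>"
)

print(xml_to_html(SAMPLE))
```

The point of the sketch is the shape of the call: the data and the rules are separate inputs, and changing the output means editing the rule table rather than the traversal code.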

3. The design parameters of the tool
The documentation of any database engine will tell you that the number of fields in a table is unlimited. But that's not really true. Most database vendors designed their systems to handle up to about 100 fields, and the test cases usually hover around 20 to 50 fields. With this in mind, you can feel confident when your tables are small, and you can make sure to test the system when you have tables with an unusually large number of fields.

How this information is represented is highly dependent on the nature of the tool. Taking XSLT as an example, we may express the average size of the input and output test case files as well as the average number of transformations.

Marketing brochures never indicate the design parameters of a tool, because the vendors fear losing your business if you think the tool might under-perform with your requirements. Unfortunately these parameters often decide the success or failure of a deployment. In an open source world we don't need to be so profit-driven, so we can avoid the inevitable support headaches that come when people push code too far by being up-front with the performance characteristics of our tools.
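One way to be up-front about a design parameter is to probe it with a small automated check and publish the result. The sketch below assumes SQLite (through Python's built-in `sqlite3` module) as a stand-in for whatever database you actually depend on; the `probe_field_count` helper, the table name, and the field counts tried are all illustrative choices, not figures from any vendor.

```python
import sqlite3


def probe_field_count(field_count):
    """Create a table with `field_count` columns and round-trip one row.

    Returns True if the engine stored and returned the row intact,
    False if it raised a database error along the way.
    """
    columns = ", ".join(f"f{i} INTEGER" for i in range(field_count))
    placeholders = ", ".join("?" for _ in range(field_count))
    values = list(range(field_count))

    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(f"CREATE TABLE wide ({columns})")
        conn.execute(f"INSERT INTO wide VALUES ({placeholders})", values)
        row = conn.execute("SELECT * FROM wide").fetchone()
        return list(row) == values
    except sqlite3.Error:
        return False
    finally:
        conn.close()


# Try the sizes you expect in practice plus a few well beyond them.
for count in (20, 50, 100, 500):
    status = "ok" if probe_field_count(count) else "failed"
    print(f"{count} fields: {status}")
```

A check like this only shows that wide tables work at all; a real project would also want to time the operations and record the sizes that its own test suite exercises.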

4. The environment the tool was developed on and is tested on
Software needs to work in the real world. Knowing what environment the tool is written on and tested on is a big clue to where the tool will be most comfortable. Large projects like Mozilla have teams of people testing the code on multiple platforms. For those types of projects this information isn't critical. For smaller-scale projects with just a few contributors, readers need to know what the code is being written on and tested with.

Designing a tool to be database independent is not the same as having automated tests that run every build of the code against each database for a thorough check. If the code is developed primarily on MySQL you can only rely on the code working on MySQL.

To ensure that the code works in the most reliable manner possible it is extremely important to know what platform was used to develop and test the open source software. Give your users the best chance to get the most out of your tool by giving them this vital information.
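Publishing this information is easier if the project can generate it. The following is a minimal sketch, assuming a Python-based tool that uses SQLite; the `environment_report` and `format_report` helpers and the choice of fields are hypothetical, and a real project would add whatever databases, compilers, or servers it is actually tested against.

```python
import platform
import sqlite3


def environment_report():
    """Collect basic facts about the machine the code is running on."""
    return {
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "sqlite_library_version": sqlite3.sqlite_version,
    }


def format_report(report):
    """Render the report as plain text suitable for a README or project page."""
    lines = ["Developed and tested on:"]
    lines += [f"  {key}: {value}" for key, value in report.items()]
    return "\n".join(lines)


print(format_report(environment_report()))
```

Running a script like this as part of each test run and pasting the output into the project page keeps the stated environment in step with the one that was really used.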

5. Graphics
If a picture is worth a thousand words, why don't we use more of them? Only one of the 20 Web sites I looked at made any use of graphics to explain the structure, architecture, or usage of the development tool.

Even the most basic graphics are extremely compelling. The front page of UML2EJB contains a single graphic that describes the complete workflow cycle between the incoming UML and the EJB beans that are at the end of the cycle.

Both Visio and Dia have easy-to-use templates for software diagrams. The graphic below shows a simple XSLT input and output flow:

[Figure: simple XSLT input and output flow]

This is a simple graphic, but it speaks volumes. An engineer can quickly see that if they have XML and want HTML, XSLT can provide a solution.

Documenting these basics is valuable not only for potential users of the tool but also for the implementing engineers. Software engineering is not a discipline that favors ambiguity. Stating clearly what a tool does and does not do provides a method for judging what new features should be included and what should be left out. If it's difficult to nail down what a tool should do, that is a potential sign of weakness in the requirements, architecture, or design of the tool.

Open source software is arguably more stable and reliable than its closed source equivalent. We need to build on that technical success by fixing open source software's documentation problems. Writing reasonable and useful documentation is not difficult. It entails stepping into the shoes of your reader, introducing them to the basics of the software, addressing their concerns, and doing a little self-promotion. For good examples, all we need to do is look at the documentation for successful commercial software.

Technical excellence is not everything. The technically superior tool can be outpaced in its adoption by an inferior tool with better support and documentation. To ensure the long-term success of open source we need to spend a considerable amount of our development time on both in-depth and introductory documentation.

Jack Herrington is a senior software engineer with 20 years of experience. He is currently working for Macromedia on the next generation of Dreamweaver. His first book, "Code Generation in Action" (Manning Press), is due out in July. He is the editor of the Code Generation Network. Reach him by e-mail.