n an ideal world, all software and database systems would have clear documentation that is always up to date. In the real world, however, documentation usually is either non-existent or so out of date that it is misleading. This is especially true for legacy software systems that were developed before the industry realized that maintenance is the most costly aspect of a software system.
One of the main causes of incomplete or out-of-date documentation is the lack of simple tools and processes for integrating documentation into the design process. The simplest way to overcome this issue is to get developers used to writing at least as much?and possibly more?in-line comments as they do code. For database systems, the use of internal comments is an effective way of ensuring that at least some documentation is written.
The Case for Integrated Documentation
Keeping the documentation within the system (a.k.a. integrated documentation) has many advantages. The first and most obvious advantage is that integrated documentation will never be lost. You may ask how would documentation get lost in the first place. Well, when a system has been around for decades and the people who wrote it are gone, its documentation likely will end up missing or misplaced. And even when found, it usually is so dangerously out of date that most developers are scared to touch the system with a ten-foot pole.
The second advantage of integrated documentation is easy to start and maintain?so much so that there really is no excuse for not doing it. Every time you change something in your code or in your database, spend a few minutes to also update the comments. It is as simple as that.
The third advantage is integrated documentation makes it very easy to divide software or database systems up among developers at different sites, and merging the documentation requires little effort. When you merge the software/database, the documentation is merged for you!
Recognizing these benefits and the urgent need for them, we developed an in-house, integrated-documentation tool to help document both legacy systems and systems inherited through acquisition. As its GUI, the tool offered a PL/SQL-generated HTML database dictionary. The tool became very useful when dealing with offshore development teams as well, when many groups were working on different parts of the system. More recently, it has become popular with data-mining folks, as each new project presents them with a new unknown database. This article explains how our tool can help you create and maintain documentation for Oracle databases.
Feature Set of an Integrated-Documentation Tool
Now that you know the advantages of integrated documentation, let’s examine what exactly the integrated-documentation tool can do for you. Essentially, the tool can help you in two main ways:
|Figure 1. Main Page of a Dictionary for a Sample Oracle Database: The main page shows the names of all the tables in the database schema with useful information such as the number of records, last date analyzed, etc.|
- Providing a browsable, easy-to-navigate dictionary of the database: This can help you get a feel for the usage of tables and their columns.
- Providing a template for viewing and inserting comments in all the tables and columns in the database
Figure 1 shows the main page of a dictionary for a sample Oracle database. This page contains all the basic information about the database, including all the table names and some useful stats about them. The stats give the user a quick feel for the database by revealing how many tables are in the database and how many records exist in each table. Other useful information such as the average row length, the date each table was last analyzed, and whether or not it is backed up are also displayed on the page.
Also notice on this page the hype-link “View ? Template To Modify Schema Documentation.” This link will lead you to a page with all the existing table and column comments laid out in an easy format, which you can use as templates for comment insertion (see Figure 2). These templates allow the developer to easily and conveniently use the internal system tables of the Oracle database to store comments about the tables and their columns. Because all the tables and columns have a statement, it becomes less likely that anything will be missed when commenting the database.
How to Set the Tool Up
The main entry point for this tool is the GenerateWebPages.cmd file located in the main directory. GenerateWebPages.cmd is basically a batch script that contains the basic set-up parameters. It connects to the Oracle database with these parameters before firing off other files to generate the HTML dictionary. Make a new copy of this file for each schema that you wish to have a dictionary. The parameters that you need to set in this file are the schema name, the user name, the password, the TNSName, and the SID (as required by the Oracle DB for connection). Make sure that the user name has sufficient security privileges to access system tables. An example setting for these variables is show in Figure 4.
|Figure 4. Setup Parameters: Here is an example of connection parameters that need to be set in GenerateWebPages.cmd file.|
After these parameters are set, all you have to do is double-click the script (or call it from a windows command window), and the script will automatically generate the dictionary and open it in your default browser. The HTML files created by the script are placed in a directory with the same name as the schema in the main directory of the tool.
How the Tool Works
If you are curious about what the PL/SQL is doing, take a look inside the SQL script files inside the SQL directory of the main tool directory. Basically, a batch file causes the SQL files to run one by one and spools their outputs in HTML files. The SQL files use the schema name to retrieve the information about the schema from the system tables. These system tables include ALL_TABLES (for getting the table names), ALL_TAB_COLUMNS (for getting the column names for each table), ALL_TAB_COMMENTS (for getting the comments for tables), and ALL_COL_COMMENTS (for getting the comments for the columns). Each table in the schema is also hit to create the column-specific stats.
Upcoming Feature for Version 2.0
When we saw the great productivity boost a simple tool like this could deliver, we thought to ourselves wouldn’t it be great to include more analytic information to further assist data-mining projects. Certain stats and analytics are very common in data-mining projects, and it could be very beneficial to have them included in an enhanced version of the tool.
The main challenge for the next version of the tool has been deciding which stats provide the best insight into a database and how to present that data. We have been working with other colleagues to come up with a good solution and hope to deliver a solution in the near future.