don’t know about you but when I think about writing applications I think about all of the things I want the application to do for myself or for the customer. What I don’t think about is all of the little details that turn an elegant design into a workable implementation, because so often the beauty of the design gets lost during the implementation process.
Alleviating this loss of design integrity is the main reason I’m so enamored with code generation. By using a program to translate an abstract model of my design into code I can maintain the design and also gain a measure of portability that no single implementation language can provide. While code generation provides many benefits, it’s the portability aspect that I want to analyze in this article.
For those not familiar with the software engineering technique of code generation, I’ll run through a brief introduction. Code generation is the technique of using a program to write production code for you. There are many models of code generator. The example in this article uses an abstract model of the design as input, and it outputs one or more implementation files.
There are four primary benefits to code generation; quality, consistency, productivity, and abstraction. Generated code is of high quality because it is being built using a set of templates. When bugs are found the templates are modified and the bugs are fixed across all of the generated code. Because it uses templates, generated code has consistent naming conventions across all of the output. The technique increases productivity by letting developers spend more of their time on the design and the unique aspects of the code.
But it’s the abstraction benefit that is the primary focus of this article, because abstraction is what gives our applications portability. Part of the design of the application is abstracted into the generator’s input files; this abstract model can then be ported to other platforms and technologies simply by altering the templates of the code generator.
To demonstrate this I will use a code generator to build the data access layer of an application with two very different implementation technologies. I’ll begin by defining the design of the database that I am going to generate.
My project is to model a small book collection. This is a valuable tutorial because it contains multiple tables and is realistic enough to serve as a foundation for larger data model and data access layer development. I will first generate the SQL to build this database and then generate C# and PHP database access layers for the database.
C# and PHP are proven application development technologies and they are representative of both a latently-typed scripting language and a strongly-typed system languages. With this example in hand you will be able to generate your database application layer in the language of your choice; you will not be bound to the technology that you begin with.
The model should include the name of the book, the author, and the publisher. Using a normalized relational model, I will need one table for each of those fields. These tables relate to each other as shown in Figure 1.
Figure 1. Tables: You’ll need three tables to hold the information for the book database.
Listing 1 shows a model of Figure 1 in XML.
For simplicity I’ve combined the table definition, the connection information for the database, and the test data, into one file.
More detail about different types of XML nodes is provided in the following table:
The database node includes attributes which describe the connection to the database.
Your application may require more information to properly model your database layer. You can easily add this to the code generation templates in this article.
With the data model and test data in hand I can start building the code generators that will turn the design into reality. I’ve chosen PHP and C# as my generation targets, but I’ll also generate SQL for the physical model so that the database schema always remains in sync with the database access layer sitting above it. For the generator I have chosen to use XSLT.
Generating Code with XSLT
The value of XSLT is that it is a standard and has implementations in a number of languages, making it very accessible. There are a number of books and articles on XSLT. It also helps that with version 1.1 you can create multiple output files with a single XSL template. I’ll start with a simple example stylesheet called ‘simple.xsl’:
For this article I’ve used the Saxon XSLT processor (v.6.5.3). To run ‘simple.xsl’ on the ‘schema.xml’ input file the command is:
The output looks like this:
In the process above, the XSL processor searches the templates against each of the nodes it encounters in the input XML. In this case it matches against