t’s becoming common to retrieve data from a relational database formatted as a hierarchical XML document. By using an XML schema, it’s possible to generate data from XML documents and then create a relational database from the schema in which to store the data?in other words, the schema relationship can apply in both directions.
It’s also common to use XSLT to format existing XML data, but using XSLT to generate data is not as intuitive. Subsequently, it should be possible to use XSLT and a schema to generate stored procedures that both create a relational database and store the data in the generated tables.
Although it’s most commonly used for transforming XML to HTML, XSLT is a general-purpose transformation language useful for generating XML documents from other XML documents or combination of documents, and even for generating other types documents types?such as database stored procedures. This solution shows you how to generate randomized data using XSLT. A future solution will show you how to use a schema and XSLT to generate relational tables and store the XSLT-generated data in those tables using XML, ADO 2.5 and SQL Server 2000.
Sometimes, you need to be able to generate random data and use the generated document schema to create a set of database tables in which to store the data.
This solution shows you how to generate randomized data using XSLT. A follow-up solution will show you how to use a schema and XSLT to generate relational tables in which to store the generated data.
XSLT?Not Just For HTML
Some time ago, I wrote an article for this column that dealt with accessing XML information from databases, and after some discussion with my editors I decided that I wanted to update the piece, looking at the interaction between XML, XSLT and databases with the release of ADO 2.5 (and the upcoming ADO.NET), the MSXML3 Parser and SQL Server 2000. Over the course of the next few months, I’ll have a series of different articles focused on these topics, with this being the first.
I used to play a number of role-playing games, and a brother who works for Wizards of the Coast guarantees that I still periodically get to try my hand at adventure scenarios, even if it’s far less often than in the past. The facet of game playing that I found most fascinating was creating the game world?generating the characters and towns and encounters, and it was this that actually got me into programming in the first place. I initially started to build a database of characters in a SQL Server database, but as I was putting together this article it occurred to me that this was a perfect opportunity to show how you can use XSLT to create such a database in the first place.
You may be wondering about this assertion. Isn’t XSLT for transforming XML to HTML? Yes and no. You can certainly do such a transformation, but XSLT can actually handle the transformation from any XML source to any other XML output?even if that source doesn’t necessarily have any direct bearing on the order or presentation of the output. In the case of the character generator, the source XML document contains a set of parameters that describes the values for given options?a character’s gender, species (it is a fantasy RPG), vocation, name, and so forth.
The XSLT generateCharacters.xsl (see Listing 3) uses this document along with a random number generator to create an arbitrary number of characters where the values are constrained to those in the worldData.xml file (see Listing 1). While a few specific categories must be known in order to describe their rules, the properties that define the characters (‘strength’, ‘memory’, ‘presence’, and so forth) are actually contained in the worldData document, along with the ways that the primary characteristics (‘species’, ‘gender’, ‘vocation’, etc.) affect these secondary properties.
Controlling XSLT Output
This is something of a mixed rule set, where the structure of the output is partially determined by the initial data and partially by the transformation. It is possible to change this into a format where the structure is completely determined by the source document and the transformation acts only to implement the rules, but such a transformation is also considerably more difficult to write well in the general case.
Thus, the world document specifies the range of possibilities for various primary attributes. In the case of gender and species, for instance, each particular instance of an attribute has a range of probabilities (min and max) between 1 and 100. Thus, if you rolled a hundred-sided dice (and yes, there are such things), a score between 1 and 75 would indicate that the character was human, between 76 and 80 the character was a half-elf, and so forth.
This range of probabilities provides a way of creating a non-equal distribution that still accurately represents a population, a trait that is especially evident in the Level element, which can take a value of 1 to 10 but does so in a very non-linear way. On the other hand, a few qualities such as the Order or Temperament are given random values based upon the number of items in each category.
The world document also recognizes two distinctly different modes. The primary mode indicates those properties that can be calculated directly. The property mode, on the other hand, depends upon quantities given in the primary elements. For example, the Strength property is normalized for humans, but an average centaur by dint of its size would have an additional three points of strength added to the baseline score. A number of attributes in the primary elements are recognizable as having the same name as the properties?if a character has a given primary characteristic (species as a centaur, for instance) which has a named property attribute (such as Strength), then this value will be added to the average of the sum of three six sided dice.
The world document would thus likely produce an output similar to Listing 2.
Programming the Exceptions
A significant amount of programming comes in recognizing common cases and programming for them, and then programming for the exceptions. The file generateCharacters.xsl (see Listing 3) uses this model. It looks first at the generalized case of primary characteristics such as Level or Gender, and then determines secondary properties such as Strength or Stamina, which are dependent upon the primary characteristics. It also includes specialized handlers for Name and LifePoints, because neither one falls into the mode for generating content that all of the rest of the properties do:
Using Custom Namespaces
The generateCharacters.xsl stylesheet (Listing 3) makes extensive use of a custom defined namespace called game:rollDice(), which takes as parameters the number of dice to be rolled and the number of pips (spots) on the dice itself. XSLT contains no randomizing elements, so the stylesheet defines this extension explicitly. If you wanted to take this to a platform other than Microsoft’s MSXML3 parser, you could write the extension in Java.
The stylesheet works by calculating the two principle sets of information as well as the two exceptions (for creating names from a list, which is dependent upon gender information, and for determining the number of life points that the character has). Once it has this information, the generateCharacter named template outputs the name, simplifies the primary characteristics (which initially contain some attribute information from WorldData.xml for convenience) and sends these to the output stream, and then does the same for the properties.
Building Loops in XSLT
The generateCharacters (see Listing 3) template calls itself recursively after outputting each character’s XML, incrementing a counter by 1 for each iteration. This continues until the number of characters reaches a certain number, at which point it stops?the equivalent of an indexed for loop in a procedural language. By default, the value is 20, the number in the worldData.xml property
The property items also perform a potentially sophisticated bit of processing themselves. For example, the strength item in the WorldData.xml file has the following structure:
This tells the generateCharacters.xsl file (Listing 3) to generate a number as follows:
However, there are a few points to consider with this:
$randomizers can contain the name of a primary element to use in calculating the number of dice. For example, if randomizers=”Level” then the value of the Level property is used to determine how many dice are used.
“primary”. One of the most powerful attribute values for the property mode is “primary”. It will look at all of the elements that were defined as having mode=”primary” and will check to see if an attribute with the same name as the current element exists. If it does, then this and all similar matching element attribute values are summed together. If used as an argument for @min or @max, this will be used to determine the size of the dice. If used for @randomizers, this will determine the number of dice to roll, while if used for @offset the primary sum will be added to the sum of the dice rolls.
Constants. You can use a combination of $min, $max, $offset and $primary to directly add properties together. For instance, setting $randomizers, $min and $max to 0 and $offset to 10 will make the result ten, while setting $min=$max will multiply the values of $min by the $randomizer without any random effects.
Applying the Techniques
I suspect that for most people, the need for a routine to generate game characters is probably not a high priority. However, by separating the transformation mechanism from the intrinsic structure of the output, this same mechanism can be used to generate test data of any sort. For instance, you could create an XML data page that would generate faux employee information that might have the following structure.
The BusCardData.xml, template illustrates how the same basic setup with different elements can generate similarly rich XML structures using generateCharacters.xsl.
10 Associate Administrative Assistant Programmer/Analyst Manager Executive Accounting Research & Development Production Marketing & Sales Distribution 1 2 3 4 5 6 7 8 9 10 Male Female Thorin Kor Alfrin Meior Kirin Tomas
Most important, perhaps, are the lessons that can be gleaned from following common structures and designing in as general a case as possible. The output is clean and simple (and largely one dimensional), which also makes it ideal for depositing in a database.
Next month, I’ll show how this XML structure can be entered into a SQL Server database through an XSLT-generated stored procedure, and after that I hope to look at how you can retrieve and interact with the same information using SQL and ADO.