Generate XML Mapping Code with JAXB

If you spend any time at all writing DOM or SAX code, then you need to know about the Java Architecture for XML Binding (JAXB). It rapidly generates XML mapping code for you, saving time and effort, and reducing both costs and risks. JAXB is a specification for a set of APIs and tools that generate Java classes based on the domain model encoded in an XML schema. These classes can then read and write XML documents from files or streams.

JAXB is still in beta, but the powers that be are discussing a full release early this year. Of course, as with every new tool or technology, learning how to use it will take time. Fortunately, JAXB has been designed to be extremely easy to use, so the time investment should be minimal. In fact, for a large class of applications, this simple article (a short, practical introduction to the JAXB technology) will show you enough about the tool to do the job.

At the risk of stating the obvious, Java is an object-oriented language, while XML is hierarchically structured data. A number of strategies exist for dealing with this dichotomy:

  • Create an object model that describes the generalized structure of the data (as in DOM).
  • Deal with the data as pure data (as in SAX).
  • Encapsulate the data in domain objects, and let them manage it from there.

    The last solution is called mapping, binding, or translation, and it is the solution JAXB uses. In the mapping process, a developer usually uses a parser (SAX or DOM) to get at the data in an XML file and then uses the data to populate a domain object model. Often, the developer creates one class per element of the document, assigning values to attributes from either the data contained within the tag or the tag attributes. If a tag contains further elements, then the developer creates relationships from the class that represents the outer tag to the class representing the inner tag. Of course, before the developer populates an object with data, he or she first has to instantiate it. So the developer also writes code that manages both the creation and collection of objects.

    With SAX, this is usually done with a stack implementation. Every time the parser encounters a tag, the code decides whether it’s an object or an attribute. If it’s an object, the code creates a new instance of its representative class and pushes it onto the stack. If it’s an attribute, the code passes it for processing to the object referenced by the head of the stack.
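
    The stack technique described above can be sketched roughly as follows. This is a minimal illustration, not code from the example application: the element names (play, act, scene, title) and the hand-written Act and Scene classes are assumptions made for the sketch, and it assumes a well-formed document in which scenes only appear inside acts.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class StackMappingSketch {

    // Hypothetical hand-written domain classes (assumed names, not generated code).
    static abstract class Titled { String title; }
    static class Scene extends Titled { }
    static class Act extends Titled { final List<Scene> scenes = new ArrayList<>(); }

    static class PlayHandler extends DefaultHandler {
        final List<Act> acts = new ArrayList<>();
        private final Deque<Titled> stack = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for this element
            if (qName.equals("act")) {
                // "Object" tag: create an instance and push it onto the stack.
                Act act = new Act();
                acts.add(act);
                stack.push(act);
            } else if (qName.equals("scene")) {
                // Nested "object" tag: link it to its parent, then push it.
                Scene scene = new Scene();
                ((Act) stack.peek()).scenes.add(scene);
                stack.push(scene);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("title") && !stack.isEmpty()) {
                // "Attribute" tag: hand the value to the object at the head of the stack.
                stack.peek().title = text.toString().trim();
            } else if (qName.equals("act") || qName.equals("scene")) {
                stack.pop();
            }
        }
    }

    static List<Act> parse(String xml) throws Exception {
        PlayHandler handler = new PlayHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.acts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<play><act><title>ACT I</title>"
                + "<scene><title>SCENE I</title></scene>"
                + "<scene><title>SCENE II</title></scene></act></play>";
        List<Act> acts = parse(xml);
        System.out.println(acts.get(0).title + " has " + acts.get(0).scenes.size() + " scenes");
    }
}
```

    Note how every element name is hard-coded in the handler; this is the coupling between code and schema that the next paragraphs discuss.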

    In this approach, the code needs to have a detailed understanding of the document being parsed. It has to know that it should instantiate a class when it encounters one tag and set an attribute when it encounters another. This approach has obvious flexibility problems. If the schema changes, the code must be changed too. Also, although the pattern is generally applicable to mapping, the code is specific to a single schema, so it can't be reused.

    JAXB generates the mapping code based on the schema of the documents you'll be processing. It creates domain classes that model the schema and creates all the code you need to read and write documents.

    A Working JAXB Example
    First things first: download JAXB from Sun. Since it relies on other XML libraries, downloading Sun's whole XML package is perhaps the easiest way to go. As of the writing of this article, the latest pack is Summer 02. Set up the environment as per the instructions or nothing will work. The source code for the example, the schema you'll use, and a document conforming to it are all available to download with this article.

    The example document I use is Shakespeare's "A Midsummer Night's Dream", which I found on the Web in XML format. I made a few small changes to slightly simplify the structure and emphasize issues I wish to focus on, and I generated a W3C Schema using the latest version of Altova's excellent XML Spy Enterprise. (Figure 1 shows the fairly simple schema in the XML Spy diagramming format.)

    Figure 1: The Play Schema

    The play schema shows that a play must contain six elements. One element is an act, which must itself contain many scenes, and scenes consist of many intermingled stage directions and speeches. Speeches contain many speakers, lines, and stage directions, again in no predictable order.

    Figure 2 shows a simple play-viewer application that I put together to demonstrate the principles of processing with JAXB. Writing this would have been a lot easier had I used Tomcat and an HTML GUI, but I didn’t want to “complexify” things.

    Figure 2: Play Viewer

    How long would it take you to write this using SAX or DOM? The GUI is quite straightforward, but the mapping code may take you a while. Maybe two days? I did the entire thing in half a day using JAXB, and most of that was spent playing with the enigmatic Swing interface, trying to make it look a bit prettier for the screenshot. I created the mapping code in about three seconds. The panel on the left contains a standard tree and the one on the right contains a subclass of JLabel, which I altered to implement the Scrollable interface and to hold off displaying its contents until the application gives it permission.

    Using the Code Generator
    Start by unpacking the xml and xsd files into a development directory, which contains a directory called 'src'. Create a new directory under src called model. On the command line issue the following instruction:

      xjc -d src -p model play.xsd

    If you've set up JAXB correctly, you should see a message telling you that it's parsing the schema, followed by another message saying it's compiling the schema, and finally a list of the classes it has generated. Issuing the xjc command by itself produces a small help screen that explains the parameters. The options used here tell JAXB to generate the classes into the src directory (-d) and to put them in the model package (-p).

    If you have a legacy IDE like JBuilder, then you probably should generate javadocs for the classes you’ve just generated. If, like me, you use the impressive new version of IntelliJ IDEA, then this isn’t an issue. IDEA reads the javadocs directly from the source and displays them in a tool-tip window at your request.

    Next, open the model package in your favorite IDE and have a look at it. Its structure is very similar to the schema on which it’s based. Where a one-to-one relationship exists, a method returns an object of the related type. For example, if you look at the Play class itself you’ll see instance methods for retrieving an instance of Title.

    Where a one-to-many relationship exists, a java.util.List class is returned. For example, instances of Play have a getAct() method that returns a List containing Act objects. The code I use to get the first act of the play (in ViewPanel’s constructor) simply gets the first element of the list:

      currentAct = (Act) play.getAct().get(0); 

    Of course, if I planned on outputting all of the acts in sequence I could just get an iterator from the list and use that. But I’m jumping ahead. The first thing you should do with your generated classes is use them to read an XML file. In the example, you can do this with the following three lines:
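
    Walking all of the acts with an iterator might look roughly like this. The Play and Act classes below are hand-written stand-ins so that the sketch compiles on its own; they are assumptions, not xjc output, and only mimic the getAct() and getTitle() accessors described in this article.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ActIterationSketch {

    // Stand-in for the generated Act class (assumed shape).
    static class Act {
        private final String title;
        Act(String title) { this.title = title; }
        String getTitle() { return title; }
    }

    // Stand-in for the generated Play class (assumed shape).
    static class Play {
        private final List<Object> acts = new ArrayList<>();
        // Like the generated accessor, this returns the live list without an element type.
        List<Object> getAct() { return acts; }
    }

    // Collects the title of every act, in document order.
    static List<String> actTitles(Play play) {
        List<String> titles = new ArrayList<>();
        for (Iterator<Object> it = play.getAct().iterator(); it.hasNext();) {
            Act act = (Act) it.next(); // the list is untyped, so a cast is needed
            titles.add(act.getTitle());
        }
        return titles;
    }

    public static void main(String[] args) {
        Play play = new Play();
        play.getAct().add(new Act("ACT I"));
        play.getAct().add(new Act("ACT II"));
        System.out.println(actTitles(play));
    }
}
```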

      1  JAXBContext ctx = JAXBContext.newInstance("model");
      2  Unmarshaller u = ctx.createUnmarshaller();
      3  play = (Play) u.unmarshal(new File("/play/data/amsnd.xml"));

    The argument on line 1 is the name of the package to which the model belongs. The filename on line 3 is the XML file to parse; you should change this to refer to your copy of the file.

    In this code, play is the head of the object tree that JAXB generates. From here I can use my domain model as I please. For example, the status bar contains details of the currently selected act and scene. Upon initialization, I use the following code to gather that information:

      Act initialAct = (Act) Model.getPlay().getAct().get(0);
      Scene initialScene = (Scene) initialAct.getScene().get(0);

    The Model class is a singleton I’ve written to return the root of the domain object tree. Act and Scene are classes JAXB generates. Multiple objects, in this case acts and scenes, are stored in Lists. For the initial status bar display, I’m interested only in the first of each of these, so I can access the first element directly.
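
    A holder of this kind could be sketched as below. This is an assumption about its general shape, not the class from the example download: Play is a stand-in type, and load() is a hypothetical placeholder for wherever the unmarshalling call would go.

```java
class ModelSketch {

    // Stand-in for the generated root class (assumed shape).
    static class Play {
        private final String title;
        Play(String title) { this.title = title; }
        String getTitle() { return title; }
    }

    // Holds the single root of the domain object tree.
    static final class Model {
        private static Play play;

        private Model() { } // no instances; access goes through getPlay()

        // Loads the play on first use and returns the same instance afterwards.
        static synchronized Play getPlay() {
            if (play == null) {
                play = load();
            }
            return play;
        }

        // Hypothetical placeholder: the real application would unmarshal the XML file here.
        private static Play load() {
            return new Play("A Midsummer Night's Dream");
        }
    }

    public static void main(String[] args) {
        System.out.println(Model.getPlay() == Model.getPlay());
    }
}
```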

    In our play schema, both act and scene have an attribute called title. JAXB translates XML attributes to object attributes. So, our Act and Scene classes have getTitle() methods that return String.

    Look at the Line schema element. Notice it has no attributes, but its contents are the text for the line. Treat values like this as unnamed attributes, and access them with the getValue() method.

    Building and Running
    When you’re building or running an application that uses JAXB, you’ll need to include the following three libraries, which Sun distributes with JAXB:

  • jaxb-libs.jar
  • jaxb-api.jar
  • jaxb-ri.jar

    You'll also need an XML parser. If you're using the 1.4.1 or later SDK, then you'll already have one. If not, Xerces from the Apache XML project is a fairly useful and common one.

    Customization Through the Schema
    Sometimes the code generated from the schema won’t meet your exact requirements. Sometimes the schema has defects in it. Sometimes you just want to play around with things. In all of these cases, you can affect the behavior of the code generator through customization. At the moment, the reference implementation (RI) implements customization directly through the schema being processed, but the specification says that external customization scripts eventually will be supported too.

    Customization is done through a set of schema extension points, so it won't affect how the schema itself behaves except when xjc is processing it. The spec allows for many customization points, at many levels of generation. This article doesn't have room for a detailed examination of these or a detailed explanation of schema extensions, but as a sample I'll show how to generate Vectors rather than Lists.

    A schema extension is basically detail you add to the schema to give instructions and extra information to specific processors. In this case, the extensions are aimed at xjc and all other processors should ignore them.

    I’m adding a global customization, which means that it affects all generated code. While this is a simple customization to make, quite a few other more complex ones provide real power to JAXB.

    Here are the changes you must make to the schema to make JAXB generate java.util.Vector rather than List, its default collection (note that the top-level schema tag needs to be changed too):
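
    A minimal sketch of such a customization, assuming the JAXB 1.0 binding namespace and a jxb prefix (the prefix itself is arbitrary); the schema's own element declarations are elided:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
           jxb:version="1.0">

  <xs:annotation>
    <xs:appinfo>
      <!-- Global customization: use Vector for every generated collection property -->
      <jxb:globalBindings collectionType="java.util.Vector"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- existing element declarations stay as they are -->

</xs:schema>
```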


    Without going into too much detail about schemas, the annotation tag introduces a part of the schema that is usually intended for schema processing software. The appinfo tag introduces instructions for a particular processing application (in this case, JAXB’s xjc code-generation tool). Usually, each application uses its own namespace, as JAXB has done here.

    As it happens, simply changing the default collection type from List to Vector has no effect at all on the sample application I’ve presented here. My code works by retrieving iterators from collections, not the collections themselves. So when I generate the new model from this change, my code works without any new faults.
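
    The reason the change is invisible can be shown with a small sketch. The joinTitles method is a hypothetical helper, not part of the example application; it depends only on the List interface and its iterator, so it behaves the same whether it is handed an ArrayList or a Vector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

class CollectionTypeSketch {

    // Joins the titles with commas, using nothing but the iterator.
    static String joinTitles(List<String> titles) {
        StringBuilder out = new StringBuilder();
        for (Iterator<String> it = titles.iterator(); it.hasNext();) {
            out.append(it.next());
            if (it.hasNext()) {
                out.append(", ");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> asList = new ArrayList<>(Arrays.asList("ACT I", "ACT II"));
        List<String> asVector = new Vector<>(Arrays.asList("ACT I", "ACT II"));
        // Same result for both concrete collection types.
        System.out.println(joinTitles(asList).equals(joinTitles(asVector))); // prints "true"
    }
}
```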

    Some people are leery of altering their schemas. The standard supports external annotations, but the current implementation doesn’t. This is something that will be rectified in the future.

    You’re on Your Way
    That's all you need to know for a basic introduction to JAXB. Currently, the RI is in beta, but a final version is expected to be released around mid-February, so you don't have long to wait. The full release isn't going to differ much from the beta (apart from bug fixes), so you can start developing immediately against the beta fairly safely and then change over to the official first version when it's released.

    If you decide to use JAXB technology, read the specification document from Sun, and pay particular attention to the section on customization. The customization I’ve introduced here really is trivial. JAXB is capable of considerably more.

