Discover RELAX NG, A Simple Schema Solution

Many experienced developers loathe changing their DTDs (Document Type Definitions) because of their cumbersome syntax and semantics. For newcomers, DTD syntax just looks like Greek. A recent addition to the DSDL (Document Schema Definition Languages) family called REgular LAnguage for XML Next Generation (RELAX NG) intends to change that. The following are RELAX NG’s main goals:

  • Compact syntax: Maximize readability and simplify the syntax (a RELAX NG schema reads very much like a textual description of a vocabulary).
  • Compatibility: Remain backward compatible with XML 1.0 DTDs.

In fact, because a RELAX NG schema is itself written in XML, it builds directly on core XML concepts such as infosets and namespaces. XSD (XML Schema Definition) also defines schemas in the XML format, but RELAX NG differs in two ways: it supports pluggable simple data type libraries (new ones can be designed and built as needed), and it provides two interconvertible syntaxes, an XML one for processing and a compact non-XML one for human authoring.

This article provides in-depth details of XML-based RELAX NG tags, providing a comparison to XSDs and DTDs wherever applicable. It provides an example that defines the schema in RELAX NG and validates it using JAXB from within a Java application.

RELAX NG Data Model

The RELAX NG abstract data model deals with XML documents that represent both schemas and instances. An XML document is represented by an element, which consists of a name, a context, a set of attributes, and an ordered sequence of zero or more children. Consider the following RELAX NG example of a student roll:
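
A student roll schema along these lines might look like the following minimal sketch. The element and attribute names (studentRoll, student, name, age) are assumptions chosen for illustration, not the article's original listing:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical student roll schema; all names are illustrative -->
<element name="studentRoll"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- A roll holds one or more student elements -->
  <oneOrMore>
    <element name="student">
      <!-- The attribute value is typed by the declared data type library -->
      <attribute name="age">
        <data type="integer"/>
      </attribute>
      <!-- An ordered child whose content is plain text -->
      <element name="name">
        <text/>
      </element>
    </element>
  </oneOrMore>
</element>
```

Each element pattern mirrors the data model described above: a name, a set of attributes, and an ordered sequence of children.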

                        

RELAX NG uses self-explanatory tags to quantify patterns, such as the zeroOrMore, oneOrMore, and optional tags, which would be represented by ‘*’, ‘+’, and ‘?’ respectively in a DTD:

                                                                	  	        			      		"Male"      		"Female"	    	 	

The choice tag provides the functionality of OR (‘|’). In the above snippet, the schema says that a studentID consists of either just a name or a first and last name.
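
The quantifier and choice tags can be sketched together as follows. This is a hedged illustration only: the studentID alternatives and the “Male”/“Female” values come from the text, while the gender, course, and award names are assumptions:

```xml
<element name="student" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- optional corresponds to '?' in a DTD -->
  <optional>
    <attribute name="gender">
      <!-- choice corresponds to '|': exactly one of the listed values -->
      <choice>
        <value>Male</value>
        <value>Female</value>
      </choice>
    </attribute>
  </optional>
  <element name="studentID">
    <!-- Either a single name, or a first name followed by a last name -->
    <choice>
      <element name="name"><text/></element>
      <group>
        <element name="firstName"><text/></element>
        <element name="lastName"><text/></element>
      </group>
    </choice>
  </element>
  <!-- oneOrMore corresponds to '+' -->
  <oneOrMore>
    <element name="course"><text/></element>
  </oneOrMore>
  <!-- zeroOrMore corresponds to '*' -->
  <zeroOrMore>
    <element name="award"><text/></element>
  </zeroOrMore>
</element>
```

In DTD terms, the content model of this hypothetical student element would read (studentID, course+, award*).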

Data Types

RELAX NG can use all the data types defined by the W3C XML Schema data types specification. In the student roll example, the age attribute refers to the integer type defined in the library named by datatypeLibrary. Specifying the URI explicitly is a good practice, but if you omit it on a data element, RELAX NG inherits the value from the nearest enclosing element that declares one, typically the root element (http://www.w3.org/2001/XMLSchema-datatypes).

Data types also can have parameters. For example, a string data type can have a parameter controlling the length of the string. You specify parameters by adding one or more param elements as children of the data element, as in the following example:

      127  
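
A parameterized data type might be written as in this sketch, which limits a string to 127 characters through the XML Schema maxLength facet. The email element name is an assumption:

```xml
<element name="email"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <data type="string">
    <!-- The param name must be a facet that the data type library defines -->
    <param name="maxLength">127</param>
  </data>
</element>
```

Which parameter names are allowed depends on the data type library in use; with the XML Schema library they correspond to facets such as maxLength, minLength, and pattern.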

Data Type Plug-ins

The editors of RELAX NG (and I) believe that no universal data type system can exist and that, beyond some very basic universal types, each application domain has its own requirements. RELAX NG defines a generic mechanism for plugging in external type systems. The current implementations support W3C XML Schema data types.

Let’s walk through an example of pluggable data type use:

                                  ……….

Note that the data type “nonNegativeInteger” is defined in XML Schema Part 2: Datatypes, as is the XML Schema version of “token”. RELAX NG’s own built-in library offers only string and token, so richer types must come from a plugged-in library. With some foresight, you can define your own application data types and use the definitions across all the applications in your domain.
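
As a rough sketch of how a library is plugged in, the datatypeLibrary attribute can be set on each data element, so different patterns may draw on different libraries. The studentSSN and credits element names are assumptions:

```xml
<element name="student" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="studentSSN">
    <!-- An empty datatypeLibrary selects the RELAX NG built-in library -->
    <data type="token" datatypeLibrary=""/>
  </element>
  <element name="credits">
    <!-- This type is resolved by the W3C XML Schema data type library -->
    <data type="nonNegativeInteger"
          datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"/>
  </element>
</element>
```

A domain-specific library would be selected the same way, through its own URI, provided the validator in use has an implementation of that library available.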

To define a custom data type within an application, RELAX NG provides the named patterns define and ref to make the definition and reference of user-defined data types easy and flexible:
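
One way this might look, assuming a hypothetical phoneNumber type that is reused by two elements:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="student">
      <!-- Both elements share the same user-defined type -->
      <element name="homePhone"><ref name="phoneNumber"/></element>
      <element name="mobilePhone"><ref name="phoneNumber"/></element>
    </element>
  </start>
  <!-- define names a pattern; ref points back to it by name -->
  <define name="phoneNumber">
    <data type="string">
      <param name="pattern">[0-9]{3}-[0-9]{3}-[0-9]{4}</param>
    </data>
  </define>
</grammar>
```

Named patterns live inside a grammar element, whose start child identifies the pattern for the document root.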

             

Complex Types

Unlike XSD, RELAX NG has no simpleType and complexType tags. However, you can build complex types by grouping the named patterns define and ref. For example, a complex type studentRoll can be built using references to studentID and studentSSN:
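
A minimal sketch of such a grouping follows. The studentID and studentSSN pattern names come from the text; the surrounding structure and the text-only content are assumptions:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="studentRoll">
      <oneOrMore>
        <element name="student">
          <!-- The "complex type" is simply a group of referenced patterns -->
          <ref name="studentID"/>
          <ref name="studentSSN"/>
        </element>
      </oneOrMore>
    </element>
  </start>
  <define name="studentID">
    <element name="studentID"><text/></element>
  </define>
  <define name="studentSSN">
    <element name="studentSSN"><text/></element>
  </define>
</grammar>
```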

        

Sequence Tag

By default, XSD requires the child elements of a complex type to appear in an ordered sequence. XML tools on the market such as XMLSpy, 4Suite, and Stylus rely on the sequence tag to mandate that order. As a result, even when an instance contains every sub-element, the sub-elements must appear in exactly the order given in the schema. This also constrains inheritance, because it restricts how a subset of elements can be defined and what can be inherited from it. RELAX NG relaxes this restriction by providing the interleave tag:

                     

The above schema specifies that the children of parentNames can be interleaved, so the mother’s name and the optional father’s name need not follow the order specified in the schema.
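
An interleave pattern of this kind could be sketched as follows, assuming hypothetical motherName and fatherName child elements:

```xml
<element name="parentNames" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Children of interleave may appear in any order in the instance -->
  <interleave>
    <element name="motherName"><text/></element>
    <optional>
      <element name="fatherName"><text/></element>
    </optional>
  </interleave>
</element>
```

Under this sketch, an instance listing fatherName before motherName validates just as well as one listing them in schema order, and fatherName can be left out entirely.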

RELAX NG in Java

The JAXB libraries in the Java Web Services Developer Pack (Java WSDP) 1.5 include functionality for RELAX NG. JAXB 1.5 provided the initial support for RELAX NG grammar.

The example “publicHolidays.rng” in the downloadable code for this article provides the schema for a list of public holidays in the US. The following tags specify the package name to be used for the Java classes generated by JAXB:

      

After JAXB generates the Java files, you compile them. In the example code, the RELAXNGExample parses the sample XML file and prints the contents. Independence Day is incorrectly defined as June 4th in the sample publicHoliday.xml file. RELAXNGExample corrects the date within the Java code by setting the month to July, revalidates it, and prints the new contents to the screen.

RELAX NG and JAXB: Simple XML

JAXB makes parsing and loading XML files into Java applications simple; RELAX NG makes defining the XML schemas easy. Use RELAX NG and JAXB in your next enterprise Java application. Why wait, when it’s so simple and easy?
