

Using RDFa with DITA and DocBook

Learn how to add RDFa metadata to DITA and DocBook documents, how to keep those documents valid, and what advantages this technique can bring to a DITA- or DocBook-based publishing system.

The RDF data model gives you a way to add attribute name/value pairs to any resource that you can reference with a URI. This makes it easy to create metadata about nearly anything. The W3C's RDFa standard is an increasingly popular syntax for storing RDF statements inside HTML documents, but according to the RDFa in XHTML: Syntax and Processing W3C Recommendation, "RDFa is a specification for attributes to express structured data in any markup language" (my emphasis).

The recommendation goes on to state: "this specification deals specifically with the use of RDFa in XHTML, and defines an RDF mapping for a number of XHTML attributes, but RDFa can be easily imported into other XML-based markup languages." This flexibility can come in very handy when you work with publishing systems based on the DITA or DocBook XML specifications. The flexibility of the DocBook and DITA Document Type Definitions (DTDs) makes it easy to add RDFa to any documents that conform to these standards, and perhaps even to reduce your need to further customize these DTDs for your own company's publishing system.

This article explains how to create valid DocBook and DITA documents that incorporate the RDFa metadata demonstrated in the W3C's RDFa Primer. The RDFa Primer demonstrates how to add machine-readable metadata to simple HTML examples by placing RDFa attributes in the appropriate places. If you're new to RDFa, read the W3C's RDFa Primer first and treat this article as a refresher and sequel that explains how to use RDFa outside of XHTML, particularly in DocBook and DITA. The attached source code download combines the embedded data from the RDFa Primer examples, plus a few new ones, with a DocBook document and a DITA document. It also includes add-on DTD modules to make these sample documents valid.

Inside the RDF Data Model

The RDF data model stores information in a simple data structure known as a triple, so named because it has three parts: a subject, a predicate, and an object. In more database-oriented terms, think of these three parts as a resource ID, an attribute name, and an attribute value. For example, a triple could store the statement "index.html has a title of 'My Home Page'."

Figure 1. Connecting Two Triples: When the same resource (in this case, index.html) is the object of one triple and the subject of another, you can combine statements in ways that let you answer new questions.

RDF requires that the subject and predicate in a triple be represented by URIs. After all, many web pages have the filename index.html, and the word "title" could mean a job title, the deed to a piece of property, or the title of a work. So, a triple consisting of {http://www.snee.com/bob/index.html, http://purl.org/dc/elements/1.1/title, "My Home Page"} makes it clear exactly which index.html the triple describes, and that "title" is meant in the Dublin Core sense of the term: the title of a work.
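
As a rough illustration of this structure, the triple above can be modeled in a few lines of Python. This is only a sketch: the `Triple` class and the `describe` helper are hypothetical names invented for this example, not part of any RDF library, and the object is simply treated as a plain string literal.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one RDF statement."""
    subject: str    # URI of the resource being described
    predicate: str  # URI that names the property
    obj: str        # the property's value (a plain literal in this sketch)


# Dublin Core "title" property URI, as used in the article's example.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

title_triple = Triple(
    subject="http://www.snee.com/bob/index.html",
    predicate=DC_TITLE,
    obj="My Home Page",
)


def describe(t: Triple) -> str:
    """Render the triple on one line: URIs in angle brackets, literal in quotes."""
    return f'<{t.subject}> <{t.predicate}> "{t.obj}" .'


print(describe(title_triple))
```

Because the subject and predicate are full URIs, nothing in the printed statement depends on knowing which site or which vocabulary the author had in mind.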

The "My Home Page" part of this triple demonstrates that the third part need not be a URI. If it is a URI, though, you can more easily connect triples together to learn more from the combination (see Figure 1). For example, if another triple says {http://www.someclub.org/memberID/4329, http://xmlns.com/foaf/0.1/homepage, http://www.snee.com/bob/index.html} (or, in English, "someclub.org member 4329 has a home page at http://www.snee.com/bob/index.html"), then the two triples together tell you that the person represented by the URI http://www.someclub.org/memberID/4329 has a home page with the title "My Home Page."
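
The way these two triples combine can be sketched with plain Python tuples. This is an illustrative assumption about how such a lookup might be written, not how an RDF store or a SPARQL engine actually works; the function name `homepage_titles` is invented for this example.

```python
# Property URIs taken from the article's two example triples.
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Each triple is a (subject, predicate, object) tuple.
triples = [
    ("http://www.snee.com/bob/index.html", DC_TITLE, "My Home Page"),
    ("http://www.someclub.org/memberID/4329", FOAF_HOMEPAGE,
     "http://www.snee.com/bob/index.html"),
]


def homepage_titles(triples, person):
    """Return the titles of the home pages that `person` points to."""
    # Step 1: find every page that is the object of a homepage triple
    # whose subject is the given person.
    pages = {o for s, p, o in triples if s == person and p == FOAF_HOMEPAGE}
    # Step 2: find title triples whose subject is one of those pages.
    return [o for s, p, o in triples if s in pages and p == DC_TITLE]


titles = homepage_titles(triples, "http://www.someclub.org/memberID/4329")
print(titles)  # ['My Home Page']
```

The join works only because the object of the second triple is the same URI as the subject of the first. Had the home page been recorded as a free-text string, there would be nothing to match on.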

More complex inferencing from larger data sets is making RDF and related standards such as OWL and SPARQL popular in biopharmaceutical research and other domains looking to draw connections among disparate sets of data.
