

Using RDFa with DITA and DocBook : Page 4

Learn how to add RDFa metadata to DITA and DocBook documents, how to keep those documents valid, and what advantages this technique can bring to a DITA- or DocBook-based publishing system.



DITA XML documents usually have a document type of task, reference, or concept. The ditarefsample.xml sample file is a DITA reference document with content that parallels dbrdfasample.xml.

Author's Note: Like dbrdfasample.xml, ditarefsample.xml includes numbers after proper names to make the correspondence between the content and extracted triples clearer.

One RDFa attribute not previously mentioned (because it's not used in the W3C RDFa Primer) is rev. This attribute expresses an object-predicate-subject relationship, the "reverse" of the subject-predicate-object relationship described with a rel attribute. DITA already has its own rev attribute, short for "revision level," that you can add to most elements. If you want to use standardized RDFa software to extract metadata and still track the revision level of specific elements, you can use rev for RDFa and declare a new revision attribute to fill the role formerly held by the DITA rev attribute.
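To make the rel/rev distinction concrete, here is a minimal Python sketch, not a conforming RDFa processor. It assumes every element carries explicit about and resource attributes, treats rel and rev as single values, and leaves prefixes such as ex: unexpanded. The element choice, the URIs, and the ex:cites predicate are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the URIs and the ex:cites predicate are made up
# for illustration, and prefixes are left unexpanded (no CURIE resolution).
SAMPLE = """
<root>
  <ph about="http://example.org/doc1" rel="ex:cites" resource="http://example.org/doc2"/>
  <ph about="http://example.org/doc1" rev="ex:cites" resource="http://example.org/doc3"/>
</root>
"""

def link_triples(xml_text):
    """Return (subject, predicate, object) tuples for rel and rev attributes."""
    triples = []
    for el in ET.fromstring(xml_text).iter():
        about = el.get("about")
        resource = el.get("resource")
        if about is None or resource is None:
            continue  # this sketch only handles fully explicit elements
        if el.get("rel") is not None:
            # rel: the about value is the subject, the resource is the object
            triples.append((about, el.get("rel"), resource))
        if el.get("rev") is not None:
            # rev: same predicate, but subject and object trade places
            triples.append((resource, el.get("rev"), about))
    return triples

TRIPLES = link_triples(SAMPLE)
for triple in TRIPLES:
    print(triple)
```

The first element says that doc1 cites doc2. The second, using rev, says that doc3 cites doc1, even though doc1 is still the value of the about attribute.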

The sample DITA reference document that incorporates RDFa metadata is called ditarefsample.xml, and it points to the DITARefRDFa.dtd DTD shown in Listing 3. Like DocbookRDFa.dtd, this DTD references the RDFaAttributes.mod module that declares the RDFa attributes, redefines a parameter entity from the standard DTD (in the DITA case, base-attribute-extensions) to include these new attributes, and includes the standard DITA DTD for reference documents: reference.dtd. Because it points at that DTD, the DITA reference document shown in Listing 4 can be validated as XML while also storing RDFa triples.

As with the DocBook example, you can now add the new attributes just about anywhere you want in the DITA document, but whenever possible the sample document places them in elements that the DITA standard provides for metadata. For example, the mpc:editor and mpc:workFlowStage values are stored in a prolog element right after the document's title. Unlike DocBook, DITA has no specialized elements for wrapping references to images, so I used ph elements (DITA's equivalent of DocBook's phrase element) to hold the attributes for the sunset.jpg image's lastScreenShotDate and softwareRelease values.
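The following Python sketch shows roughly how literal-valued triples could be pulled from attributes placed in prolog and ph elements. It is an illustration under stated assumptions rather than the markup from Listing 4: the sample fragment, the use of a data element, the about values, and the literal values "draft" and "2.0" are all invented, and the mpc: prefix is left unexpanded.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA-like fragment; the attribute values are invented and
# do not reproduce the article's ditarefsample.xml.
SAMPLE = """
<reference id="sample">
  <title>Sample reference</title>
  <prolog>
    <data about="ditarefsample.xml" property="mpc:workFlowStage" content="draft"/>
  </prolog>
  <refbody>
    <section>
      <p>Screenshot taken for release
        <ph about="sunset.jpg" property="mpc:softwareRelease">2.0</ph>.</p>
    </section>
  </refbody>
</reference>
"""

def literal_triples(xml_text):
    """Collect (subject, predicate, literal) tuples from property attributes."""
    triples = []
    for el in ET.fromstring(xml_text).iter():
        about = el.get("about")
        prop = el.get("property")
        if about is None or prop is None:
            continue  # sketch: only elements with an explicit subject
        content = el.get("content")
        # A content attribute, when present, overrides the element's text
        value = content if content is not None else "".join(el.itertext()).strip()
        triples.append((about, prop, value))
    return triples

TRIPLES = literal_triples(SAMPLE)
for triple in TRIPLES:
    print(triple)
```

A real RDFa extractor also resolves subjects inherited from ancestor elements and expands prefixed names to full URIs, which this sketch deliberately skips.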

What About Existing DocBook and DITA File Processing?

You may wonder how the metadata additions described in the previous sections affect the processing of DocBook and DITA files. In short, they don't. Popular frameworks for converting DocBook and DITA files into HTML, PDF, and other output formats (for example, the DocBook XSL Stylesheets and the DITA Open Toolkit) process the documents you give them by pulling values from the elements and attributes they need and ignoring the other attributes. In fact, this was part of the design of both DTDs: to let you add new attributes without worrying about backward compatibility. The "a" in RDFa helps keep this new add-on module from being too intrusive when added to other DTDs and schemas, because you don't need to incorporate any new elements into the target DTD's content models.
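The principle can be illustrated with a toy converter in Python. This is not how the DocBook XSL Stylesheets or the DITA Open Toolkit are implemented. It is a hypothetical sketch of a processor that reads only the element names it knows about, so the extra RDFa attributes never reach the output.

```python
import xml.etree.ElementTree as ET
from html import escape

# The same hypothetical paragraph, without and with RDFa attributes.
PLAIN = "<p>Taken with release <ph>2.0</ph> of the software.</p>"
WITH_RDFA = ('<p>Taken with release <ph about="sunset.jpg" '
             'property="mpc:softwareRelease">2.0</ph> of the software.</p>')

def to_html(xml_text):
    """Toy converter: maps p to <p> and ph to <span>, reading no attributes."""
    tag_map = {"p": "p", "ph": "span"}

    def render(el):
        tag = tag_map.get(el.tag, "div")
        # Element text, then each child followed by the text trailing it
        inner = escape(el.text or "") + "".join(
            render(child) + escape(child.tail or "") for child in el
        )
        return f"<{tag}>{inner}</{tag}>"

    return render(ET.fromstring(xml_text))

HTML_PLAIN = to_html(PLAIN)
HTML_RDFA = to_html(WITH_RDFA)
print(HTML_RDFA)
print(HTML_PLAIN == HTML_RDFA)
```

Both inputs produce identical HTML, because the converter never asks for the about or property attributes.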

Speaking of the add-on module, in this article's examples the syntax for including DTD declarations for the RDFa attributes is a bit simplified. In a production system, you should use DocBook or DITA conventions to incorporate the full XHTML Metainformation Attributes Module into your system. As the official DTD module from the standard, this module makes your system more compliant with that standard. The DITA DTDs already declare a datatype attribute for the data element and a content attribute for the othermeta element. To avoid warning messages about the RDFa declarations making those attributes redundant, your production version should also re-declare those elements' attribute lists without those attributes.

What Have You Learned?

Adding RDFa to your DocBook or DITA documents has a nice payoff: easier addition of metadata that can be extracted by existing tools that follow an open standard. And it comes at a minimal cost.

Try adding some of the sample metadata shown in this article to your own documents, and then modify it to reflect metadata your system needs but that doesn't have an obvious place in your existing DTD. As tools for aggregating and manipulating RDF triples proliferate, you'll find an increasing amount of technology that can help you get more out of your own content and metadata.

Read the actual RDFa specification, the "RDFa in XHTML: Syntax and Processing Recommendation," to learn more about the RDFa attributes discussed here and several others.

Bob DuCharme, a solutions architect at Innodata Isogen, was an XML "expert" when XML was a four-letter word. He's written four books and nearly 100 on-line and print articles about information technology without using the word "functionality" in any of them. See his blog at snee.com/bobdc.blog for more.