Microformats are trouble. A microformat (which is discussed in "Discover Microformats for Embedding Semantics") makes use of HTML (or XHTML) attributes as a shorthand replacement for some formal semantics. There's nothing at all wrong with this approach. The problem is that most microformats as they exist right now assume that all data fits more or less neatly into one of perhaps a dozen distinct formats, and that those formats can be encoded in a fairly ad hoc way.
So long as the information being encoded is a vcard, a calendar event, a friend of a friend, or one of a few other similar syntaxes, this approach isn't necessarily a bad thing, particularly if the format contains the information you need and if you assume that you're usually coding for only one or two web services (such as del.icio.us or digg).
Suppose that you are a journalist and you want to encode specific metadata into your stories. One approach is to use a formal XML document. However, while such XML documents are useful for providing strictly structured information, a writer wants to write, and in general would prefer just to annotate his or her writing with some kind of associational editor. If that's the case, vcard is not likely to cut it for encoding all of the information the journalist needs while still allowing him or her to write the narrative.
"Hmmm...," you may be thinking, "what if you were to use...CURIEs?" Yes!
But not so fast. While microformats in general have some problems, one thing they do recognize is that you need some kind of framework for putting this structure in place: a specific set of attributes that are used to solve the problems of encoding.
RDFa defines a number of attributes that can be added to the HTML or XHTML model, and that together have the effect of defining within such a document all the same data as would be contained in an RDF document. RDFa is, in essence, RDF for microformats, but it includes enough underlying structure that you can parse through the contents and reconstruct RDF from it.
For instance, suppose that you have a technical article about CURIEs, and you want to embed some metadata about it, such as publishing information, using terms from the Dublin Core Metadata Initiative. Dublin Core is a particularly useful schema for web pages. Its initial role was to provide some form of hint about web documents that wasn't necessarily contained in the formal HTML, but the idea of creating secondary RDF files containing Dublin Core information has never really caught on. Even so, it's not a radical jump to go from a basic markup page to something like this:
<title>CURIE Eleison: Simplifying RDFa Notation</title>
<h1 property="dc:title">CURIE Eleison</h1>
<h1 property="dc:subtitle">Simplifying RDFa Notation</h1>
<h1>by <span property="dc:creator">Kurt Cagle</span></h1>
<h1 property="dc:date" datatype="xs:date"
content="2007-09-05">September 5, 2007</h1>
There are several points to note in this particular case. Notice the declaration of the Dublin Core namespace in the <html> element.
The use of namespaces in XHTML documents beyond the XHTML namespace itself points to one of the major benefits of XHTML over traditional HTML: you can introduce specific namespace content into the XHTML, extending it to either add metadata or provide hooks for other processes (such as the graphical SVG or XForms). The preceding example declares the Dublin Core namespace, but note that no element or attribute name in the code uses any dc-defined identifier. Instead, it uses the dc prefix to identify terms within property attributes as belonging to Dublin Core. In other words, these are CURIEs.
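To make the idea of a CURIE concrete, here is a minimal Python sketch of how a processor might expand a value such as dc:title into a full URI. The PREFIXES table and the expand_curie function are hypothetical names introduced only for illustration; the sketch assumes the dc prefix is bound to the Dublin Core elements namespace (http://purl.org/dc/elements/1.1/), as an xmlns:dc declaration on the <html> element would normally do.

```python
# Hypothetical prefix table. In a real page these bindings would come
# from the xmlns:* declarations on the <html> element.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a CURIE such as 'dc:title' into a full URI string."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        # No colon, or a prefix that was never declared.
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("dc:title"))
# prints http://purl.org/dc/elements/1.1/title
```

The prefix is only a local abbreviation: two documents could bind different prefixes to the same namespace and still produce identical expanded URIs.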
Jumping past the <style> block for a moment, the property attribute declares the properties associated with either the document itself or some portion of the document. For instance:
<h1 property="dc:title">CURIE Eleison</h1>
indicates that this particular <h1> element is also a container that holds a Dublin Core title. In general, the specific element that the property is attached to is irrelevant from the standpoint of RDFa. The previous example could just as readily have been a <div> element, and the RDFa would nonetheless be the same.
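The point that the host element does not matter can be illustrated with a short Python sketch that collects property values from markup using the standard library's html.parser module. This is not a conforming RDFa processor: PropertyCollector and extract_properties are hypothetical names, nested annotated elements are not handled, and the sketch assumes that a content attribute, when present, supplies the value in place of the element's visible text.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements carrying a property attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # finished (property, value) pairs
        self._current = None   # (tag, property, content) while inside an annotated element
        self._text = []        # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Simplification: ignore annotated elements nested inside another one.
        if self._current is None and "property" in attrs:
            self._current = (tag, attrs["property"], attrs.get("content"))
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            _, prop, content = self._current
            # Assumption: the content attribute takes precedence over the text.
            value = content if content is not None else "".join(self._text).strip()
            self.pairs.append((prop, value))
            self._current = None


def extract_properties(markup):
    """Return the (property, value) pairs found in a fragment of markup."""
    collector = PropertyCollector()
    collector.feed(markup)
    collector.close()
    return collector.pairs


as_h1 = extract_properties('<h1 property="dc:title">CURIE Eleison</h1>')
as_div = extract_properties('<div property="dc:title">CURIE Eleison</div>')
print(as_h1 == as_div)  # the element name plays no part in the result
```

Because the collector never records the element name in its output, swapping <h1> for <div> yields the same pairs, which mirrors the behavior described above.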