Discover Microformats for Embedding Semantics

Developing for the semantic web will require working with the complexities of several semantic technologies. However, there's a simpler approach that can give you a jump start in bringing semantics to your web applications.

The Resource Description Framework (RDF) and Web Ontology Language (OWL) are important technologies driving development on the road to the semantic web. The former is a set of World Wide Web Consortium (W3C) specifications that provide a model for representing metadata through specific statements, or triples, made up of subject-predicate-object representations for specific resources. Data from disparate stores can then be mashed together or built into resources of machine-readable information, which can be processed, exchanged, and stored by web-based applications. The latter technology, OWL, is currently a W3C recommendation for a language that can be applied to define and represent data models more effectively than other metadata languages, such as XML, through the use of semantics and a vocabulary that provide class and property descriptions.

These and other technologies that are gaining adoption in building out the semantic web involve highly complex concepts and analytics, and it will take time for developers and designers to refine and implement the semantic web's ability to extract, process, and deliver intelligent information efficiently. Despite the learning curve, many people building applications that aim to provide Web 3.0 capabilities will need to begin embedding semantics now to get a head start on delivering more meaningful, information-based content to customers and users. One way to embed semantics for selected applications is to use microformats.

Microformats are open, freely available conventions for semantic markup in HTML or Extensible HTML (XHTML). They build on the standard HTML tags and attributes that have been used widely for some time to create web pages. Consider microformats as a means of lowering the barrier to entry into semantic web development. They can even give web page designers without extensive programming experience the ability to publish data that programs can consume. This discussion will take a look at microformats and demonstrate how they can be a convenient stepping-stone for developers and designers looking to participate in the evolution of the semantic web.

Embedded Semantics
Rather than reinvent the web, microformats allow developers to approach the problem of embedding semantics from the perspective of existing and widely adopted web standards. Standard HTML markup describes only how text should be formatted; it says nothing about what the text means. Microformat-aware browsers can parse microformat markup and use it to help extract meaning from web pages. Microformats also allow programs such as web crawlers to recognize items like contact information and events, which can then be added to address books and calendars. Finally, microformats give you the ability to aggregate content or create "mashups," such as adding a restaurant review to a MapQuest map.

Although they are in effect an attempt to turn a medium designed for publishing and presentation into something that is dynamic and programmable, it is important that microformats be designed for humans first and for machines second. Microformats should therefore be human readable and easily understood by content authors and designers as well as more experienced programmers. One way to think about microformats is that they are about people, events, places, and things, rather than just pages.

Think about the last developer conference you attended. Wouldn't it have been easier to manage your workshop schedule if the agenda on the conference web site could have found its way directly into the calendar on your laptop or PDA? Microformats can enable this kind of scenario.

From a technical point of view, microformats are a form of semantic markup using standard XHTML encoded with specific HTML attributes such as class, rel, and rev. What a machine would otherwise see as just text gains meaning through its context: items are wrapped in span (or other HTML) elements whose class names are part of a specific microformat specification. A set of property class names, for example formatted name (fn), organization (org), telephone number (tel), and url, is grouped inside an element that carries a root class name (in this case, class="vcard"), and the whole group can be understood as a single, specific entity for the purposes of data exchange. Once detected, this information can be extracted by software and reused (indexed, searched for, saved, or cross-referenced). When encoded in this way, it can be used by web services in a more programmatic manner or imported into desktop applications.
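To make the class-name mechanism concrete, the following is a minimal sketch in Python of how software might detect and extract an hCard. The sample markup, the person it describes, and the names HCardParser and parse_hcard are all invented for illustration; the sketch handles only the four properties named above and is not a complete hCard parser.

```python
from html.parser import HTMLParser

# Hypothetical hCard markup; the person and values are invented for illustration.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Corp</span>,
  <span class="tel">+1-555-0100</span>,
  <a class="url" href="http://example.com/">home page</a>
</div>
"""

# Only the four property class names mentioned in the text.
HCARD_PROPERTIES = {"fn", "org", "tel", "url"}


class HCardParser(HTMLParser):
    """Collects a few hCard properties found inside a class="vcard" element.

    Simplification: assumes every start tag has a matching end tag, so
    unclosed void elements such as <br> inside the card are not handled.
    """

    def __init__(self):
        super().__init__()
        self.card = {}
        self._depth = 0       # nesting depth inside the vcard root (0 = outside)
        self._current = None  # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if self._depth:
            self._depth += 1
        elif "vcard" in classes:
            self._depth = 1   # entered the root element of the card
            return
        if not self._depth:
            return            # markup outside any vcard is ignored
        found = classes & HCARD_PROPERTIES
        if found:
            name = sorted(found)[0]
            if name == "url" and attrs.get("href"):
                # For links, the value is the href, not the link text.
                self.card["url"] = attrs["href"]
            else:
                self._current = name
                self.card.setdefault(name, "")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
        self._current = None

    def handle_data(self, data):
        if self._current:
            self.card[self._current] += data.strip()


def parse_hcard(html):
    """Return a dict of the hCard properties found in the given HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card


if __name__ == "__main__":
    print(parse_hcard(HCARD_HTML))
```

A human visitor sees only an ordinary line of contact text, while the parser recovers named fields from the same markup. This reflects the "humans first, machines second" design goal.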

Microformats are not unlike XML tags, but there is one important difference: instead of allowing everyone to create their own custom tags, microformats are derived from existing standards as much as possible. For example, hCard maps one-to-one to the vCard standard that has been in use for years by desktop applications like Microsoft Outlook and Apple's Address Book.
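Because hCard class names correspond directly to vCard properties, converting extracted data into a file that a desktop address book can import is largely a matter of renaming fields. The sketch below is one possible illustration, not a reference converter. The function names and the sample data are hypothetical, and it targets vCard 4.0 as an assumption, because that version requires only VERSION and FN. The article itself does not name a version. Line folding for long values is omitted.

```python
# hCard class name -> vCard property name (only the four properties discussed).
HCARD_TO_VCARD = {"fn": "FN", "org": "ORG", "tel": "TEL", "url": "URL"}


def escape_text(value):
    """Escape characters that are special in vCard text values."""
    return (value.replace("\\", "\\\\")
                 .replace("\n", "\\n")
                 .replace(",", "\\,")
                 .replace(";", "\\;"))


def hcard_to_vcard(card):
    """Serialize a dict of hCard properties as vCard text (4.0 assumed)."""
    if not card.get("fn"):
        raise ValueError("a vCard needs a formatted name (fn)")
    lines = ["BEGIN:VCARD", "VERSION:4.0"]
    for class_name, prop in HCARD_TO_VCARD.items():
        value = card.get(class_name)
        if not value:
            continue
        if prop != "URL":
            # URL values are URIs and are written as-is; the rest are text.
            value = escape_text(value)
        lines.append(f"{prop}:{value}")
    lines.append("END:VCARD")
    # vCard content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Invented sample data for illustration.
    sample = {
        "fn": "Jane Doe",
        "org": "Example Corp",
        "tel": "+1-555-0100",
        "url": "http://example.com/",
    }
    print(hcard_to_vcard(sample))
```

The mapping table is the whole point of the design: no new vocabulary is invented, so existing vCard-aware applications can consume the result.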

By basing microformat markup on easily understood and widely adopted standards, the authors of microformats hope to speed adoption and to provide a more general form of markup than the myriad industry-specific forms of XML.
