Writing Functional Code with RDFa

News feeds in all their manifestations, both with and without RDF, have a long tradition as structured data on the web. RDF, the data model, can state relations between entities. For example, the relation between a person and a feed could be modeled as creator.

In contrast, HTML is about structure and presentation; the semantics of the conveyed data are not, and cannot be, represented. Presentation-oriented formats such as HTML are useful for users, but typically require rather expensive back-end processing (along with heuristics) before machines can make sense of the data.

However, the RDF data model is useless without serialization syntaxes that are available to exchange representations online. To date, RDF/XML is the only official RDF serialization syntax that is available for developers to use.

Unfortunately, this means that it is not feasible to use RDF, even as RDF/XML, directly in HTML, which is an ongoing issue. An RDF serialization has to persist a graph structure (be it in XML or in another form), and because the order of statements in a graph is irrelevant, interoperability issues can arise for use cases where order matters, such as news feeds. However, when HTML is used as the container for RDF, structural elements can be preserved while defining and carrying arbitrary vocabularies (such as FOAF, SIOC, Dublin Core, and DOAP).

Before discussing the details of feed rendering, take a closer look at the environment. ARC2 is a Semantic Web library for PHP that is simple to set up and offers useful features. For example, several readers (from microformats to OpenID to RSS) are built in, which lets you parse a wide range of structured data formats and have them available as RDF. Further, ARC2 supports all common RDF serializations, such as RDF/XML, Turtle, and N-Triples, and offers several plug-ins that extend the functionality of the base system. With ARC2 you can implement a compliant SPARQL endpoint with only three lines of code. A SPARQL-based scripting mechanism was added recently. These capabilities make ARC2 a versatile tool that you can use to create an RSS 1.0-to-RDFa converter service. The store is configured and initialized as follows:

include_once("../../Apache2.2/htdocs/arc2/ARC2.php");

/* ARC RDF store config */
$config = array(
  'db_host' => 'localhost',
  'db_name' => 'arcdb',
  'db_user' => 'arc',
  'db_pwd' => '',
  'store_name' => 'rss2rdfa'
);
$store = ARC2::getStore($config);

/* global store init (one shot) */
if (!$store->isSetUp()) {
  $store->setUp();
}

First, declare the prefix mappings that are usable in the entire store:

$NAMESPACES = array(
  'xsd' => 'http://www.w3.org/2001/XMLSchema#',
  'rdf' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  'rdfa' => 'http://www.w3.org/1999/xhtml/vocab#',
  'rdfs' => 'http://www.w3.org/2000/01/rdf-schema#',
  'owl' => 'http://www.w3.org/2002/07/owl#',
  'foaf' => 'http://xmlns.com/foaf/0.1/',
  'dc' => 'http://purl.org/dc/elements/1.1/',
  'dcterms' => 'http://purl.org/dc/terms/',
  'skos' => 'http://www.w3.org/2004/02/skos/core#',
  'sioc' => 'http://rdfs.org/sioc/ns#',
  'sioct' => 'http://rdfs.org/sioc/types#',
  'xfn' => 'http://gmpg.org/xfn/11#',
  'twitter' => 'http://twitter.com/',
  'rss' => 'http://purl.org/rss/1.0/'
);

Generating the RDFa-based Feed
The main issue in creating an RDFa-based feed is that of semantics vs. structure. The order of the feed items is significant, and hence it needs to be preserved.
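To illustrate the ordering problem: in RSS 1.0 the item list is an rdf:Seq, so once the feed is parsed into triples the position of each item survives only in the membership properties rdf:_1, rdf:_2, and so on. The following is a minimal sketch of how that order could be restored before rendering. The function name orderSeqMembers and the shape of its input (a list of predicate/object pairs) are assumptions made for this example and are not part of ARC2.

```php
<?php
// Hypothetical helper (not part of ARC2): restore feed order from the
// rdf:Seq membership properties rdf:_1, rdf:_2, ...
const RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

/**
 * @param array $pairs list of array('p' => predicate URI, 'o' => item URI)
 * @return array item URIs in sequence order
 */
function orderSeqMembers(array $pairs): array
{
    $pattern = '/^' . preg_quote(RDF_NS, '/') . '_([1-9][0-9]*)$/';
    $members = array();
    foreach ($pairs as $pair) {
        // Keep only rdf:_n predicates and remember their numeric index.
        if (preg_match($pattern, $pair['p'], $m)) {
            $members[(int) $m[1]] = $pair['o'];
        }
    }
    // Sort by index so that rdf:_10 comes after rdf:_2.
    ksort($members, SORT_NUMERIC);
    return array_values($members);
}
```

Sorting on the numeric index rather than on the predicate string matters as soon as a feed has ten or more items, because a plain string sort would place rdf:_10 before rdf:_2.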

The XHTML+RDFa header typically looks like this:

$r = "	XHTML+RDFa from $URI
";

The code defines the namespaces for the RDF vocabularies used in the document. You must define the namespaces somewhere, although you can do so at any level: any element can carry the definitions, as long as it encloses the markup that uses the prefixes.
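As a rough sketch of how such namespace definitions could be generated rather than typed by hand, the helper below turns a prefix map into xmlns attributes that can be placed on any enclosing element. The function name xmlnsAttributes and the two-entry example map are assumptions for illustration only.

```php
<?php
// Hypothetical helper: turn a prefix => URI map into xmlns:* attributes.
function xmlnsAttributes(array $namespaces): string
{
    $attrs = array();
    foreach ($namespaces as $prefix => $uri) {
        // Escape the URI so it is safe inside a double-quoted attribute.
        $attrs[] = sprintf(
            'xmlns:%s="%s"',
            $prefix,
            htmlspecialchars($uri, ENT_QUOTES, 'UTF-8')
        );
    }
    return implode(' ', $attrs);
}

// Usage with a small example map (any prefix map of the same shape works).
$example = array(
    'dc'  => 'http://purl.org/dc/elements/1.1/',
    'rss' => 'http://purl.org/rss/1.0/',
);
echo '<div ' . xmlnsAttributes($example) . '>', "\n";
```

Generating the attributes from one map keeps the prefixes used in the markup in step with the prefixes known to the store.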

Next, parse the RSS 1.0 feed (in RDF/XML) from a given URI, and load the resulting triples into the ARC2 store:

$q = "LOAD <$URI> INTO <$URI>";
$rs = $store->query($q);

And then process the feed items further:

if (!$store->getErrors()) {			$r .= " 
"; $r .= getChannelInfo($URI); $r .= getChannelItemList($URI); $r .= "
"; $r .= getChannelItems($URI); $r .= "
"; $r .= "
"; } $r .= " ";

Listing 1 contains two important methods: getChannelInfo and getChannelItems. At the very end, you need to clean up the store again to improve performance by deleting the graph that was loaded for the feed. You don’t have to worry about privacy issues, because this is an online converter service:

$q = "DELETE FROM <$URI>";
$rs = $store->query($q);

Basically, you are now done with the code. However, there are a few tasks that you can do to extend the service: order items according to topics and/or creator, implement filters, and optimize to provide for lossless conversion, etc.

The converted RSS 1.0 feed (for example http://identi.ca/mhausenblas/all/rss) in XHTML+RDFa is represented as:

mhausenblas and friends

Feed for friends of mhausenblas
...

Further, a single news item in the feed looks like this:

dave: [FriendFeed]: Utah Open Source Conference, later this month ... Dave Winer2008-08-11T01:54:08+00:00

dave's status on Monday, 11-Aug-08 01:54:08 UTC
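The markup behind such an item is not reproduced here, so the following is only a sketch of how one feed item could be rendered as an XHTML+RDFa fragment. The function name renderItem, the keys of the input array, and the choice of rss:title, dc:creator, and dc:date are assumptions for illustration; the fragment also assumes that the rss and dc prefixes are declared on an enclosing element.

```php
<?php
// Hypothetical sketch: render one feed item as an XHTML+RDFa fragment.
// Assumes the rss: and dc: prefixes are declared on an enclosing element.
function renderItem(array $item): string
{
    // Escape every value before it goes into markup.
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    };
    return sprintf(
        '<div about="%s" typeof="rss:item">'
        . '<span property="rss:title">%s</span> '
        . '<span property="dc:creator">%s</span> '
        . '<span property="dc:date">%s</span>'
        . '</div>',
        $esc($item['uri']),
        $esc($item['title']),
        $esc($item['creator']),
        $esc($item['date'])
    );
}

// Usage; the item URI below is a placeholder, not the real notice URI.
echo renderItem(array(
    'uri'     => 'http://example.org/notice/1',
    'title'   => 'dave: [FriendFeed]: Utah Open Source Conference, later this month ...',
    'creator' => 'Dave Winer',
    'date'    => '2008-08-11T01:54:08+00:00',
)), "\n";
```

The about attribute sets the subject for everything inside the element, so each property attribute adds one statement about that item while the text remains readable in a browser.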

Usage
Converting an RSS 1.0 feed into an XHTML+RDFa representation is likely of little value on its own. However, using such a feed in a reader (such as netvibes.com) would be a first step, though I doubt there are current implementations that accept XHTML+RDFa.

SPARQLScript has a nice demo on how to create semantic mashups. Further, linking the content (or specific metadata) of a feed item to a dataset such as DBpedia or Geonames makes new use cases possible. From integrating other sources (for example, mapping hash tags from microblogs to DBpedia entities), to cross-site queries regarding a certain user, the possibilities are limited only by your imagination.

Issues
The downside to using RDFa is that not every tool currently supports it. For example, to query the example feed, you would naturally use SPARQL. However, most SPARQL engines today, such as http://sparql.org, require RDF/XML as input. Therefore, you’d need an RDFa processor such as the RDFa Distiller to convert the RDFa serialization into an RDF/XML serialization first.
