Capturing Knowledge with RDF
There is wide consensus that the triple-based model of RDF is simpler than RDF/XML, its XML "serialization format." Because of this, several simpler notations have been created for quickly capturing knowledge as a list of triples. Let's walk through a simple scenario where we express concepts in four different ways: as natural language sentences, in a simple triple notation called N3, in RDF/XML serialization format, and, finally, as a graph of the triples.
Following the linguistic model of subject, predicate, and object, we start with three English statements:
Buddy Belden owns a business.
The business has a Web site accessible at http://www.c2i2.com/~budstv.
Buddy is the father of Lynne.
You could imagine extracting sentences like these from the daily routines and processes of your business. There are even products that can scan email and documents for common nouns and verbs. Capturing statements in a formal way allows the gradual aggregation of a corporate knowledge base in which you record processes and best practices and spot trends. This is knowledge management via a bottom-up approach instead of a top-down approach. Now let's examine how we capture the preceding sentences in N3 notation:
<#Buddy> <#owns> <#business>.
<#business> <#has-website> <http://www.c2i2.com/~budstv>.
<#Buddy> <#father-of> <#Lynne>.
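To make the triple model concrete, here is a minimal Python sketch that holds the same three statements as plain tuples and filters them by pattern. The `match` helper and the variable names are illustrative assumptions, not part of N3 or of any RDF toolkit.

```python
# Each statement is a (subject, predicate, object) tuple.
# The "#..." names mirror the relative URIs used in the N3 statements.
triples = [
    ("#Buddy", "#owns", "#business"),
    ("#business", "#has-website", "http://www.c2i2.com/~budstv"),
    ("#Buddy", "#father-of", "#Lynne"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Everything stated about Buddy.
buddy_facts = match(triples, subject="#Buddy")

# The Web site of the business.
website = match(triples, subject="#business", predicate="#has-website")
```

Because every statement has the same three-part shape, one generic pattern match is enough to answer simple questions; no schema has to be designed up front.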
From each sentence we have extracted the relevant subject, predicate, and object. The # sign means the URI of each concept is relative to the current document. This is a shortcut done for brevity; it is more accurate to replace the # sign with an absolute URI like "http://www.c2i2.com/buddy/ontology/" as a formal namespace. In N3 you can do that with a prefix declaration like this:
@prefix bt: <http://www.c2i2.com/buddy/ontology/>.
Using the prefix, our resources would be as follows:
bt:Buddy bt:owns bt:business .
Of course, we could also add other prefixes from other vocabularies like the Dublin Core:
@prefix dc: <http://purl.org/dc/elements/1.1/>.
This would allow us to add a statement like "The business title is Buddy's TV and VCR Service" in this way:
bt:business dc:title "Buddy's TV and VCR Service" .
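A prefix is only an abbreviation: a name such as bt:business stands for the namespace URI followed by the local name. The following Python sketch shows that expansion for the two prefixes declared above; the `expand` function and the `PREFIXES` table are hypothetical helpers written for illustration, and a real N3 parser handles more cases (for example, full URIs in angle brackets and literals).

```python
# Prefix table mirroring the two @prefix declarations in the text.
PREFIXES = {
    "bt": "http://www.c2i2.com/buddy/ontology/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'bt:Buddy' into an absolute URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"not a prefixed name with a known prefix: {name!r}")
    return prefixes[prefix] + local


buddy_uri = expand("bt:Buddy")
title_uri = expand("dc:title")
```

Here `buddy_uri` is the namespace URI with "Buddy" appended, which is the name that other documents would use to refer to the same resource.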
Tools are available to automatically convert N3 notation into RDF/XML. One popular tool is the Jena Semantic Web toolkit from Hewlett-Packard, available at HP Labs Semantic Web Research. Listing 2 shows the generated RDF/XML syntax.
Warning: The RDF/XML serialization of predicates and objects can use either elements or attributes. Therefore, it is better to use a conforming RDF parser that understands how to translate either format into a triple instead of a custom parser that may not understand such subtlety.
The first thing you should notice is that in the RDF/XML syntax, one RDF statement is nested within the other. This sometimes nonintuitive translation of a list of statements into a hierarchical XML syntax is what makes direct authoring of RDF/XML difficult. However, since tools can generate correct syntax for you, you can focus on the knowledge engineering rather than on authoring RDF/XML. Second, note how predicates are represented by custom elements (like RDFNsId1:owns or RDFNsId1:father-of). The objects are represented by either the rdf:resource attribute or a literal value.
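The nesting and the two ways of writing an object can be illustrated with a hand-built document. The Python sketch below uses only the standard library's xml.etree.ElementTree to assemble a small RDF/XML tree for the three statements. It is not the output that Jena generates: the bt prefix, the helper `q`, and the exact layout are assumptions chosen for readability.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BT = "http://www.c2i2.com/buddy/ontology/"  # namespace assumed from the @prefix line

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("bt", BT)


def q(namespace, local):
    """Build ElementTree's '{namespace}local' qualified-name form."""
    return f"{{{namespace}}}{local}"


root = ET.Element(q(RDF, "RDF"))

# Subject: Buddy.
buddy = ET.SubElement(root, q(RDF, "Description"), {q(RDF, "about"): BT + "Buddy"})

# Object given by reference, using the rdf:resource attribute.
ET.SubElement(buddy, q(BT, "father-of"), {q(RDF, "resource"): BT + "Lynne"})

# Object given as a nested description: the statement about the
# business sits inside the statement about Buddy.
owns = ET.SubElement(buddy, q(BT, "owns"))
business = ET.SubElement(owns, q(RDF, "Description"), {q(RDF, "about"): BT + "business"})
ET.SubElement(
    business,
    q(BT, "has-website"),
    {q(RDF, "resource"): "http://www.c2i2.com/~budstv"},
)

xml_text = ET.tostring(root, encoding="unicode")
```

Both forms describe ordinary triples, which is why a conforming RDF parser, rather than ad hoc XML handling, is the safer way to read such a document back.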
Figure 4 displays an IsaViz graph of the three RDF statements.
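The graph view follows directly from the triples: each subject and object becomes a node, and each predicate becomes a labeled arrow from subject to object. As a rough sketch of that mapping (not of how IsaViz works internally), the Python below writes the three statements in Graphviz DOT notation. The `to_dot` function is a hypothetical helper, and the prefixed names are used as node labels purely for brevity.

```python
# The three statements, written with the bt: prefix for brevity.
triples = [
    ("bt:Buddy", "bt:owns", "bt:business"),
    ("bt:business", "bt:has-website", "http://www.c2i2.com/~budstv"),
    ("bt:Buddy", "bt:father-of", "bt:Lynne"),
]


def to_dot(triples):
    """Render triples as a directed graph in Graphviz DOT syntax."""
    lines = ["digraph rdf {"]
    for subject, predicate, obj in triples:
        # One edge per statement: subject -> object, labeled by the predicate.
        lines.append(f'  "{subject}" -> "{obj}" [label="{predicate}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(triples)
```

Since bt:Buddy appears as the subject of two statements, it becomes a single node with two outgoing arrows, which is the kind of shared structure a graph drawing makes visible.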
While the triple is the centerpiece of RDF, other elements of RDF offer additional facilities in composing these knowledge graphs. The other RDF facilities are discussed in the next section.