
Java programmers who want to develop semantic web applications have a growing range of tools and libraries to choose from. One such tool, the Jena platform, is an open source toolkit for processing Resource Description Framework (RDF), Web Ontology Language (OWL), and other semantic web data. Specifically, this discussion introduces Jena's Model abstraction, which provides the container interface for collections of RDF triples: pairs of data items linked by a named relationship.
Model is one of the key components of Jena's approach to handling RDF data. You'll explore its core capabilities, along with some of the extensions of the basic Model that are built into Jena, to gain a working knowledge of Jena code that loads, processes, queries, and writes RDF data and ontologies.
Jena is a free, open source (under a liberal BSD license) Java platform for processing semantic web data. Here, "semantic web" refers specifically to the approach based on the World Wide Web Consortium (W3C) Semantic Web standards, especially RDF, OWL, and SPARQL. Strictly speaking, W3C produces recommendations rather than standards, but the nuances of that difference are beyond the scope of this discussion.
One of Jena's original goals was to support the W3C standards as faithfully as possible, and that principle remains one of the platform's key values today. Jena grew out of a research activity at HP Labs, during the period when the current releases of RDF and OWL were being standardized. The Jena 2 release series began in 2003; the latest version at the time of this writing is Jena 2.5.3. Jena has been actively maintained and developed since then by the team at HP Labs and contributors from the community.
The heart of Jena is a Java library for semantic web data handling. The Jena SourceForge site, however, lists a number of other related tools and APIs for assisting developers to build and manage semantic web applications.
RDF Triples and Graphs
Lao Tzu is said to have written that a journey of a thousand miles begins with a single step. With RDF, that single step is the triple. In essence, a triple is two pieces of data that are linked by a named relationship. For example:
Lisa Gerrard performs-track "Sacrifice"
Logically, a triple is a simple statement about the truth of some proposition, in this case that the binary predicate performs-track is true of the arguments Lisa Gerrard and Sacrifice. (Note that some details in this example were omitted for clarity; in fact, the track "Sacrifice" is performed by Lisa Gerrard and Pieter Bourke on the album Duality.) RDF calls the first of these arguments the triple's subject, and the second is its object.
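To make the shape of a triple concrete, here is a minimal plain-Java sketch of the example statement. It deliberately does not use Jena: the SimpleTriple and SimpleTripleDemo classes are hypothetical names invented for this illustration, and the fields are plain strings purely for simplicity.

```java
import java.util.Objects;

// Hypothetical illustration type: NOT part of Jena's API.
// It only captures the idea of two arguments linked by a named relationship.
final class SimpleTriple {
    final String subject;    // first argument of the predicate
    final String predicate;  // the named relationship
    final String object;     // second argument of the predicate

    SimpleTriple(String subject, String predicate, String object) {
        this.subject = Objects.requireNonNull(subject, "subject");
        this.predicate = Objects.requireNonNull(predicate, "predicate");
        this.object = Objects.requireNonNull(object, "object");
    }

    // Renders the triple in the same "subject predicate object" layout used in the text.
    @Override
    public String toString() {
        return subject + " " + predicate + " \"" + object + "\"";
    }
}

class SimpleTripleDemo {
    public static void main(String[] args) {
        SimpleTriple t = new SimpleTriple("Lisa Gerrard", "performs-track", "Sacrifice");
        System.out.println(t);
    }
}
```

Reading the fields back in order (subject, predicate, object) gives the same statement as the prose form: the predicate performs-track holds between Lisa Gerrard and Sacrifice.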
So far so good. But what are the arguments exactly? RDF distinguishes two kinds of elements that can appear in triples: literals and resources. A literal is just a piece of data: an integer, a string, a floating-point number, or even an XML structure. A resource identifies something (or someone), about which we make semantically meaningful statements. That "something" might be a report, a stock trade, or, in this example, a recording artist.
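The distinction between literals and resources can be sketched in plain Java as two kinds of node. This is only an illustration under stated assumptions: the Node, ResourceNode, LiteralNode, and NodeDemo names are hypothetical and are not Jena's own types, and the example.org URI is a made-up identifier for the recording artist.

```java
// Hypothetical illustration types: NOT Jena's API.
abstract class Node {
    abstract boolean isLiteral();
}

// A resource identifies something (or someone) that statements can be made about.
final class ResourceNode extends Node {
    final String uri;  // the identifier; the value used in the demo is invented

    ResourceNode(String uri) {
        this.uri = uri;
    }

    @Override
    boolean isLiteral() {
        return false;
    }

    @Override
    public String toString() {
        return "<" + uri + ">";
    }
}

// A literal is just a piece of data: a string, an integer, a floating-point number, and so on.
final class LiteralNode extends Node {
    final Object value;

    LiteralNode(Object value) {
        this.value = value;
    }

    @Override
    boolean isLiteral() {
        return true;
    }

    @Override
    public String toString() {
        // Quote strings so they are visually distinct from numeric literals.
        return (value instanceof String) ? "\"" + value + "\"" : String.valueOf(value);
    }
}

class NodeDemo {
    public static void main(String[] args) {
        Node artist = new ResourceNode("http://example.org/artists/lisa-gerrard"); // assumed URI
        Node title = new LiteralNode("Sacrifice");  // string literal
        Node number = new LiteralNode(42);          // arbitrary integer literal
        for (Node n : new Node[] { artist, title, number }) {
            System.out.println(n + " literal? " + n.isLiteral());
        }
    }
}
```

The key design point is that a resource carries an identity that other statements can refer to, while a literal carries only its value.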