Meet Jena, a Semantic Web Platform for Java : Page 5

Tools for developing semantically aware applications are rapidly growing more Java friendly. Take a closer look at Jena, an open source toolkit for processing and working with semantic web data.

Ontology Models
The examples up to this point have looked at RDF models, concentrating on the simple binary predicates and triples that represent semi-structured data in graphs. The W3C Semantic Web standards define two other languages, RDF Schema (RDFS) and OWL, which extend RDF and add considerably more power for describing and modeling information and applications. These languages provide a basis for developing ontologies. An ontology describes what is true in principle about the domain an application works in.

For example, consider that a given person resource may have a name property. RDF can represent that individual and that individual's name, but RDF by itself can't state in principle that it's true all persons have names, or that any resource that has a social security number must therefore represent a person. Ontologies provide a formal (that is, mathematical) description of the concepts in a given application domain, and they can represent things that are true in principle.

Using and applying ontologies is really a subject for another article. Continuing the theme of this discussion, you should observe that the W3C ontology languages—RDFS and OWL—connect to RDF in two ways. First, they make semantic statements about RDF instance data. Second, they also use RDF to represent the ontology itself. This idea is powerful (and sometimes confusing). Concept descriptions represent what the application can state about RDF instance data. These concept descriptions are themselves written down in RDF, using a special set of reserved resource and property URIs. The concept description, called a class in ontology languages (and not to be confused with a class in OO programming languages), can be expressed by a set of special triples. Note that it really is a set of triples: there's more to say than can be expressed by just one RDF statement.

The essential simplicity of RDF's triple-centric representation can lead to a can't-see-the-woods-for-all-the-trees problem with ontologies. Class descriptions are confusing to manipulate triple by triple, and it's easy to introduce errors. Jena's ontology model API tries to help by extending Model, Resource, and so on with Java classes that encapsulate a more abstract view of what's really being expressed. Instead of seeing the raw triples attached to a resource representing an ontology class, the Java abstraction OntClass provides a convenient way of processing those triples while keeping them out of the programmer's way. As just one example, in RDF you can search for triples with the predicate rdfs:subClassOf and extract the subclass relationships between class resources from the results. Alternatively, OntClass lists the superclasses and subclasses directly through OntClass.listSuperClasses() and OntClass.listSubClasses(), respectively.
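To make the contrast concrete, here is a minimal plain-Java sketch of the triple-by-triple approach that OntClass is designed to hide. It deliberately does not use the Jena API: the SubclassScan class, the triple() and listSubClasses() helpers, and the ex: names are all hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SubclassScan {
    // Assumption: a triple is modelled as a three-element list
    // (subject, predicate, object). This is not Jena's Statement type.
    public static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    // Find the direct subclasses of cls by scanning for rdfs:subClassOf triples.
    public static List<String> listSubClasses(List<List<String>> graph, String cls) {
        return graph.stream()
                .filter(t -> t.get(1).equals("rdfs:subClassOf") && t.get(2).equals(cls))
                .map(t -> t.get(0))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> graph = List.of(
                triple("ex:Person", "rdf:type", "owl:Class"),
                triple("ex:Employee", "rdf:type", "owl:Class"),
                triple("ex:Employee", "rdfs:subClassOf", "ex:Person"),
                triple("ex:Student", "rdfs:subClassOf", "ex:Person"));

        // prints [ex:Employee, ex:Student]
        System.out.println(listSubClasses(graph, "ex:Person"));
    }
}
```

Even in this toy form, the caller has to know which predicate to look for and which position holds the superclass. An abstraction such as OntClass moves that knowledge into the API.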

Inference (Reasoning) Models
The real power of formal ontologies comes when they are combined with reasoning algorithms to infer things that are true about some collection of data but are not written down explicitly, and perhaps are not even obvious. There are many approaches to reasoning, but Jena uses one pattern for all of the inference procedures it supports. In essence, Jena views a reasoner as an automated way to add more triples, called entailments, to a base model. For example, suppose an application has only these two facts:

project:member_10 foaf:name "Julie".
foaf:name rdfs:domain foaf:Person.

There are two triples here. One is that a resource with the URI project:member_10 has the FOAF name Julie, and the other is that the FOAF name is a property with a domain class of FOAF Person. Unlike a schema or constraint language, the fact that member_10 is not known to be a Person doesn't imply a violation. In RDFS and OWL it means that you are entitled to infer, or entail, the statement that member_10 is, in fact, a Person.

This triple is therefore entailed from the two previous triples (together with a semantic rule about the meaning of a property's domain class):

project:member_10 rdf:type foaf:Person.

In a Jena inference model only the base statements are asserted, but when inspected the model appears also to contain the entailments—just as though those triples had also been asserted.
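The idea of a base model plus entailments can be sketched in plain Java. This is not how Jena's rule engine is implemented, and it uses no Jena classes. DomainInference, triple(), and withDomainEntailments() are hypothetical names, and the sketch applies only the rdfs:domain rule, in a single pass over the base triples.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DomainInference {
    // Assumption: a triple is modelled as a three-element list
    // (subject, predicate, object). This is not Jena's Statement type.
    public static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    // Returns the base triples plus every rdf:type triple entailed by the rule:
    //   (P rdfs:domain C) and (S P O)  entail  (S rdf:type C).
    // The base set is never modified; the entailments exist only in the view.
    public static Set<List<String>> withDomainEntailments(Set<List<String>> base) {
        Set<List<String>> view = new LinkedHashSet<>(base);
        for (List<String> decl : base) {
            if (!decl.get(1).equals("rdfs:domain")) {
                continue;
            }
            String property = decl.get(0);
            String domainClass = decl.get(2);
            for (List<String> t : base) {
                if (t.get(1).equals(property)) {
                    view.add(triple(t.get(0), "rdf:type", domainClass));
                }
            }
        }
        return view;
    }

    public static void main(String[] args) {
        Set<List<String>> base = new LinkedHashSet<>();
        base.add(triple("project:member_10", "foaf:name", "Julie"));
        base.add(triple("foaf:name", "rdfs:domain", "foaf:Person"));

        Set<List<String>> view = withDomainEntailments(base);

        // Only two statements were asserted...
        System.out.println(base.size()); // prints 2
        // ...but the view also answers for the entailed triple.
        System.out.println(view.contains(
                triple("project:member_10", "rdf:type", "foaf:Person"))); // prints true
    }
}
```

A real reasoner applies many such rules and repeats them until no new triples appear. The sketch shows only the shape of the result: the asserted statements stay as they are, and queries against the view see the entailments as well.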

A final example shows Jena's inference processing in action. Since reasoners and ontologies are commonly used together, this example shows both OntModel and one of Jena's built-in OWL reasoners working in collaboration:

@Override
protected void run() {
    // create an OntModel that also handles OWL reasoning through the rule engine
    OntModel m = ModelFactory.createOntologyModel( OntModelSpec.OWL_MEM_MICRO_RULE_INF );

    // read in the FOAF ontology
    FileManager.get().readModel( m, FOAF_NS );

    // make a claim about project member Julie
    Resource m10 = m.createResource( NS + "member_10" );
    Property name = m.getProperty( FOAF_NS + "firstName" );
    m10.addProperty( name, "Julie" );

    // now list the named foaf:Persons - we'll only see Julie if we can tell
    // that m10 has rdf:type foaf:Person, which is entailed implicitly
    listPeople( m );
}

This example uses one of the built-in Jena reasoners, based on a Jena-native rule engine. However, the inference model support in Jena works with a variety of other reasoners, including external description logic reasoners such as Pellet.

This introduction to Jena's Model abstraction has covered some of the core operations in Jena, and touched on some of the key variants of Model that are part of the Jena framework. Other capabilities, which may be the subject of future articles, include:

  • Jena's built-in RDB Model adapters work with a specific triple store table layout, but there are other tools that extend Model to cover repositories other than triple stores, such as native relational tables or LDAP servers.
  • The examples discussed here created models programmatically, but it's also possible to describe models using a declarative vocabulary (in RDF, naturally) and have this description assembled into a Jena Model object.
  • Jena's schemagen tool can automate the translation of ontology terms into Java constants that can be used by Java programs to access RDF and OWL data.
RDF is a simple, flexible, and extensible representation for semi-structured data, and is a foundational technology for the semantic web. Jena is a well-established, open source Java platform for creating, manipulating, and handling RDF data. In Jena, RDF graphs are represented as Model objects, and triples are represented as Statement objects. The model abstraction is the basis for some powerful extensions, including transparent support for databases and inference, and a convenient API for processing ontologies.

Dr. Ian Dickinson is a senior research engineer at Hewlett-Packard Laboratories in Bristol, UK. He joined HP Labs in 1988, and has been a member of the Jena team since 2003. He has worked on a variety of semantic web application projects for HP, and his particular research interests include software agents and user interfaces for semantic web systems.