Let Semantics Bring Sophistication to Your Applications

Have you ever been frustrated by the lack of sophistication in some of today’s search applications? I certainly have been. For example, I was recently evaluating a tool, called Green Pepper, for automating testable requirements; when I googled the phrase “Green Pepper,” I received a bunch of results that were unrelated to the type of information I was looking for. I was interested in the Green Pepper software tool, not the vegetable. Wouldn’t it be nice if the search engine could understand the meaning of my query or, if not, at least assist me in disambiguating the possible interpretations to provide more meaningful results?

At a very rough level, this clarification is the goal of the Semantic Web initiative. The Semantic Web aims to extend the information on the web in a form that can be consumed more usefully by people and computer programs. This approach generally involves extending today’s web of unstructured data with a more meaningful representation of knowledge. Going back to my previous example, Green Pepper is a vegetable and a software product. If the computer program I was using to search the Internet was aware of these relationships, it might have been able to give me more precise search results. And, if the program was really snazzy, it might have been able to infer my context (software) from my question, from my previous searches, or through some sort of interaction with me.

 
Figure 1. The Sommelier: This sommelier application recommends a selection of wines based on body, flavor, and color.

This task may seem daunting, and indeed it is. Although there are many smart people and companies working toward this vision, they still face many challenges and we are years away from seeing this vision realized. In the meantime, people have found creative ways to integrate semantic technology into their programs. In this article, you will explore some of the building blocks of semantic technology and build a simple application to exercise these components.

Application Example: White or Red?
Throughout this article, I will show you how to put together a simple wine recommendation application. Haven’t you always wanted to have your own personal sommelier? This sommelier will be pretty basic; it will recommend particular wines based only on body, flavor, and color (see Figure 1).

The data the sommelier will use is the wine ontology published by the W3C as an example. Now, you may ask, what is an ontology? Before diving in, I’d like to warn you that this is fairly cerebral stuff. The next two sections introduce ontologies: what they are and the motivations behind them. I will keep these introductions as simple as possible but will also try to make them complete enough for you to have the conceptual background necessary to understand the application example.

About Ontologies
Fundamental to an understanding of semantic technology is the representation of knowledge. For example, to create a sophisticated sommelier, you would need to provide it with an understanding of the types of wines, their characteristics, and relationships. Within the field of semantics, this representation often takes the form of an ontology. An ontology is a set of concepts and relationships that can be used by people and computers to reason about a domain.

There are many standards for representing ontologies, but this article will focus on the popular Web Ontology Language (OWL). OWL is a W3C standard that extends other W3C standards such as the Resource Description Framework (RDF) and RDF-Schema (RDFS) by adding to their vocabulary and by specifying additional semantics.

Ontologies are composed of statements about a domain. These statements relate individuals, properties, and classes and are sometimes referred to as triples. Statements assert something about the domain and take the form of “<Subject> <Predicate> <Object>”. I could make a statement about wine such as “FrenchWine is-a Wine.” In this statement, both Wine and FrenchWine are classes that are related through subclassing, as indicated by the “is-a” predicate.
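To make the shape of a statement concrete, here is a minimal Java sketch of a triple. The class name StatementSketch and its methods are hypothetical names chosen for illustration; they are not part of OWL, Jena, or the sample application.

```java
// Hypothetical illustration class; not part of Jena or the article's source code.
final class StatementSketch {
    final String subject;
    final String predicate;
    final String object;

    StatementSketch(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Renders the statement in the <Subject> <Predicate> <Object> form.
    @Override
    public String toString() {
        return subject + " " + predicate + " " + object;
    }

    // The example statement from the text: FrenchWine is-a Wine.
    static StatementSketch frenchWineIsWine() {
        return new StatementSketch("FrenchWine", "is-a", "Wine");
    }

    public static void main(String[] args) {
        System.out.println(frenchWineIsWine());
    }
}
```

A real ontology would identify each of the three parts with a URI rather than a bare name; plain strings are used here only to keep the sketch short.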

Classes, Individuals, and Properties
Ontologies in OWL are composed of classes, individuals, and properties and are represented (like RDF) in XML. OWL defines the serialization of these components and prescribes semantic meaning for special types of relationships that can be used for reasoning. But, more on reasoning later. First, you should explore classes, individuals, and properties in the context of your wine domain.

Each OWL ontology will have a base URI that uniquely identifies the ontology. This base URI is described in the header of the OWL document. As an example, the base URI for the wine ontology is http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine. All resources (including classes, individuals, and properties) declared in an OWL document are identified by a fully-qualified URI. For example, there is a resource defined in this ontology named http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine#FrenchWine. As a convenience, you can omit the base URI in references within the document and just prefix those references with the pound symbol (#), for example, #FrenchWine.
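The relationship between a local reference and its fully-qualified URI can be sketched with the standard java.net.URI class. The class name WineUriSketch and the method expand are hypothetical; an OWL parser performs this expansion for you, so this is only an illustration of the rule.

```java
import java.net.URI;

// Hypothetical illustration class; an OWL parser normally does this for you.
final class WineUriSketch {
    // Base URI of the wine ontology, as given in the text.
    static final URI BASE =
        URI.create("http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine");

    // Expands a local reference such as "#FrenchWine" against the base URI.
    static String expand(String localReference) {
        return BASE.resolve(localReference).toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("#FrenchWine"));
    }
}
```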

Classes are sets that contain like individuals and are related to each other through subclassing (the resulting class hierarchies are also referred to as taxonomies). All classes in OWL implicitly subclass the owl:Thing class and are declared using the owl:Class construct. You could represent your previous statement about French wines in OWL as follows:

   

This code states that there are two classes (Wine and FrenchWine), that all instances of FrenchWine are instances of Wine, and that all instances of Wine are instances of Thing. Notice that the subclassing of Thing was implicit.

Individuals are the instances of things in a domain. For example, there is a wine named Chateau Cheval Blanc St. Emilion (which is a St. Emilion-styled wine from the Chateau Cheval Blanc winery). This instance is an individual wine, one that you could go and buy. This instance could be declared in OWL notation as:

This declaration can be read as “there exists an instance of wine named ChateauChevalBlancStEmilion.”

Properties relate two instances in the form of a statement. For example, I could state: “ChateauChevalBlancStEmilion hasColor Red” and “ChateauChevalBlancStEmilion hasFlavor Strong.” These statements can be combined to form a graph representing your current understanding of the wine domain as shown in Figure 2.

The properties hasColor and hasFlavor can be declared in OWL as:

And, the ChateauChevalBlancStEmilion instance can have these properties defined for it as:

      

In this example, Red and Strong are individuals declared elsewhere in this document and are related to the ChateauChevalBlancStEmilion instance through the hasColor and hasFlavor properties, respectively.

It is important to note that properties relate individuals, not classes. However, individuals can be grouped into classes on the basis of their properties through class restrictions. For example, you could define a class named RedWine, which represents the set of all individuals having the property hasColor and the value Red. Figure 3 illustrates this new class and three instances, all of which are related to RedWine through the hasColor property.

Figure 2. Properties Example: This diagram shows two statements that together declare that ChateauChevalBlancStEmilion is red and has a strong flavor.

Figure 3. A Basic Class: The hasColor property relates these three instances to the RedWine class.

In OWL the RedWine class would be expressed as:

                                      
Author’s Note: The restriction is implemented through subclassing. The declaration in the previous code example could be read as “The RedWine class is a subclass of wine and a subclass of the anonymous set of all individuals having the color red.”
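The idea behind a restriction class, grouping individuals by the value of one of their properties, can be approximated in plain Java. This sketch is not OWL and does not use a reasoner. The class RedWineSketch, its methods, and the individual HypotheticalWhiteWine are assumptions made for illustration, and the sketch ignores the additional requirement that a RedWine is also a Wine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; it does not use OWL or Jena.
final class RedWineSketch {

    // Maps each individual to the value of its hasColor property.
    // Only the first entry comes from the text; the second is made up.
    static Map<String, String> sampleColors() {
        Map<String, String> hasColor = new LinkedHashMap<>();
        hasColor.put("ChateauChevalBlancStEmilion", "Red");
        hasColor.put("HypotheticalWhiteWine", "White");
        return hasColor;
    }

    // Collects the individuals whose hasColor value equals the given color,
    // which is the membership test the restriction describes.
    static List<String> individualsWithColor(Map<String, String> hasColor,
                                             String color) {
        List<String> members = new ArrayList<>();
        for (Map.Entry<String, String> entry : hasColor.entrySet()) {
            if (color.equals(entry.getValue())) {
                members.add(entry.getKey());
            }
        }
        return members;
    }

    public static void main(String[] args) {
        System.out.println(individualsWithColor(sampleColors(), "Red"));
    }
}
```

In OWL you never write this loop yourself; you declare the restriction and a reasoner determines which individuals belong to the class.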

The foregoing has been only a basic introduction to ontologies and OWL, but hopefully it has given you an initial understanding of how ontologies are structured and represented through OWL. For this article, you will consume a prebuilt ontology for the sommelier application, so you don’t need to be an OWL expert; you probably know enough already to continue. If you are interested in learning more about ontologies and OWL, consult the related resources sidebar for links to the relevant specifications, tutorials, and helpful tools.

Why Ontologies?


Figure 4. Class Hierarchy: The asserted wine ontology is relatively flat.

Figure 5. Complex Ontology: The ontology is much more complex after a reasoner inferred additional relationships and added them.

If you managed to complete the last section, you probably noticed that ontologies are relatively complex. You might be wondering what advantages people and programs gain by expressing information in this form rather than another. For example, given your simple sommelier application, you could probably construct a relational database to represent the same type of information. There are several reasons to use ontologies.

First, formal ontology specifications such as OWL define a standard way of capturing and sharing the understanding of things in our world. The standardization of the form of this understanding allows ontologies to be reused, extended, and translated. In fact, there are hundreds of free ontologies available through the web that you could reuse in your own application, as you are doing for your sommelier. Additionally, you could create your own ontology that extends someone else’s ontology. The wine ontology, for instance, extends a food ontology. As another example, a company specializing in investments could create an ontology built on top of a more general financial ontology.

Second, and perhaps more importantly, ontological representations such as OWL facilitate inferencing and consistency checking. In the semantic space, reasoners are a type of tool that can deduce new relationships from ontologies. The ability to make such deductions is referred to as inferencing and can greatly simplify the creation of a complex model because many of the relationships can be inferred. Reasoners also can help detect logical errors in an ontology.

For example, in the wine domain an individual wine can exist in multiple classes. A red bordeaux wine is a type of red wine, a type of French wine, and also possibly a type of strong wine. To attempt to model this multiple inheritance graph of classes would be both difficult and error prone. Inferencing makes a simpler approach possible.

Instead of declaring BordeauxWine to be a subclass of multiple parents, I can simply declare it to be a type of wine and restrict this class to include only individual wines from the BordeauxRegion of France. Then, I can create a separate class called FrenchWine and declare that it is the set of wines produced in the FrenchRegion. A reasoner will be able to infer that BordeauxWine must also be a type of FrenchWine because the BordeauxRegion is located in the FrenchRegion.
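The kind of deduction described above can be sketched in a few lines of Java. This sketch assumes the regions are connected by a "located in" link that can be followed upward; the class RegionInferenceSketch, the method isWithin, and the StEmilionRegion entry are illustrative assumptions, and a real reasoner such as Pellet does far more than this.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of one inference step; not how a reasoner is built.
final class RegionInferenceSketch {
    // Asserted facts: each region maps to the region it is directly located in.
    static final Map<String, String> LOCATED_IN = new HashMap<>();
    static {
        LOCATED_IN.put("BordeauxRegion", "FrenchRegion");
        // Assumed entry, added only to show a two-step chain.
        LOCATED_IN.put("StEmilionRegion", "BordeauxRegion");
    }

    // Follows the located-in links upward. Returns true when the region is
    // the container itself or lies somewhere inside it.
    static boolean isWithin(String region, String container) {
        String current = region;
        while (current != null) {
            if (current.equals(container)) {
                return true;
            }
            current = LOCATED_IN.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        // Nothing asserts this directly; it follows from the two links above.
        System.out.println(isWithin("StEmilionRegion", "FrenchRegion"));
    }
}
```

Only the direct links are stored, yet a question about an indirect relationship can still be answered. That is the convenience inferencing offers: you assert the simple facts and let the tool derive the rest.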

The asserted (explicitly defined) hierarchy of classes in the wine ontology is relatively flat (see Figure 4). This is the ontology as it was explicitly defined in its OWL document.

After running this ontology through a reasoner, you get the much more complex ontology represented in Figure 5.

It is hugely convenient to model the ontology shown in Figure 4, yet be able to ask questions of it as if it were as robust as the model demonstrated in Figure 5. You won’t get into inferencing much more in this article, except to show how to leverage a reasoner in Java applications.

Querying Ontologies
You’ve either built an ontology or have found one from some source. Now what? By themselves, ontologies are interesting but not very useful. To do valuable things with ontologies, you must be able to get meaningful information from them. Different tools have different ways of getting information out of an ontology, and you will see two of them in this discussion. The first will be to use the Jena library for Java to access your wine ontology programmatically. The Jena interface includes facilities to read and navigate an ontology, which will be demonstrated shortly as you build your sommelier application.

A draft W3C specification is in the works to create a standard query language for RDF data named SPARQL (which recursively stands for SPARQL Protocol and RDF Query Language). Because OWL is an extension of RDF, you will be able to mine information from your OWL-based wine ontology using SPARQL.

SPARQL is similar to SQL in form but different in function. Whereas SQL is based on set theory, SPARQL is based on graph theory. SPARQL queries construct a graph that is pattern-matched against the RDF graph, and matching structures are returned. This querying technique is pretty abstract stuff, so look at a couple of examples. A basic SPARQL query that selects all the triples from an ontology would look like this:

SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }

Did you notice that this form looks a lot like how I defined statements previously? It looks similar because the WHERE clause is basically creating a small graph of statements to match against the ontology. I can specify a combination of subjects, predicates, and objects to query with, and the positions I leave as variables (the names beginning with ?) will be accessible through the SELECT clause. For example, to list all subjects and objects from the wine ontology related by the hasColor predicate I could execute this query:

SELECT ?subject ?object
WHERE { ?subject <#hasColor> ?object }

Because hasColor is a property, this predicate will apply only to individuals in the ontology. Thus, this query will return all the individual instances related to other instances through the hasColor property.
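To see what matching a single triple pattern amounts to, consider this plain Java sketch. It is not SPARQL and does not use Jena. The class TriplePatternSketch, its methods, and the HypotheticalWhiteWine data are assumptions made for illustration. A term that starts with ? is treated as a variable that matches anything; any other term must match exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of triple-pattern matching; not a SPARQL engine.
final class TriplePatternSketch {
    // Illustrative data; each triple is {subject, predicate, object}.
    static final List<String[]> TRIPLES = Arrays.asList(
        new String[] {"ChateauChevalBlancStEmilion", "hasColor", "Red"},
        new String[] {"ChateauChevalBlancStEmilion", "hasFlavor", "Strong"},
        new String[] {"HypotheticalWhiteWine", "hasColor", "White"});

    // A pattern term starting with '?' is a variable and matches any value.
    static boolean termMatches(String patternTerm, String value) {
        return patternTerm.startsWith("?") || patternTerm.equals(value);
    }

    // Returns every triple whose three parts all match the pattern.
    static List<String[]> match(String subject, String predicate, String object) {
        List<String[]> hits = new ArrayList<>();
        for (String[] triple : TRIPLES) {
            if (termMatches(subject, triple[0])
                    && termMatches(predicate, triple[1])
                    && termMatches(object, triple[2])) {
                hits.add(triple);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String[] triple : match("?subject", "hasColor", "?object")) {
            System.out.println(triple[0] + " " + triple[2]);
        }
    }
}
```

Replacing "?object" with "Red" in the call narrows the result to red wines only, which mirrors fixing the object position in a query.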

I can also select only those instances having a particular color by specifying the value in which I’m interested:

SELECT ?subject
WHERE { ?subject <#hasColor> <#Red> }

And, I can further refine my wine search by including more triples in the graph I specify in the WHERE clause:

SELECT ?subject
WHERE {
   ?subject <#hasColor>  <#Red> .
   ?subject <#hasFlavor> <#Strong> .
   ?subject <#hasBody>   <#Full> .
}
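When several patterns share the same variable, a subject is returned only if it satisfies every pattern. The following plain Java sketch shows that behavior by intersecting the subjects that match each requirement. The class ConjunctionSketch, its method, and all of the sample data (including the body value given to ChateauChevalBlancStEmilion) are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of combining patterns; not a SPARQL engine.
final class ConjunctionSketch {
    // Illustrative data; each triple is {subject, predicate, object}.
    static final List<String[]> TRIPLES = Arrays.asList(
        new String[] {"ChateauChevalBlancStEmilion", "hasColor", "Red"},
        new String[] {"ChateauChevalBlancStEmilion", "hasFlavor", "Strong"},
        new String[] {"ChateauChevalBlancStEmilion", "hasBody", "Full"},
        new String[] {"HypotheticalLightRed", "hasColor", "Red"},
        new String[] {"HypotheticalLightRed", "hasFlavor", "Delicate"},
        new String[] {"HypotheticalLightRed", "hasBody", "Light"});

    // Each requirement is {predicate, object}. A subject is kept only if it
    // has a matching triple for every requirement.
    static Set<String> subjectsMatchingAll(String[][] requirements) {
        Set<String> result = null;
        for (String[] requirement : requirements) {
            Set<String> matching = new TreeSet<String>();
            for (String[] triple : TRIPLES) {
                if (triple[1].equals(requirement[0])
                        && triple[2].equals(requirement[1])) {
                    matching.add(triple[0]);
                }
            }
            if (result == null) {
                result = matching;          // first requirement seeds the set
            } else {
                result.retainAll(matching); // later ones narrow it down
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }

    public static void main(String[] args) {
        System.out.println(subjectsMatchingAll(new String[][] {
            {"hasColor", "Red"}, {"hasFlavor", "Strong"}, {"hasBody", "Full"}}));
    }
}
```

Each extra requirement can only shrink the result, which is why adding triples to a WHERE clause refines a search rather than widening it.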

Notice that the triple patterns are separated by periods. The ontology will be searched for subjects that match all three patterns at once, that is, the conjunction of the three statements listed previously.

Putting It All Together
Enough talk; it’s time to build your sommelier! You might be surprised at how easy it really is. I would feel like a failure as a guide if I didn’t show one test for your sommelier first:

src/test/java/com/devx/semantics/sommelier/SommelierTest.java
...
   public void testRecommendWine_FullBody()
   {
      // Ask only for full-bodied wines.
      WineDescriptor descriptor = new WineDescriptor();
      descriptor.setBody(BODY.Full);
      List<Wine> recommendedWines =
         sommelier.recommendWine(descriptor);
      // Every recommendation must be full-bodied.
      for (Wine wine : recommendedWines)
      {
         assertEquals("Full", wine.getBody());
      }
      // 16 includes wines inferred to be full-bodied, not just asserted ones.
      assertEquals(16, recommendedWines.size());
   }
...

This test first constructs a WineDescriptor object to contain the factors I would like to use in wine selection. In this case, I chose to verify the number of full-bodied wines returned. Although there are 11 wines in your wine ontology that are asserted (explicitly defined) as full-bodied wines, I chose to verify that 16 wines are recommended. Why? Because I wanted to test that the sommelier can properly infer wines that are full-bodied implicitly because of their classification.

Now turn your attention to the sommelier. The first job of the sommelier is to read in the wine ontology. I have chosen to do this in the default constructor:

src/main/java/com/devx/semantics/sommelier/Sommelier.java
...
   private static final String uri =
      "http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine#";
   private Model model;

   public Sommelier()
   {
      // Load the asserted ontology from the class path.
      OntModel ontModel = ModelFactory.createOntologyModel();
      ontModel.read(Sommelier.class.getResourceAsStream("/wine.owl"),
         "http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine");
      // Wrap it in an inference model backed by the Pellet reasoner.
      model = ModelFactory.createInfModel(
         PelletReasonerFactory.theInstance().create(), ontModel);
   }
...

For this application, I have chosen to use Jena (a Java-based semantics framework) because it is lightweight, easy to use, and supports OWL and SPARQL. In the constructor, read in the wine ontology (wine.owl) from the class path and declare that the base URI for resources local to this ontology is http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine. Next, create an instance of a PelletReasoner, and instruct the reasoner to create a new ontology that includes the asserted and inferred relationships. Pellet is a separate tool from Jena; it implements Jena’s Reasoner interface.

Next, implement the wine recommendation logic:

src/main/java/com/devx/semantics/sommelier/Sommelier.java
...
   public List<Wine> recommendWine(WineDescriptor descriptor)
   {
      // Assemble the SPARQL query from the requested characteristics.
      String query = "SELECT DISTINCT ?s WHERE { ";
      query = query + createBodyPattern(descriptor);
      query = query + createFlavorPattern(descriptor);
      query = query + createColorPattern(descriptor);
      query = query + "}";
      // Run the query against the inferred model.
      QueryExecution qexec = QueryExecutionFactory.create(query,
                                                          model);
      ResultSet results = qexec.execSelect();
      List<Wine> wines = new ArrayList<Wine>();
      while (results.hasNext())
      {
         ...
      }
      return wines;
   }

   private String createBodyPattern(WineDescriptor descriptor)
   {
      if (descriptor.getBody() != null)
      {
         // One triple pattern restricting hasBody to the requested value.
         return "?s <" + uri + "hasBody> <" + uri +
            descriptor.getBody() + "> .";
      }
      else
      {
         // No body requested, so contribute nothing to the query.
         return "";
      }
   }
...

This section of code creates the SPARQL query and executes it against the inferred model. The results are traversed and a list of wines is packaged and returned to the caller. Inside the while loop, I use the Jena API directly to navigate to the subject returned from the query and construct a wine object that contains the relevant properties (color, body, and flavor) to describe the subject.
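The listing shows only createBodyPattern; the flavor and color helpers are not reproduced in the article. As a hedged sketch of how the three helpers might share one implementation, the following standalone class builds the same kind of query string from plain values. The class PatternBuilderSketch and its methods createPattern and buildQuery are hypothetical names, and the sketch takes strings instead of a WineDescriptor so that it runs without Jena.

```java
// Hypothetical helper; the article's own flavor and color methods are not shown.
final class PatternBuilderSketch {
    // Namespace of the wine ontology, as used in the Sommelier class.
    static final String URI =
        "http://www.w3.org/TR/2003/CR-owl-guide-20030818/wine#";

    // Returns one triple pattern for the property, or "" when no value
    // was requested for it.
    static String createPattern(String property, String value) {
        if (value == null) {
            return "";
        }
        return "?s <" + URI + property + "> <" + URI + value + "> . ";
    }

    // Assembles the full SELECT query from the optional characteristics.
    static String buildQuery(String body, String flavor, String color) {
        StringBuilder query = new StringBuilder("SELECT DISTINCT ?s WHERE { ");
        query.append(createPattern("hasBody", body));
        query.append(createPattern("hasFlavor", flavor));
        query.append(createPattern("hasColor", color));
        query.append("}");
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("Full", null, "Red"));
    }
}
```

Concatenating values straight into a query is acceptable here because they come from a fixed set of wine characteristics; values taken from free-form user input would need to be validated first.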

src/main/java/com/devx/semantics/sommelier/Sommelier.java
...
   public List recommendWine(WineDescriptor descriptor)
   {
      ...
         // Each result binds ?s to one matching wine.
         ResultBinding result = (ResultBinding) results.next();
         RDFNode node = result.get("s");
         String name   = getNameOfNode(node);
         String body   = getPropertyForNode("hasBody", node);
         String flavor = getPropertyForNode("hasFlavor", node);
         String color  = getPropertyForNode("hasColor", node);
         wines.add(new Wine(name, body, flavor, color));
      ...
   }

   private String getNameOfNode(RDFNode node)
   {
      // The local name is the part of the URI after the namespace.
      Resource resource = (Resource) node.as(Resource.class);
      return resource.getLocalName();
   }

   private String getPropertyForNode(String property, RDFNode node)
   {
      // Follow the statement (node, property, object) and name its object.
      Resource resource   = (Resource) node.as(Resource.class);
      Statement statement = resource.getProperty(
         model.getProperty(uri + property));
      return getNameOfNode(statement.getObject());
   }
...

This code retrieves the subject from the result in the form of an RDFNode and then determines its name and properties. To determine the properties of the resource, it navigates the statement formed between the subject and the property (hasBody, for example) and then programmatically reads the name of the corresponding object. This process is used to find the values for body, flavor, and color for each result of the SPARQL query executed earlier.

You also can download the full source code for this article, if you are so inclined.

Here’s to You
You have covered quite a bit of ground during this article. You have taken a high-level look at semantics, ontologies, and some of the technologies and tools behind them. Hopefully this fairly high-level overview has helped you to formulate an understanding of how semantic technologies fit together, and how you might begin to integrate some of these technologies into your own applications. For a more detailed treatment of this subject area, consult these related resources, and keep your eye out for future articles from DevX that will cover this topic.

Related Resources:
