Laying the Foundation of a Semantic Web Application

The article Why Migrate to the Semantic Web described the benefits of the semantic web for applications that deal with real-world objects. It outlined how an existing web application (CDMS) that stored its data in a relational database gains added benefits as a semantic web application.

As part of the process of migrating to the semantic web, you need to select components that form the core of the semantic web framework. This article outlines the foundations of the application and defines its overall structure.

Requirements of a Semantic Web Framework
For the CDMS application, the main requirements of a semantic web framework include:

  • Core support for RDF, the RDF Schema language (RDFS), and the Web Ontology Language (OWL).
  • Support for the current SPARQL specification, including the ability to easily query external datasets for related items.
  • Support for SPARQL Extensions such as count, insert, update, and delete.
  • Support for transactions, because RDF data will be inserted, updated, and deleted, possibly in conjunction with updates to relational data.
  • Ability to scale efficiently to large datasets.
  • Ability to deploy Linked Data using the methods outlined in the tutorial How to Publish Linked Data on the Web, e.g., appropriately handling content negotiation.
  • Ability to selectively apply role-based security when publishing Linked Data, e.g., for project collaboration scenarios when sharing data externally with project partners.
  • Provide inference capabilities for OWL ontologies.
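
The content negotiation requirement in the list above can be illustrated with a small sketch. It is not taken from the CDMS source code: the class name ContentNegotiation, the two supported media types, and the tie-breaking rule are assumptions made for illustration, and the matching is deliberately simpler than the full HTTP rules (it ignores the precedence of more specific media ranges). It uses only the Java standard library (Java 11 or later).

```java
import java.util.List;
import java.util.Locale;

final class ContentNegotiation {
    // Representations a hypothetical Linked Data resource can serve,
    // listed in the server's order of preference (an assumption).
    static final List<String> SUPPORTED = List.of("application/rdf+xml", "text/html");

    /**
     * Returns the supported media type with the highest q-value in the
     * Accept header, or null if nothing matches. On equal q-values the
     * type that matched first is kept.
     */
    static String choose(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.isBlank()) {
            return SUPPORTED.get(0); // no preference stated: use the server default
        }
        String best = null;
        double bestQ = 0.0;
        for (String part : acceptHeader.split(",")) {
            String[] pieces = part.trim().split(";");
            String range = pieces[0].trim().toLowerCase(Locale.ROOT);
            double q = 1.0; // q defaults to 1 when not given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2));
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed q-value as "not acceptable"
                    }
                }
            }
            for (String candidate : SUPPORTED) {
                if (matches(range, candidate) && q > bestQ) {
                    best = candidate;
                    bestQ = q;
                }
            }
        }
        return best;
    }

    /** True if the media range (e.g. "text/*" or a full type) covers the candidate type. */
    static boolean matches(String range, String candidate) {
        if (range.equals("*/*")) {
            return true;
        }
        if (range.endsWith("/*")) {
            return candidate.startsWith(range.substring(0, range.length() - 1));
        }
        return range.equals(candidate);
    }

    public static void main(String[] args) {
        // An RDF-aware client asks for RDF/XML; a browser asks for HTML first.
        System.out.println(choose("application/rdf+xml"));
        System.out.println(choose("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"));
    }
}
```

In a real deployment this decision would normally be delegated to the web framework rather than hand-written; the sketch only shows why the same resource URI can yield either an RDF document or an HTML page.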

Selecting a Java Semantic Web Framework
A number of good semantic web development tools are available in several different programming languages, but for this application the selection is limited to Java-based tools. The existing application is written in Java, and the intention is to continue using parts of its persistence layer (Hibernate and MySQL) in areas not suited to the semantic web.

Based on their known scalability features (as per the Berlin SPARQL Benchmark), the prime candidates were:

Jena was selected as the main component for this semantic web framework for the following reasons.

  • Jena SDB uses MySQL for the storage of RDF datasets, simplifying integration with the existing MySQL relational database.
  • Jena ARQ provides a leading implementation of SPARQL, including SPARQL Extensions such as count, insert, update, and delete.
  • Jena includes Joseki as an HTTP engine that supports the SPARQL Protocol and the SPARQL RDF Query language. The query engine is based on ARQ. Joseki can also be deployed to Tomcat.
  • Pellet can be added as an external OWL-DL reasoner.

A number of other open source projects provide additional useful functionality for Jena, such as Jenabean and Jfresnel.

Additional framework components:

  • Jersey (which is the reference implementation of JAX-RS) was selected to implement RESTful web services, as part of providing Linked Data.
  • The Spring framework and Spring Security were chosen to apply role-based security to the RESTful web services.

Laying the Foundations of the Application
The main steps in laying the foundations of the application are:

  • Installing Jena SDB to store RDF data. This includes populating the SDB with some initial data to confirm the install.
  • Creating a Maven-based web application project that integrates Jena SDB with Joseki, providing a SPARQL end point for querying the RDF data held within Jena SDB.

Installing Jena SDB
Follow these steps to install the Jena SDB:

  1. Download and install MySQL if it is not already installed.
  2. Download Jena SDB 1.1, then follow the remaining steps to install it within MySQL, as per the instructions on the wiki.
  3. In MySQL, create a database named “sdb”, specifying the utf8 character set:
    mysql> create database sdb character set utf8 ;
  4. Set up a store description sdb.ttl similar to the SDB example, but with MySQL specifics based on the file SDB-1.1/Stores/sdb-mysql-innodb.ttl found in the SDB distribution, and a database name of “sdb”. The other change is to set the sdb:layout to “layout2/index”, because the index layout usually loads faster than the hash form. (The sdb.ttl file is provided in the attached source code in the src/main/resources directory.)
  5. Set up the environment variables listed in the SDB Commands script setup section.
  6. Run the create command to create the tables within the sdb database:
    SDBROOT > bin/sdbconfig --sdb=sdb.ttl --create
  7. Run the test suite:
    SDBROOT > bin/sdbtest --sdb=sdb.ttl testing/manifest-sdb.ttl

If any tests fail, check the MySQL Notes. Some additional steps to set up UTF-8 may be required.

Loading Data Into Jena SDB
The example RDF model from the previous article Why Migrate to the Semantic Web can be loaded into SDB using the command:

SDBROOT > bin/sdbload --sdb=sdb.ttl http://www.3kbo.com/examples/building.rdf

The SDB Commands provide a number of ways to check whether the data loaded correctly.

A generic test is to run the sdbdump command, which lists all the RDF triples loaded:

SDBROOT > bin/sdbdump --sdb=sdb.ttl

Use the sdbquery command to make SPARQL queries. For example, the following lists the RDF triples in simple table form:

SDBROOT > bin/sdbquery --sdb=sdb.ttl 'SELECT ?s ?p ?o WHERE {?s ?p ?o }'

This command writes out the model in N3 format:

SDBROOT > bin/sdbquery --sdb=sdb.ttl 'CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o }'

You can list all people in the model with the query:

SDBROOT > bin/sdbquery --sdb=sdb.ttl 'CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o .  ?s a 
}'

Details of the BreakerBay building project are provided with the query:

SDBROOT > bin/sdbquery --sdb=sdb.ttl 'CONSTRUCT { ?p ?o}
WHERE { ?p ?o }'

This gives the result:

:BreakerBay
      rdf:type      :BuildingProject ;
      rdfs:comment  "House extension and landscaping" ;
      rdfs:label    "Breaker Bay "^^xsd:string ;
      :approvedBy   :WellingtonCityCouncil ;
      :builtBy      :GarethEvans ;
      :designedBy   :AlexGreig ;
      foaf:based_near  ;
      foaf:fundedBy  :RichardHancock .
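
To make the triple patterns used in the queries above more concrete, the following sketch matches a pattern such as ?s ?p ?o against a small in-memory list of triples. It is only an illustration of the idea, not how Jena SDB or ARQ evaluates queries: the class TriplePatternDemo and its methods are hypothetical, the sample data is a handful of the BreakerBay triples shown in the result, and the sketch ignores joins between patterns and repeated variables. It needs Java 16 or later because it uses a record.

```java
import java.util.ArrayList;
import java.util.List;

final class TriplePatternDemo {
    /** One RDF statement: subject, predicate, object (kept as plain strings here). */
    record Triple(String s, String p, String o) {}

    /** A pattern term matches if it is a variable (starts with '?') or equals the value. */
    static boolean termMatches(String pattern, String value) {
        return pattern.startsWith("?") || pattern.equals(value);
    }

    /** Returns every triple in data that matches the single pattern (s, p, o). */
    static List<Triple> match(List<Triple> data, String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : data) {
            if (termMatches(s, t.s()) && termMatches(p, t.p()) && termMatches(o, t.o())) {
                out.add(t);
            }
        }
        return out;
    }

    /** A few of the BreakerBay triples from the query result, in prefixed-name form. */
    static List<Triple> sample() {
        return List.of(
            new Triple(":BreakerBay", "rdf:type", ":BuildingProject"),
            new Triple(":BreakerBay", ":approvedBy", ":WellingtonCityCouncil"),
            new Triple(":BreakerBay", ":builtBy", ":GarethEvans"),
            new Triple(":BreakerBay", ":designedBy", ":AlexGreig"));
    }

    public static void main(String[] args) {
        // Equivalent in spirit to: SELECT ?s ?p ?o WHERE { ?s ?p ?o }
        match(sample(), "?s", "?p", "?o").forEach(System.out::println);
        // Fixing the predicate narrows the result to a single triple.
        match(sample(), "?s", ":builtBy", "?o").forEach(System.out::println);
    }
}
```

Fixing one or more terms of the pattern, as the second call does, is what narrows a query from listing the whole model to describing a single resource or type.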

Creating a Maven-based Web Application
Use Maven to manage the dependencies between the various component libraries required by the application. Download and install it from the Maven site if it is not installed already.

Running the following Maven command creates the initial project structure of a Maven-based web application, with a top-level directory named “sdb-joseki”.

mvn archetype:create -DgroupId=com.devx -DartifactId=sdb-joseki -DarchetypeArtifactId=maven-archetype-webapp

Author’s Note: If you are new to Maven, check out the instructions for Building a Project with Maven.

Download and unzip Joseki and copy the contents of the Joseki-3.2/webapps/joseki directory to the sdb-joseki/src/main/webapp directory as shown in Figure 1.

 
Figure 1. Project Structure: Copy the contents of the directory.

From the source code, add the provided joseki-config.ttl and log4j.xml files to the src/main/resources directory and replace the generated pom.xml with the one provided.

The joseki-config.ttl file configures Joseki to work with the RDF data loaded into SDB. (Change the username and password in the joseki-config.ttl file to match those used for the MySQL “sdb” database.)

In the WEB-INF/web.xml file, comment out the servlet-mapping for the books url-pattern, because only the default graph was configured in the joseki-config.ttl file.

The provided pom.xml file contains all the dependencies needed to build the sdb-joseki.war file. However, joseki.jar is not available from an external Maven repository.

On the command line, change into the directory Joseki-3.2/lib and add joseki.jar to the local Maven repository by running the command:

mvn install:install-file -DgroupId=org.joseki -DartifactId=joseki -Dversion=3.2 -Dpackaging=jar -Dfile=joseki.jar

Change into the directory sdb-joseki. The sdb-joseki.war file can now be built by running the command:

mvn package 

Deploy the sdb-joseki.war file to Tomcat to provide a SPARQL endpoint querying the default graph, i.e., the data loaded into SDB. Test that the SPARQL endpoint is working by opening the form at http://localhost:8080/sdb-joseki/sparql.html and submitting the query ‘CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o }’.
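
The same query can also be sent from code using the SPARQL Protocol, which passes the query text as a URL-encoded query parameter on an HTTP GET. The sketch below only builds the request; it assumes the service is mapped at http://localhost:8080/sdb-joseki/sparql (the path is an assumption based on the form's location, so check the servlet mappings in web.xml), and the class name SparqlGet is hypothetical. It uses the Java standard library only (Java 11 or later).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SparqlGet {
    // Assumed endpoint URL; adjust to match the servlet mapping in WEB-INF/web.xml.
    static final String ENDPOINT = "http://localhost:8080/sdb-joseki/sparql";

    /** Builds the GET URI for a SPARQL Protocol request: endpoint?query=<encoded query>. */
    static URI queryUri(String endpoint, String sparql) {
        return URI.create(endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8));
    }

    /** Builds (but does not send) a GET request asking for an RDF/XML response. */
    static HttpRequest request(String endpoint, String sparql) {
        return HttpRequest.newBuilder(queryUri(endpoint, sparql))
                .header("Accept", "application/rdf+xml")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = request(ENDPOINT, "CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o }");
        // Print the encoded URI; to execute the query, pass the request to
        // java.net.http.HttpClient#send once the WAR is deployed to Tomcat.
        System.out.println(req.uri());
    }
}
```

Printing the URI first is a convenient way to check the encoding, because the resulting address can be pasted into a browser against the running endpoint.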

In Firefox, with the Tabulator Extension installed, the default graph is displayed as per Figure 2.

Another query to try is “CONSTRUCT {?s ?p ?o} WHERE {?s ?p ?o . ?s a }”, which results in Figure 3.


Figure 2. Default Graph: How the default graph appears with the Tabulator Extension installed.
 
Figure 3. Listing Of People: A certain query provides a listing of people.

At this point, you have laid the foundation of a semantic web application: a Maven web application project that uses Jena SDB as an RDF triple store backed by a relational database and exposes a SPARQL endpoint via Joseki. The next article in this series will discuss adding Spring Security and JAX-RS to enable the publication of public and private Linked Data.
