RDF Store Overview
You have several options to consider when choosing an RDF database for your application. AllegroGraph is a commercial option that boasts impressive performance as well as other value-added features such as a built-in reasoner and support for federated databases. Three popular open source options are Jena, Sesame, and Mulgara.
Jena's traditional database layout is RDB and was optimized for the Jena Model API. A new Jena component, SDB, is being developed to offer an alternative database layout optimized for larger patterns such as those you would typically execute when performing SPARQL queries. SDB is currently in beta, so Jena references in this article refer to the RDB database layout unless otherwise noted.
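To make the difference between statement-level access and "larger patterns" concrete, here is a minimal sketch in plain Python. It is not Jena, RDB, or SDB code; the `match` and `query` helpers, the `ex:` identifiers, and the sample data are all assumptions made for illustration. A single pattern resembles the lookups a model-style API performs, while a list of patterns joined on shared variables resembles the basic graph patterns that SPARQL queries express.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables
Binding = Dict[str, str]

# Hypothetical sample data; the ex: names are made up for this sketch.
TRIPLES: Set[Triple] = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
}


def match(pattern: Pattern, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns: List[Pattern], triples: Set[Triple]) -> List[Binding]:
    """Join the patterns left to right on their shared variables."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Statement-level lookup: a single pattern.
known_by_alice = query([("ex:alice", "ex:knows", "?who")], TRIPLES)

# Larger pattern: three triple patterns joined on ?x and ?y.
friend_of_friend_names = query(
    [
        ("ex:alice", "ex:knows", "?x"),
        ("?x", "ex:knows", "?y"),
        ("?y", "ex:name", "?name"),
    ],
    TRIPLES,
)
```

The naive nested loop scans every triple for every pattern, which is exactly the kind of cost a storage layout designed for multi-pattern queries tries to avoid.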
The aspects you will want to consider when selecting an RDF database are:
- Database compatibility
- API compatibility
- Load and query performance
- Tool support
- Inferencing support
AllegroGraph is itself a database and as such doesn't rely on or truly integrate with traditional relational databases. It does, however, offer the ability to back up its operations to a relational database. Sesame and Jena, on the other hand, are not databases; they're toolkits for working with RDF data. Both have the ability to either sit on top of popular databases (see Table 1) or use file-based and memory-based modes.
Table 1. Database Compatibility: The table shows the relational databases supported by four popular RDF databases.
Sesame and Jena both allow you to store RDF triples. The schema that each uses to store the triples is proprietary, but each exposes an API to manage and query the stored RDF data. You can access AllegroGraph repositories through both the Sesame and Jena APIs.
For most of these RDF databases, you can access the store through either the Jena API or the Sesame API; Table 2 lists no Sesame API access for Mulgara. The Jena Sesame Model project allows developers to access Sesame databases through Jena's model abstraction. Conversely, the Sesame-Jena Adapter project provides access to Jena models through the Sesame API. Although you can use either, you will generally be better off using the Jena API to access Jena databases and the Sesame API to access Sesame databases. You may want to factor in this affinity when weighing the trade-offs among RDF databases (see Table 2).
Table 2. API Compatibility: You can access most of the RDF databases analyzed here through both the Jena and Sesame APIs.
| RDF Database | Jena API | Sesame API |
|---|---|---|
| Jena | Yes | Yes (via Sesame-Jena Adapter) |
| Sesame | Yes (via Jena Sesame Model) | Yes |
| AllegroGraph | Yes (via AllegroGraph interfaces) | Yes (via AllegroGraph interfaces) |
| Mulgara | Yes (also exposes its own JRDF API) | No |
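The cross-API projects named above follow the familiar adapter pattern: one toolkit's interface is implemented by delegating to the other toolkit's store. The following sketch shows the idea only. `SesameStyleRepository` and `JenaStyleModel` are hypothetical Python classes invented for this illustration; they are not the real Jena or Sesame APIs, nor the code of either adapter project.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class SesameStyleRepository:
    """Hypothetical store with a Sesame-flavored interface (not the real API)."""

    def __init__(self) -> None:
        self._statements: Set[Triple] = set()

    def add_statement(self, subj: str, pred: str, obj: str) -> None:
        self._statements.add((subj, pred, obj))

    def get_statements(
        self,
        subj: Optional[str] = None,
        pred: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        # None acts as a wildcard for that position.
        for s, p, o in self._statements:
            if subj in (None, s) and pred in (None, p) and obj in (None, o):
                yield (s, p, o)


class JenaStyleModel:
    """Hypothetical adapter: a Jena-flavored model view over the repository."""

    def __init__(self, repository: SesameStyleRepository) -> None:
        self._repository = repository

    def add(self, s: str, p: str, o: str) -> "JenaStyleModel":
        self._repository.add_statement(s, p, o)
        return self  # returning self allows chained add() calls

    def list_statements(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> List[Triple]:
        return sorted(self._repository.get_statements(s, p, o))

    def contains(self, s: str, p: str, o: str) -> bool:
        return any(True for _ in self._repository.get_statements(s, p, o))


repository = SesameStyleRepository()
model = JenaStyleModel(repository)
model.add("ex:alice", "ex:knows", "ex:bob").add("ex:alice", "ex:name", "Alice")
```

Every call on the model costs an extra layer of translation, which is one plausible reason to prefer each store's native API when you have the choice.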
Obviously, load and query performance are among the biggest factors affecting any RDF database selection. Performance benchmarking and tuning are always very contextual regardless of the technology being considered. I urge you to perform your own benchmarks in your own network and with your own hardware, datasets, and query types. Consult these links for performance benchmarks reported by the respective RDF database providers:
Tool support is another important consideration when choosing technologies; RDF databases provide different types and levels of tooling around the core function of managing RDF data. Sesame ships with graphical tools to manage a Sesame server, and supports load, query, and explore operations via a web interface. Although Jena offers only command-line management utilities, several related projects can help you manage Jena RDF databases:
- Joseki lets you query RDF files and databases online.
- Twinkle provides a GUI for executing SPARQL queries against RDF files.
- TopBraid Composer is a powerful ontology editor that can access Jena, Sesame, and AllegroGraph RDF stores.
Table 3 shows some available tool support for the RDF databases discussed here.
Table 3. Tool Support: These RDF databases provide a spectrum of tooling support—from command-line utilities to graphical UIs.
||Command-line tools and Java API
||Joseki and Twinkle
||GUI and Web Interface
||Java, HTTP, Lisp
||Java, HTTP, Lisp
||Java, Perl, iTQL
||Java, Perl, iTQL
Another potentially important consideration when evaluating RDF databases is the query languages they support. All the popular RDF databases explored here offer a proprietary query language into RDF data, but not all offer support for SPARQL, an emerging standard RDF query language. Table 4 highlights the differences in support for RDF query languages among the various tools:
Table 4. RDF Query Language Support: One distinguishing characteristic of RDF databases lies in their support for SPARQL.
||Native RDF Query Language
Inferencing support is yet another important characteristic to consider when selecting an RDF database. Sesame and AllegroGraph notably provide optional inferencing front ends that can dynamically create entailments during database operations and can insert these additional entailments along with the asserted statements into the database. Jena features a robust and highly configurable inference engine, but at this time you can't configure it as a front end to an RDF database. Fortunately, there's a relatively simple workaround; you can create your own entailments using Jena's inference engine and add those into your RDF database explicitly.
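The workaround amounts to materializing entailments yourself and then storing them like any other statements. The sketch below illustrates that flow in plain Python under stated assumptions: it does not call Jena's inference engine, it applies only two RDFS rules (subclass transitivity and type propagation), and the `rdfs_entailments` function, the `store` set, and the `ex:` data are hypothetical names chosen for the example.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_entailments(asserted: Set[Triple]) -> Set[Triple]:
    """Return triples entailed by two RDFS rules that are not already asserted."""
    closure = set(asserted)
    changed = True
    while changed:  # repeat until no rule produces a new triple
        changed = False
        derived = set()
        for s, p, o in closure:
            if p != SUBCLASS_OF:
                continue
            for s2, p2, o2 in closure:
                # Rule 1: subClassOf is transitive.
                if p2 == SUBCLASS_OF and s2 == o:
                    derived.add((s, SUBCLASS_OF, o2))
                # Rule 2: an instance of a subclass is an instance of the superclass.
                if p2 == RDF_TYPE and o2 == s:
                    derived.add((s2, RDF_TYPE, o))
        if not derived <= closure:
            closure |= derived
            changed = True
    return closure - asserted


# Stand-in for the RDF database: a plain set of asserted statements.
store: Set[Triple] = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}

entailed = rdfs_entailments(store)

# Add the entailments explicitly, alongside the asserted statements.
store |= entailed
```

One consequence of this approach is that the stored entailments are a snapshot: if you later change or remove asserted statements, you would need to recompute and reconcile the derived ones yourself.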