Storing and Using RDF in Mulgara: Page 4

The semantic web is about machine-processable metadata. As you accumulate this information, where do you plan on putting it, and how do you plan on accessing it? Check out this open source solution.


Using Mulgara from Java
Even though you'll still be using iTQL indirectly, you will probably want to drive Mulgara through its Java API. While this API isn't as rich as, say, the Elmo API from the Sesame project (ahem), it is a useful abstraction when you're working from within a Java application.

The sample application here will spider Friend-of-a-Friend (FOAF) files on the Web. FOAF is another RDF vocabulary, this one for describing social networks, professional relationships, and so on. While it doesn't have the same user base as some of its closed-model brethren (LinkedIn and MySpace), the community has been growing and continues to do so.

From Java, you communicate with a Mulgara instance through the ItqlInterpreterBean class, which you can find in the driver-1.1.0.jar file in the unpacked distribution. Through this interface you can create models, run queries, and update data, either on the same machine or on different machines, as long as the name bound to the server instance is visible from those other machines (that is, not localhost). The application starts by loading a FOAF file into the Mulgara instance and then querying it for all new FOAF file references it has discovered. It then repeats the process until there are no new entries. Depending on whom you know, this process might go on for hours!
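
The load-then-query loop described above can be sketched without any Mulgara classes. What follows is not the article's listing, just a minimal worklist illustration. The class name FoafSpiderSketch, the crawl method, and the loadAndFindLinks function are all hypothetical: loadAndFindLinks stands in for "load this URL into the store, then ask the store which FOAF file references it now knows about", which in a real application would presumably be done with iTQL commands sent through ItqlInterpreterBean. The example.org URLs are made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class FoafSpiderSketch {

    /**
     * Visits every FOAF document reachable from the seed URL.
     * loadAndFindLinks is a hypothetical stand-in for the Mulgara calls:
     * it loads one document and returns the FOAF file URLs found in it.
     */
    public static Set<String> crawl(String seed,
                                    Function<String, List<String>> loadAndFindLinks) {
        Set<String> visited = new LinkedHashSet<>(); // keeps discovery order
        Deque<String> pending = new ArrayDeque<>();
        pending.add(seed);

        // Repeat until no new entries turn up.
        while (!pending.isEmpty()) {
            String url = pending.poll();
            if (!visited.add(url)) {
                continue; // already loaded, skip it
            }
            for (String next : loadAndFindLinks.apply(url)) {
                if (!visited.contains(next)) {
                    pending.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" so the sketch runs without a server.
        Map<String, List<String>> fakeWeb = Map.of(
            "http://example.org/alice.rdf", List.of("http://example.org/bob.rdf"),
            "http://example.org/bob.rdf", List.of("http://example.org/alice.rdf",
                                                  "http://example.org/carol.rdf"),
            "http://example.org/carol.rdf", List.of());

        Set<String> seen = crawl("http://example.org/alice.rdf",
                                 url -> fakeWeb.getOrDefault(url, List.of()));
        System.out.println(seen);
    }
}
```

The visited set is what keeps the process from looping forever when two people list each other, which is the normal case in a social network.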



There are plenty of optimizations that you can make to the code shown in Listing 1, but in the interest of keeping it simple they were left out.

After you have your network graph in place, there is a tremendous opportunity to ask interesting questions of what you've found. The benefit of the RDF graph model is that you don't need to know what you might find beforehand. You can simply start to query the results, and see what is there. No need for schemas here!

After poking around a bit, you might see that some people put into their profiles the name of the school they went to. If you want to see a list of where everyone who shares this information went to school, you would use a query like this:

select $school from <rmi://localhost/server1#foaf> where $subject
<http://xmlns.com/foaf/0.1/schoolHomepage> $school;

If you want to track alumni from your own school, you can constrain the results to subjects that have a particular URI like this:

select $alumnus from <rmi://localhost/server1#foaf> where $alumnus
<http://xmlns.com/foaf/0.1/schoolHomepage> <http://www.wm.edu>;

If you want to find out what people were interested in, you might try this:

select $who $interest from <rmi://localhost/server1#foaf> where $who
<http://xmlns.com/foaf/0.1/interest> $interest;
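
From a Java application, queries like the three above are ordinary strings, so it can help to assemble them in one place. Here is a minimal sketch, assuming single-constraint queries only. The FoafQueries class and its term and select helpers are hypothetical names, not part of the Mulgara API, and handing the resulting string to ItqlInterpreterBean is deliberately left out so the example runs without the Mulgara jar.

```java
public class FoafQueries {

    /** Namespace of the FOAF vocabulary, as used in the queries above. */
    public static final String FOAF = "http://xmlns.com/foaf/0.1/";

    /** Wraps a URI in angle brackets; leaves $variables untouched. */
    public static String term(String t) {
        return t.startsWith("$") ? t : "<" + t + ">";
    }

    /** Builds a single-constraint iTQL select statement. */
    public static String select(String vars, String model,
                                String subject, String predicate, String object) {
        return "select " + vars + " from <" + model + "> where "
            + term(subject) + " " + term(predicate) + " " + term(object) + ";";
    }

    public static void main(String[] args) {
        String model = "rmi://localhost/server1#foaf";

        // Where did everyone go to school?
        System.out.println(
            select("$school", model, "$subject", FOAF + "schoolHomepage", "$school"));

        // Who lists a particular school?
        System.out.println(
            select("$alumnus", model, "$alumnus", FOAF + "schoolHomepage",
                   "http://www.wm.edu"));

        // Who is interested in what?
        System.out.println(
            select("$who $interest", model, "$who", FOAF + "interest", "$interest"));
    }
}
```

The only rule the helper encodes is the one visible in the queries themselves: variables start with a dollar sign, and everything else is a URI in angle brackets.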

Hopefully, you see the power of the directed graph model as a way of supporting the open-world assumption. You can always add new facts about known subjects without having to migrate your existing data. It becomes very easy to query these complicated datasets with powerful questions, even as you learn what facts are represented in them. The PURL system mentioned in "What's in a URI?" (DevX, July 19, 2007) is a useful way to name subjects that you want to accumulate facts about.
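
To make the "no migration" point concrete, here is a toy in-memory triple list. It is an illustration of the graph model only and says nothing about how Mulgara stores data. The TripleSketch class, its add and match methods, and the ex: and foaf: shorthand strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TripleSketch {

    // Each entry is {subject, predicate, object}.
    private final List<String[]> triples = new ArrayList<>();

    /** Records one fact; no schema has to be declared first. */
    public void add(String subject, String predicate, String object) {
        triples.add(new String[] {subject, predicate, object});
    }

    /** Returns matching triples; a null argument acts as a wildcard. */
    public List<String[]> match(String subject, String predicate, String object) {
        List<String[]> out = new ArrayList<>();
        for (String[] t : triples) {
            boolean ok = (subject == null || subject.equals(t[0]))
                      && (predicate == null || predicate.equals(t[1]))
                      && (object == null || object.equals(t[2]));
            if (ok) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TripleSketch graph = new TripleSketch();
        graph.add("ex:alice", "foaf:name", "Alice");

        // Later, a new kind of fact shows up. Nothing existing changes.
        graph.add("ex:alice", "foaf:schoolHomepage", "http://www.wm.edu");

        // Ask what is known about the subject without knowing the predicates.
        for (String[] t : graph.match("ex:alice", null, null)) {
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
        }
    }
}
```

Adding the second fact required no change to the first one and no table definition, which is the property the paragraph above relies on.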

While the Mulgara project needs to make progress on lowering the bar to adoption, improving its documentation and tutorials, and increasing the number of environments in which it can be used (many of these issues will be addressed in the next release), it is a powerful and scalable data store with many of the features you would expect from a commercial enterprise storage system.

Yet, it is by no means the only solution out there for storing and querying RDF data. Oracle has become a powerful player in the semantic technologies space, beginning with 10g Release 2. Other tools such as Redland, Jena, Sesame, and the Talis Platform are all established solutions that have their own advantages, and you are encouraged to play around with all of them.

RDF is becoming an important data model on the web and in the enterprise. Understanding how you can accumulate, store, and query information in this format is going to become an important part of working with the information systems of the twenty-first century. Mulgara is a great tool for beginning to learn how to do that.

Brian Sletten is a liberal arts-educated software engineer with a focus on forward-leaning technologies. He has worked as a system architect, a developer, a mentor and a trainer. He has spoken at conferences around the world and writes about web-oriented technologies for several online publications. His experience has spanned the defense, financial and commercial domains. He has designed and built network matrix switch control systems, online games, 3D simulation/visualization environments, Internet distributed computing platforms, P2P and Semantic Web-based systems. He has a B.S. in Computer Science from the College of William and Mary and currently lives in Fairfax, VA. He is the President of Bosatsu Consulting, Inc., a professional services company focused on web architecture, resource-oriented computing, the Semantic Web, advanced user interfaces, scalable systems, security consulting and other technologies of the late 20th and early 21st Centuries.