

Creating and Managing RDF Vocabularies : Page 3

As organizations move toward describing domains of interest outside of their software systems, they'll need to decide whether to define important terms and concepts on their own or reuse more general vocabularies. See how semantic web tools can help with either strategy.

Do Not Insist on Rigor Up Front
Scientist, professor emeritus, and author Donald Knuth has for years warned about the evils of prematurely optimizing software. Until you understand the run-time characteristics of your software, you will not know where to expend the effort to get the biggest performance improvements. Spending time working on performance improvements without this knowledge is likely to be a wasted effort.

There is a similar problem with overconstraining terms in an RDF vocabulary. RDFS includes predicates (<rdfs:domain> and <rdfs:range>) that constrain which classes a property applies to and what kinds of values it takes. These constraints are undoubtedly helpful for production vocabularies, but time spent on them in the early stages of developing a vocabulary may well be wasted and is almost guaranteed to slow you down. Get the terms right, get some examples of using them under your belt, consider any feedback from external parties, and only then go about constraining your vocabularies. By then you will likely understand the constraints well enough to make good choices.
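
As a rough sketch of that progression, assume a hypothetical vocabulary at http://example.org/reviews with two invented terms, Book and reviewer. Neither the namespace nor the terms come from a published vocabulary. A first draft might carry only labels and comments, with the constraints added once the terms have proven themselves in use:

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://example.org/reviews">

  <!-- Hypothetical class: named and described, nothing more. -->
  <rdfs:Class rdf:ID="Book">
    <rdfs:label>book</rdfs:label>
  </rdfs:Class>

  <!-- Hypothetical property. The label and comment are the early draft. -->
  <rdf:Property rdf:ID="reviewer">
    <rdfs:label>reviewer</rdfs:label>
    <rdfs:comment>A person who has reviewed the book.</rdfs:comment>

    <!-- Added later, after real usage and outside feedback. -->
    <rdfs:domain rdf:resource="#Book"/>
    <rdfs:range rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
  </rdf:Property>

</rdf:RDF>
```

Leaving the last two statements out of the early drafts costs nothing. RDF data written against the unconstrained property remains valid when the constraints arrive, provided the constraints match how the term was actually used.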

Use Metadata to Describe Your Metadata
While you certainly want to avoid any "turtles all the way down" meta trips, it is a great idea to add metadata to your metadata. RDF vocabularies are themselves information resources that deserve suitable annotations.

Vocabularies will not always be consumed directly from the files in which they are created. Services like Swoogle parse known vocabularies to make their terms and concepts accessible through search. This parsing can be enabled by applying the <rdfs:isDefinedBy> predicate. The prior Dublin Core example demonstrates this link back to the source:

<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/elements/1.1/"/>

This approach makes it easier to track the definitions back to their origins if they are found in the wild.

Additionally, as vocabularies evolve, it is helpful to indicate the stability of specific terms, which gives consumers either confidence or a warning that dependence on a term might not be the best idea. The World Wide Web Consortium (W3C) has a set of terms that is useful for this very purpose.

There are three terms defined: <vs:term_status>, <vs:moreinfo>, and <vs:userdocs>. The metadata on <vs:term_status> itself tells you that it is an unstable term, although it should be safe enough to use:

<rdf:Property rdf:ID="term_status">
  <rdfs:label>term status</rdfs:label>
  <rdfs:comment>the status of a vocabulary term, one of 'stable','unstable','testing'.</rdfs:comment>
  <vs:term_status>unstable</vs:term_status>
</rdf:Property>
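
Applying the same annotation to one of your own terms might look like the sketch below. The vs prefix is bound to the W3C vocabulary status namespace. The vocabulary at http://example.org/reviews and its reviewer property are invented for illustration.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:vs="http://www.w3.org/2003/06/sw-vocab-status/ns#"
    xml:base="http://example.org/reviews">

  <!-- Hypothetical term, flagged so consumers know it may still change. -->
  <rdf:Property rdf:ID="reviewer">
    <rdfs:label>reviewer</rdfs:label>
    <rdfs:comment>A person who has reviewed the book.</rdfs:comment>
    <vs:term_status>testing</vs:term_status>
  </rdf:Property>

</rdf:RDF>
```

Once the term has settled, changing the value to stable is a one-line edit that tells consumers they can depend on it.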

Dublin Core extends the idea of metadata for metadata to include when terms were defined, when they were last modified, and what version they represent currently. Here's the relevant portion from the prior Dublin Core example:

<dcterms:issued>1999-07-02</dcterms:issued>
<dcterms:modified>2006-12-04</dcterms:modified>
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#creator-005"/>
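
To show where such statements sit in a complete file, here is a minimal sketch of a term that carries both a link back to its defining vocabulary and its own dates. The http://example.org/reviews namespace, the reviewer term, and both dates are made up for illustration. Only the rdfs and dcterms predicates are real.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xml:base="http://example.org/reviews">

  <rdf:Property rdf:ID="reviewer">
    <rdfs:label>reviewer</rdfs:label>
    <!-- Points back to the vocabulary that defines this term. -->
    <rdfs:isDefinedBy rdf:resource="http://example.org/reviews#"/>
    <!-- Illustrative dates: first published, then last revised. -->
    <dcterms:issued>2008-01-15</dcterms:issued>
    <dcterms:modified>2008-03-02</dcterms:modified>
  </rdf:Property>

</rdf:RDF>
```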

Reuse and Extend Existing Terms
When the terms in a vocabulary are well described, as in the foregoing discussion, it is easier for someone else to reuse them appropriately. Your vocabulary may need to introduce some new concepts, but that doesn't mean you must invent all new terms.

As an example, Edd Dumbill, noted columnist, author, and creator of the DOAP vocabulary, chose to reuse <foaf:Person> in DOAP to refer to the maintainers of a project:

<maintainer>
  <foaf:Person>
    <foaf:name>Edd Dumbill</foaf:name>
    <foaf:homepage rdf:resource="http://usefulinc.com/edd" />
  </foaf:Person>
</maintainer>

He certainly could have created a new notion of a person in this role, but there was simply no need to. RDF quite ably supports this mixing and matching of terms from different vocabularies and namespaces; it is one of its chief charms.

Even if it is necessary to introduce a new term, it is reasonable to tie it back into an existing vocabulary. To model the world of comic book authors, you might extend <dc:creator> through an <rdfs:subPropertyOf> relationship for <askew:illustrator> and <askew:inker> (or <askew:tracer>).
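
One way this could look in RDF/XML is sketched below. The http://example.org/askew namespace is a placeholder, since the real location of the askew vocabulary is not given here, and the labels are assumptions. Because dc:creator is a property rather than a class, the sketch ties the new terms to it with rdfs:subPropertyOf.

```xml
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://example.org/askew">

  <!-- Every illustrator of a work is also a creator of that work. -->
  <rdf:Property rdf:ID="illustrator">
    <rdfs:label>illustrator</rdfs:label>
    <rdfs:subPropertyOf rdf:resource="http://purl.org/dc/elements/1.1/creator"/>
  </rdf:Property>

  <!-- Likewise for inkers. -->
  <rdf:Property rdf:ID="inker">
    <rdfs:label>inker</rdfs:label>
    <rdfs:subPropertyOf rdf:resource="http://purl.org/dc/elements/1.1/creator"/>
  </rdf:Property>

</rdf:RDF>
```

An RDFS-aware consumer that has never seen the askew terms can still infer a dc:creator statement from each askew:illustrator or askew:inker statement. That inference is the practical payoff of tying new terms to an established vocabulary.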

While defining these files with nothing more than a good text editor is convenient, most people will want better tool support for creating and managing RDF vocabularies and their attendant metadata. There are several tools available to assist you with this process (see the sidebar, "Vocabulary Management Tools").

This discussion covered some good strategies for deciding whether to create your own vocabularies or to seek consensus with others from your domains of interest. The W3C's semantic web technologies are designed to make it relatively easy to start with an approach that makes sense to you and your organization and to consider external vocabularies at some future date.

Brian Sletten is a liberal arts-educated software engineer with a focus on forward-leaning technologies. He has worked as a system architect, a developer, a mentor and a trainer. He has spoken at conferences around the world and writes about web-oriented technologies for several online publications. His experience has spanned the defense, financial and commercial domains. He has designed and built network matrix switch control systems, online games, 3D simulation/visualization environments, Internet distributed computing platforms, P2P and Semantic Web-based systems. He has a B.S. in Computer Science from the College of William and Mary and currently lives in Fairfax, VA. He is the President of Bosatsu Consulting, Inc., a professional services company focused on web architecture, resource-oriented computing, the Semantic Web, advanced user interfaces, scalable systems, security consulting and other technologies of the late 20th and early 21st Centuries.