What's in a URI? : Page 4

Have you ever wondered about the syntax of a web resource name? Take a look at the semantic web through one of its lowest-level specifications.

Resolving References
Now that you have a better understanding of the motivation behind the different naming schemes, how do you go about resolving them? URLs are easy: the network transport is chosen, the host is identified, a default or specified port is selected, and a connection is made. For the http scheme, a resource is usually retrieved through a GET or submitted through a POST.
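As a rough illustration of those steps, the following Python sketch pulls the scheme, host, and port out of a URL and falls back to a default port when none is specified. The function name connection_target and the small DEFAULT_PORTS table are assumptions made for this example; they are not part of any specification.

```python
from urllib.parse import urlsplit

# Well-known default ports for a few schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def connection_target(url):
    """Return the (scheme, host, port) a client would connect to for a URL.

    Host and port come back as None when the identifier carries no
    network location at all.
    """
    parts = urlsplit(url)
    # Use the explicit port if one was given, otherwise the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


if __name__ == "__main__":
    print(connection_target("http://purl.org/dc/elements/1.1/title"))
    print(connection_target("http://example.org:8080/some/resource"))
```

Once the host and port are known, the client opens the connection and issues the request (a GET, for instance) against the path portion of the URL.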

What about URIs in general? Here you get back to the difference between identification and addressing. Without some additional infrastructure, you have no way of locating URIs that aren't URLs. However, as mentioned previously, URLs make unreliable names.

URNs are fabulous at naming things in generic, long-lived ways, but they have a serious problem: there is no support for converting a pure name into a location on the web or in a corporate enterprise environment. You could certainly constrain the problem for a particular system and build your own resolution mechanism, but there is no clicking with URNs and clicking is what has made the web what it is. Your grandma clicks. (Well, at least my wife's grandma does.)

Solving this problem was one of the main motivations behind purl.org's Persistent Uniform Resource Locator (PURL), a service that has been run by the Online Computer Library Center (OCLC) in Dublin, OH for 12 years. Users may register with the PURL system and create URIs that map to a URL. The name begins as a part of a URL that serves as both the identifier and a path to a resolver.

For example, the Dublin Core vocabulary has been defined as a series of PURLs. You can refer to the title term through a PURL: http://purl.org/dc/elements/1.1/title. If you attempt to resolve that link, you'll be taken to the definition of the RDF vocabulary. That location can change over time. As long as the PURL maintainer keeps the link current, this URL will remain stable and serve its purpose as both an identifier and an address.
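The level of indirection can be sketched in a few lines of Python. This is only a toy model of a resolver, not the OCLC implementation: the names PURL_TABLE, resolve, and move are hypothetical, the target URLs are placeholders, and the use of a 302 redirect is an assumption made for illustration.

```python
# Hypothetical registry mapping a PURL path to the resource's current location.
# The target URL is a placeholder, not the real home of the Dublin Core terms.
PURL_TABLE = {
    "/dc/elements/1.1/title": "http://example.org/vocab/dc-elements.rdf#title",
}


def resolve(path, table=PURL_TABLE):
    """Return (status, headers) for a request to a PURL path."""
    target = table.get(path)
    if target is None:
        return 404, {}
    # Redirect the client to wherever the resource currently lives.
    return 302, {"Location": target}


def move(path, new_target, table=PURL_TABLE):
    """The maintainer updates the target; the PURL itself never changes."""
    table[path] = new_target


if __name__ == "__main__":
    print(resolve("/dc/elements/1.1/title"))
    move("/dc/elements/1.1/title", "http://example.net/new-home/dc.rdf#title")
    print(resolve("/dc/elements/1.1/title"))
```

Anyone who recorded the PURL keeps a working reference after the move, because only the table entry changed.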

It is now possible to make logical references to elements that move. A major goal of the semantic web is to be able to accumulate metadata about both information and non-information resources. You are no doubt familiar with the Sisyphean comedy of keeping address books current. The problem is that you are constantly ingesting a physical representation of a logical resource.

What you'd really like to know is, where in the world is http://purl.org/name/WaldoJohanssen? You would need a naming scheme that allowed for people with the same name, but job changes, marriages, physical moves, title changes, and so on could all be managed with this level of indirection. You can imagine an address book that works only with URIs being much easier to maintain. When you need the information, you simply ask for it.

What about the previously discussed problem of referring to a concept versus referring to a document about the concept? How do you know whether you are interpreting metadata about Mr. Johanssen or a document about him? With only a simple HTTP redirect you still run into this problem.

Recently, Zepheira, a provider of products and services for the semantic web, was contracted to modernize the architecture of the PURL service for scalability and to support new features to solve this problem. The W3C Technical Architecture Group (TAG) came up with something of a compromise to manage this signifier/signified problem, and Zepheira intended to extend the OCLC PURL service to support it.

The HTTP code 303 ("See Other") will be used in place of the 200 code to respond to requests for non-information resources. In a way, the PURL service will be saying, "I acknowledge your request; yes, there is something here, but not a document." Ultimately, it will allow systems to be built that can tell the difference between information resources (documents) and non-information resources (concepts that those documents may be about).
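From the client's side, the convention might be applied along these lines. The sketch below is a simplified reading of the 200-versus-303 distinction described above; the function name interpret and the labels it returns are assumptions made for this example, and a real client would need to handle many more status codes.

```python
def interpret(status, headers):
    """Classify what a URI identifies, based on the response to a GET.

    Returns a (label, related_document) pair. The labels are
    illustrative strings, not terms defined by HTTP.
    """
    if 200 <= status < 300:
        # A direct representation came back: the URI names a document.
        return "information resource", None
    if status == 303:
        # No representation of the thing itself; Location points to a
        # document that describes it.
        return "possibly a non-information resource", headers.get("Location")
    return "unknown", None


if __name__ == "__main__":
    # Placeholder target URL; the real service may answer differently.
    print(interpret(303, {"Location": "http://example.org/people/waldo.rdf"}))
    print(interpret(200, {}))
```

A metadata harvester could use the returned label to decide whether statements it finds are about the thing itself or about a document describing it.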

This new capability will be a powerful means of achieving a compromise on the identification/addressing conflation issues discussed here. You want good names, and you want stable and persistent ways to refer to the things they represent. You want a resolution mechanism that works within the software frameworks and protocols that have found their way into widespread use. With that resolution in place, you can start to realize more of the semantic web's potential on the web we already have.

Additional Resources

Brian Sletten is a liberal arts-educated software engineer with a focus on forward-leaning technologies. He has worked as a system architect, a developer, a mentor and a trainer. He has spoken at conferences around the world and writes about web-oriented technologies for several online publications. His experience has spanned the defense, financial and commercial domains. He has designed and built network matrix switch control systems, online games, 3D simulation/visualization environments, Internet distributed computing platforms, P2P and Semantic Web-based systems. He has a B.S. in Computer Science from the College of William and Mary and currently lives in Fairfax, VA. He is the President of Bosatsu Consulting, Inc., a professional services company focused on web architecture, resource-oriented computing, the Semantic Web, advanced user interfaces, scalable systems, security consulting and other technologies of the late 20th and early 21st Centuries.