State of the Semantic Web: Know Where to Look : Page 2

Those looking for evidence of progress on the Semantic Web do not have to look far. Several major projects and companies are embracing the vision and technology stack like never before.


Emerging Infrastructure
The specifications behind the Semantic Web provide the ability to encode, link, and reason about data. Historically, it was impossible to tell from an unqualified URL whether it named a document or a reference to a non-network-addressable resource such as a person or a concept. The W3C Technical Architecture Group (TAG) has recently decided that these non-network resources can be given URIs/URLs, and that an infrastructure resolving such a reference can signal its special status by returning an HTTP response code of 303 (See Other) instead of 200 (OK).
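To make the TAG guidance concrete, here is a minimal sketch of a resolver that follows it. The paths, concept names, and document bodies are hypothetical placeholders, not part of any real PURL deployment:

```python
# Sketch of the TAG guidance: return 200 for network-addressable
# documents, but 303 (See Other) for URIs that name non-network
# resources (concepts), redirecting to a document that describes them.

CONCEPTS = {
    # hypothetical concept URI -> document describing the concept
    "/id/person/alice": "/doc/person/alice",
}

DOCUMENTS = {"/doc/person/alice": "<html>About Alice</html>"}

def resolve(path):
    """Return a (status, headers, body) triple for a requested path."""
    if path in CONCEPTS:
        # The URI names a concept, not a document: redirect with 303.
        return (303, {"Location": CONCEPTS[path]}, "")
    if path in DOCUMENTS:
        return (200, {}, DOCUMENTS[path])
    return (404, {}, "")
```

A client that receives the 303 knows the original URI identified a thing rather than a document, and can follow the `Location` header to a description of that thing.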

The Online Computer Library Center's (OCLC) Persistent URL (PURL) infrastructure was recently rearchitected for scalability and a series of new features, including support for this 303 guidance. The new version lays down key infrastructure for assigning good, resolvable names to terms and concepts, something that has been sorely missing from the Semantic Web technology stack. As such, the new system can define concepts to disambiguate RDF subjects: URIs can be given to proteins, people, legislation, places, and so on. While historically you may have chosen a pseudo-canonical URL from a site such as Wikipedia, it is now possible to define a new canonical URL for the terms and subjects that matter to your organization.
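The pattern of minting stable URIs for your own terms and then making RDF statements about them can be sketched in a few lines. The namespace, vocabulary term, and concept below are illustrative assumptions, not real PURL entries:

```python
# Sketch: minting resolvable URIs for organizational terms and emitting
# RDF statements about them as N-Triples lines.

BASE = "http://purl.example.org/id/"  # hypothetical PURL-style namespace

def mint_uri(term):
    """Give a concept a stable, resolvable URI under our namespace."""
    return BASE + term.lower().replace(" ", "-")

def triple(subject_uri, predicate_uri, obj):
    """Serialize one RDF statement as an N-Triples line."""
    if obj.startswith("http://"):
        return f"<{subject_uri}> <{predicate_uri}> <{obj}> ."
    return f'<{subject_uri}> <{predicate_uri}> "{obj}" .'

hemoglobin = mint_uri("Hemoglobin")
label = triple(hemoglobin,
               "http://www.w3.org/2000/01/rdf-schema#label",
               "Hemoglobin")
```

Once such a URI exists, any system inside or outside your organization can make statements about the same subject without ambiguity.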

Embeddable Semantic Web Applications
Thomson Reuters runs a free service, called OpenCalais, for identifying terms and concepts within unstructured text. With plugins such as Gnosis for Firefox, you can apply the OpenCalais service directly to the pages you visit to identify people, places, organizations, industries, and more, even on sites that do not publish information with support for GRDDL, microformats, or RDFa. These extracted terms can then be linked back into other data sources, automating the process of extracting information as you surf the Web. This service is a step toward a larger vision; Thomson Reuters' CEO has even caught Semantic Web skeptic Tim O'Reilly's attention with his view of where this is going.
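The extract-then-link pattern that OpenCalais automates can be illustrated with a toy gazetteer lookup. The entity names and URIs here are hypothetical, and a real service does far more than substring matching:

```python
# Toy sketch of the extract-then-link pattern: spot known entity names
# in unstructured text and attach canonical URIs to them.

GAZETTEER = {
    "Thomson Reuters": "http://example.org/id/org/thomson-reuters",
    "New York": "http://example.org/id/place/new-york",
}

def link_entities(text):
    """Return {surface form: URI} for every known entity found in text."""
    return {name: uri for name, uri in GAZETTEER.items() if name in text}

found = link_entities("Thomson Reuters is headquartered in New York.")
```

The payoff is that once an extracted term carries a URI, it can be joined against any other data source that uses the same identifier.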



Another Firefox plugin, Solvent from the SIMILE project, makes it easy to compose lightweight, shareable screen scrapers that extract content from arbitrary pages. This highlights that, while it is great when sites support Semantic Web technologies, the success of the vision does not require everyone to get on board. Automated and semi-automated extraction are key approaches to linking content in both structured and unstructured forms.
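A lightweight screen scraper in the spirit of Solvent can be sketched with nothing but the standard library. This example (with made-up HTML) pulls headline text out of an arbitrary page:

```python
# Minimal screen-scraper sketch: collect the text of every <h2> element
# from an HTML page using only the standard library.
from html.parser import HTMLParser

class HeadlineScraper(HTMLParser):
    """Collect the text content of every <h2> element."""
    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2 and data.strip():
            self.headlines.append(data.strip())

scraper = HeadlineScraper()
scraper.feed("<html><h2>Emerging Infrastructure</h2><p>...</p>"
             "<h2>Learning More</h2></html>")
```

The extracted strings could then be fed through an entity linker or emitted as RDF, turning an ordinary page into structured data without the site's cooperation.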

Support By Open Source and Commercial Organizations
One of the major barriers to adoption of semantic technologies has been the lack of support in software. There have always been quality parsing, producing, and querying APIs, but major software initiatives have generally taken a wait-and-see approach. That barrier is eroding as major open source initiatives such as Drupal and Mozilla commit to supporting RDF and SPARQL.

Perhaps more valuable than adoption by Open Source projects is the long-anticipated support for the technologies by major commercial software players. This too has finally come to pass. Oracle was one of the first major vendors to adopt RDF and OWL in its database engines. It cleverly co-opted its existing Spatial Engine (with its network data model) to support the graph models of RDF. It is now possible to mix RDF and non-RDF data within the same database engine.
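The graph model these engines store can be pictured as plain (subject, predicate, object) triples. Here is a minimal, illustrative sketch of triple-pattern matching, the core operation behind a SPARQL query, using made-up data; a real store adds indexing, joins, and inference on top of this idea:

```python
# Sketch of the RDF graph model: data as (subject, predicate, object)
# triples, queried with patterns where '?variables' are left unbound.

TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob",   "worksFor", "acme"),
    ("acme",  "locatedIn", "cleveland"),
]

def match(pattern, triples=TRIPLES):
    """Return variable bindings for each triple matching the pattern,
    roughly what a single SPARQL basic graph pattern does."""
    results = []
    for t in triples:
        binding = {}
        for p, v in zip(pattern, t):
            if p.startswith("?"):
                binding[p] = v       # bind the variable to this value
            elif p != v:
                break                # constant mismatch: skip triple
        else:
            results.append(binding)
    return results

employees = match(("?who", "worksFor", "acme"))
```

Because both RDF and relational rows can live in the same engine, a query can mix graph patterns like this with ordinary tabular data.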

Industry giants Yahoo! and Microsoft have also been making announcements and acquisitions in this space. Google is promoting interoperability in the social networking world through OpenSocial, while MySpace, eBay, Twitter, and Yahoo! are pursuing DataPortability initiatives.

New technology companies have emerged along the way with tools to help developers, knowledge workers, and other organizational stakeholders build software systems around these ideas. TopQuadrant's TopBraid Suite, Franz's AllegroGraph, the Talis Platform, Thetus Publisher, and OpenLink's Virtuoso server are among the leaders of these emerging markets.

Companies like Zepheira, LLC, Semantic Arts, and Sandpiper Software are working with major corporations around the world to adopt these ideas within their organizations with training, strategic guidance, and implementation assistance.

The pain of failed Enterprise Application Integration (EAI) and Service-Oriented Architecture (SOA) initiatives is driving financial services, news media, insurance, and other conservative industries to look for new solutions to their IT needs. These industries are studying the successes of the Web and want to know how to adopt those ideas internally. The Cleveland Clinic is a leader in adopting Semantic Web technologies to better meet the needs of its patients. The clinic aims to lower its IT costs, add business functionality, and step off the technology-flux treadmill; its goal is not to use semantic technologies for their own sake, but to use them because they are viable solutions to these problems.

Learning About Semantic Technologies
Developers, managers, and executives can learn about Semantic Web technologies at major industry conferences; this year, relevant content has appeared at a number of them.

New books are being published to help people navigate these technologies, including general overviews as well as deeper treatments of RDF and of modeling with OWL.

Conclusion
Semantic Web technologies are here in many important ways, and you are most likely using them daily, even if only indirectly. The success of these technologies is not simply a question of everyone adopting the same models and the same terms; it is about a rich and vibrant ecosystem of data, documents, and software tied together in useful ways.



Brian Sletten is a liberal arts-educated software engineer with a focus on forward-leaning technologies. He has worked as a system architect, a developer, a mentor and a trainer. He has spoken at conferences around the world and writes about web-oriented technologies for several online publications. His experience has spanned the defense, financial and commercial domains. He has designed and built network matrix switch control systems, online games, 3D simulation/visualization environments, Internet distributed computing platforms, P2P and Semantic Web-based systems. He has a B.S. in Computer Science from the College of William and Mary and currently lives in Fairfax, VA. He is the President of Bosatsu Consulting, Inc., a professional services company focused on web architecture, resource-oriented computing, the Semantic Web, advanced user interfaces, scalable systems, security consulting and other technologies of the late 20th and early 21st Centuries.