Ten Pitfalls of Enterprise Ontology Management : Page 3

As ontologists and business strategists incorporate semantic web technologies into large organizations, they go through natural growing pains. This article will help you through that process.

Mixing Processes for Semantics and Constraints
When designing web services, you typically create XML Schemas to validate incoming data elements. You can automate the creation of these XML Schemas by extracting a subset of the elements from your ontology. You can then transform that subset from OWL and RDF directly into XML Schema files that are imported into other XML Schemas or WSDL files.
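The extraction step can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the `ONTOLOGY` dictionary is a hypothetical stand-in for a real OWL/RDF store, and the element names, definitions, and the `subset_to_xsd` helper are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical stand-in for a corporate ontology:
# data element name -> (definition, XML Schema datatype).
ONTOLOGY = {
    "customerId": ("Identifier assigned to a customer at registration.", "xs:string"),
    "orderDate": ("Calendar date on which an order was placed.", "xs:date"),
    "orderTotal": ("Sum charged for an order, including tax.", "xs:decimal"),
}


def subset_to_xsd(ontology, names, root_name):
    """Build an XML Schema that contains only the selected data elements."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    root = ET.SubElement(schema, f"{{{XS}}}element", name=root_name)
    complex_type = ET.SubElement(root, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for name in names:
        definition, xsd_type = ontology[name]
        element = ET.SubElement(
            sequence, f"{{{XS}}}element", name=name, type=xsd_type
        )
        # Carry the ontology definition into the derived artifact.
        annotation = ET.SubElement(element, f"{{{XS}}}annotation")
        documentation = ET.SubElement(annotation, f"{{{XS}}}documentation")
        documentation.text = definition
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(subset_to_xsd(ONTOLOGY, ["customerId", "orderDate"], "Order"))
```

The ontology stays the single source of element names, types, and definitions. Each service picks only the subset it needs, and the generated schema can be regenerated whenever the ontology changes.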

Figure 3. Mapping Ontology into an XML Schema: This figure shows an example of an XML Schema diagram.
In the past, the process of creating an XML Schema was also used to define the meaning of data elements. When an XML Schema changed the definitions of the data elements, those definitions became mixed up with the constraints of that specific XML Schema structure. See Figure 3 for an example of an XML Schema diagram. This is where your ontologies come in. You can use ontologies as a central location to store the semantics, or meaning, of the data elements that live on the leaves of XML documents. When they are stored in a well-controlled, centralized corporate ontology, the definitions of the data elements go beyond the needs of a specific version of a web service. The data elements have individual histories:
  • Creation dates
  • Approval workflow status
  • Approval committees
  • Revisions and date-stamps of when they were approved for corporate usage
On the other hand, you should not view XML Schemas as containers of the semantics of data elements. XML Schemas are containers of the data elements, and each one expresses the order and cardinality of the collection of elements. XML Schemas are the constraints of a specific data exchange. For example, a single developer can add and delete web services for data subscribers. Your job as a corporate ontologist is to support such activities, maintain semantics, and get out of the way of a specific business unit that has its own set of required fields, which must be present as inputs to its web services.

So when people have questions about the meaning of data, that is the ontologist's signal to step in and bring the tools to build semantic precision. But if a problem has to do with which elements are present, what order they appear in within a transaction, which data elements are required, and which ones are optional, let the data publisher and subscriber work things out.
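One way to picture this division of labor is to keep two separate kinds of records: the meaning and history of each element in a central registry, and the per-exchange constraints next to each service. The Python sketch below is illustrative only. The class names, the field names, and the two example services are hypothetical and do not come from any particular toolkit.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataElement:
    """Semantics and history of one element, owned by the ontology team."""
    name: str
    definition: str
    created: date
    approval_status: str
    approved_by: str


@dataclass(frozen=True)
class FieldConstraint:
    """Constraint on one element within a single data exchange."""
    element: str
    required: bool


# Central registry: the only place where meaning is recorded.
REGISTRY = {
    "customerId": DataElement(
        "customerId", "Identifier assigned to a customer at registration.",
        date(2008, 1, 15), "approved", "Data Governance Committee"),
    "orderDate": DataElement(
        "orderDate", "Calendar date on which an order was placed.",
        date(2008, 2, 3), "approved", "Data Governance Committee"),
}

# Each business unit decides order and optionality for its own service.
BILLING_SERVICE = [FieldConstraint("customerId", True),
                   FieldConstraint("orderDate", True)]
MARKETING_SERVICE = [FieldConstraint("customerId", True),
                     FieldConstraint("orderDate", False)]


def missing_required(constraints, message):
    """Publisher/subscriber concern: which required fields are absent?"""
    return [c.element for c in constraints
            if c.required and c.element not in message]


def undefined_elements(constraints, registry):
    """Ontologist concern: which fields have no registered meaning?"""
    return [c.element for c in constraints if c.element not in registry]
```

The same message can be valid for one service and invalid for another without any change to the registry, which is the point: the constraints vary per exchange while the definitions stay put.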

Untested Upper Ontologies
Upper ontology classes are some of the most critical parts of your corporate ontology. These upper ontology classes are the root classes that are either direct subclasses of the OWL Thing class or second-level subclasses of Thing. Teams frequently get into heated arguments about the pros and cons of these upper ontologies, and there are many complex tradeoffs concerning their depth and breadth. Some of these issues are worthy of long discussion because they have long-term impact. Computer systems that have similar upper ontologies will have much lower integration costs. If upper ontologies are stable, people will develop trust in the systems. They are the anchors of your semantics and the foundation of your building. Change them frequently and you will quickly lose the trust of your stakeholders.

One myth about upper ontologies is that it is impossible to "test" their usefulness. This is simply not the case. Here is a simple method to test your upper ontologies:

  1. Create a simple one-page handout that describes your upper ontologies.
  2. Give each class a label and a short description.
  3. If necessary, provide short explanations of what types of subclasses and properties will be placed under each class.
  4. Then take a list of around 100 subclasses and properties and ask a group of around 10 business analysts to classify each of the subclasses and properties under one of your upper ontology classes.
  5. If the business analysts classify each subclass or property the way you designed the ontology, you have a winner.
  6. If they are not consistent, you need to go back to the drawing board and look at your ontology again.

An upper ontology is like a high-level sieve. Data elements come pouring out of requirements like little grains of sand and need to be sorted correctly by the "uppers." Even a novice who is unfamiliar with your ontology should be able to guess how the data elements are sorted.

Repeat the testing process until approximately 95 percent of all data elements are sorted into the correct subclass. If you do this, you can confidently tell your management team that the ontology is not just a personal interpretation of how elements should be classified: it is based on a repeatable testing process.
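Scoring the handout exercise is simple arithmetic, and a short script makes each round repeatable. The following Python sketch assumes the analysts' answers have been collected as dictionaries. The function names, the sample classes, and the sample answers are invented for illustration, and the 0.95 threshold is the approximate target mentioned above.

```python
TARGET_AGREEMENT = 0.95  # approximate target from the testing process above


def agreement_rate(expected, responses):
    """Fraction of (analyst, item) answers that match the designed class."""
    total = 0
    correct = 0
    for answers in responses.values():
        for item, designed_class in expected.items():
            total += 1
            if answers.get(item) == designed_class:
                correct += 1
    return correct / total if total else 0.0


def contested_items(expected, responses):
    """Items that at least one analyst placed somewhere other than intended."""
    return sorted(
        item for item, designed_class in expected.items()
        if any(answers.get(item) != designed_class
               for answers in responses.values())
    )


# Hypothetical design: item -> upper ontology class the ontologist intended.
expected = {
    "Invoice": "Document",
    "Customer": "Party",
    "Warehouse": "Location",
    "Shipment": "Event",
}

# Hypothetical answers from two analysts.
responses = {
    "ana": {"Invoice": "Document", "Customer": "Party",
            "Warehouse": "Location", "Shipment": "Event"},
    "ben": {"Invoice": "Document", "Customer": "Party",
            "Warehouse": "Location", "Shipment": "Document"},
}

rate = agreement_rate(expected, responses)
print(f"agreement: {rate:.1%}, passes: {rate >= TARGET_AGREEMENT}")
print("revisit:", contested_items(expected, responses))
```

The list of contested items is usually more useful than the single percentage, because it points at the class labels or descriptions on the handout that need rewording before the next round.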

Ambiguous Definitions
Almost every project seems to have a few wonderful "wordsmiths" who can help you write great data element definitions. These are the people who still have an old dog-eared copy of a dictionary in their office. They love to read, they love words, they speak with precision, and they are keen observers of how other people use words to discuss complex topics. These are the people you want on your team to help write your data element definitions. Here are five characteristics of great data element definitions:

  • Precise - The definition should use words that have a precise meaning. Try to avoid words that have multiple meanings or multiple word senses.
  • Concise - The definition should use the shortest description possible that is still clear.
  • Non-circular - The definition should not use the term you are trying to define in the definition itself. This is known as a circular definition.
  • Distinct - The definition should differentiate a data element from other data elements. This process is called disambiguation.
  • Unencumbered - The definition should be free of embedded rationale, functional usage, domain information, or procedural information.
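Some of these characteristics can be checked mechanically before a human reviewer ever sees the definition. The Python sketch below covers only the easy cases: a missing definition, a circular definition, and an overly long one. The 25-word limit is an assumed value, and the function names are made up for the example. Precision, distinctness, and freedom from encumbrance still need a wordsmith's judgment.

```python
import re

MAX_WORDS = 25  # assumed conciseness limit, chosen only for this example


def is_circular(term, definition):
    """True if the term being defined appears as a phrase in its definition."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, definition.lower()) is not None


def lint_definition(term, definition):
    """Return a list of mechanical problems found in one definition."""
    problems = []
    if not definition.strip():
        problems.append("missing")
    if is_circular(term, definition):
        problems.append("circular")
    if len(definition.split()) > MAX_WORDS:
        problems.append("too long")
    return problems


if __name__ == "__main__":
    print(lint_definition("Purchase Order",
                          "A purchase order issued by a buyer."))
    print(lint_definition("Purchase Order",
                          "A buyer's written request to a seller for goods."))
```

A check like this fits naturally into whatever approval workflow the ontology already has, so that obviously weak definitions are returned to their authors early.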

Once you have a great definition, make sure that every class, property, range value, and derived artifact carries its definition with it. It is disappointing to open an OWL file, an XML Schema, or a relational database only to see that none of the classes, elements, tables, or columns has a definition and you are left to guess at the meaning.
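For XML Schemas, one place a definition can travel is the standard xs:annotation/xs:documentation child of each element declaration. As a rough sketch, the Python function below lists the element declarations that lack such documentation. The function name and the sample schema are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def undocumented_elements(xsd_text):
    """Return names of xs:element declarations with no xs:documentation text."""
    root = ET.fromstring(xsd_text)
    missing = []
    for element in root.iter(f"{XS}element"):
        doc = element.find(f"{XS}annotation/{XS}documentation")
        if doc is None or not (doc.text or "").strip():
            missing.append(element.get("name"))
    return missing


# Hypothetical schema: one documented element and one undocumented element.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="orderDate" type="xs:date">
    <xs:annotation>
      <xs:documentation>Calendar date on which an order was placed.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="orderTotal" type="xs:decimal"/>
</xs:schema>"""

if __name__ == "__main__":
    print(undocumented_elements(SAMPLE_XSD))
```

The same idea applies to other derived artifacts, for example column comments in a relational database, although the mechanics differ by platform.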
