Duplicating Data Elements
One of the first tests of a high-quality ontology is to look for duplication of data elements. The larger an ontology
grows, the higher the probability that an untrained user will enter a new data element that already exists in your
corporate ontology. Untrained users usually do this because they are not aware that it already exists in the ontology.
If your ontology management system has a search tool, then you can always train new users on how to use it.
But search tools alone are usually not enough. For example, a user might search for a term such as "Individual"
not knowing that you stored information about human beings under the class "Person." Using keywords and synonyms is
one way to ensure that search tools find and display the data elements a user needs.
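The synonym lookup described above can be sketched in a few lines of Python. This is only an illustration, not the interface of any particular ontology management system: the DataElement class, its synonyms field, and the find_elements function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """A hypothetical data element record with optional synonyms."""
    name: str
    definition: str
    synonyms: set[str] = field(default_factory=set)


def find_elements(elements, term):
    """Return elements whose name or any synonym equals term, ignoring case."""
    wanted = term.strip().lower()
    return [
        e for e in elements
        if wanted == e.name.lower() or wanted in {s.lower() for s in e.synonyms}
    ]


# Illustrative content only; a real ontology would be loaded from its store.
ontology = [
    DataElement("Person", "A human being", {"Individual", "Human"}),
    DataElement("Organization", "A legal entity", {"Company"}),
]

matches = find_elements(ontology, "Individual")
print([e.name for e in matches])  # prints ['Person']
```

A user who searches for "Individual" is led to the existing class Person instead of creating a duplicate. Exact matching is used here for brevity; a real search tool would likely add stemming or partial matching.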
The second line of defense against accidental duplication of data elements is a human-centric review process. Most
ontologies have a few expert users who are familiar with the structure and conventions used in the ontology. When a
novice user adds a new data element to an ontology, an e-mail or other notification message can be sent to experts
alerting them that new data elements are pending their review.
In large standards, this review process is usually directed to a committee of experts who have a specialized
understanding of specific parts of the ontology. In financial institutions, some members might specialize in stock
transactions and some in bonds. The key is to have a clearinghouse to assign data elements to the group that has the
most expertise. This is one of the central aspects of data governance and data stewardship that must be in place for
the ontology to gain enterprise respect and usage.
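One way to picture the clearinghouse is as a routing table from subject area to review group. The sketch below assumes that each pending data element carries a domain tag. The group names, the REVIEW_GROUPS table, and both functions are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PendingElement:
    """A newly submitted data element awaiting expert review."""
    name: str
    domain: str        # hypothetical subject-area tag, e.g. "stocks" or "bonds"
    submitted_by: str


# Hypothetical mapping from subject area to the expert group that reviews it.
REVIEW_GROUPS = {
    "stocks": "equities-stewards",
    "bonds": "fixed-income-stewards",
}
# Elements with no matching group fall back to the clearinghouse itself.
DEFAULT_GROUP = "ontology-clearinghouse"


def assign_reviewers(element):
    """Pick the review group with the most relevant expertise."""
    return REVIEW_GROUPS.get(element.domain, DEFAULT_GROUP)


def review_notice(element):
    """Build the notification text to send to the chosen group."""
    group = assign_reviewers(element)
    return (
        f"To {group}: data element '{element.name}' "
        f"submitted by {element.submitted_by} is pending review."
    )


pending = PendingElement("BondCouponRate", "bonds", "new.user")
print(review_notice(pending))
```

Sending the message (by e-mail or any other channel) is left out on purpose, since that part depends entirely on the tools an organization already has in place.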
A common oversight a novice ontologist can make is mistaking a role for an actual object. For example, a person may
play many roles in a business event. In healthcare, a person might play the role of a patient, a nurse, a physician
or an office assistant. It seems obvious at first to take a form that has the label "PatientName" and create a
property of patient-name in your ontology. Then you might add physician-name and nurse-name. The key is to realize
that these labels on the medical forms reflect the role that a "Person" plays in the business event. To create
precise rules around names, you want to remove the role from the name and create subclasses of Person for each of
these roles. Properties such as PersonGivenName can then belong to Person, and your business rules for validating
these names can be shared. You can then create a PersonRoleCode and assign each role as one of its values, such as
PersonRoleCode="Patient". Removing roles from properties is one of the best ways to keep your ontologies reasonable.
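The difference is easier to see in code. The following Python sketch models the role-code variant of the idea: a single Person with a PersonGivenName and a PersonRoleCode, plus one shared validation rule. The rule itself (letters, spaces, hyphens, and apostrophes only) is an assumption chosen for illustration, as are the enum members beyond "Patient".

```python
from dataclasses import dataclass
from enum import Enum


class PersonRoleCode(Enum):
    """Roles a Person can play in a business event (illustrative values)."""
    PATIENT = "Patient"
    NURSE = "Nurse"
    PHYSICIAN = "Physician"
    OFFICE_ASSISTANT = "Office Assistant"


def validate_given_name(name):
    """Shared rule (assumed): non-empty; letters, spaces, hyphens, apostrophes."""
    stripped = name.strip()
    return bool(stripped) and all(ch.isalpha() or ch in " -'" for ch in stripped)


@dataclass
class Person:
    person_given_name: str
    person_role_code: PersonRoleCode

    def __post_init__(self):
        # The same check runs for every role, so there is no separate
        # patient-name, nurse-name, or physician-name rule to maintain.
        if not validate_given_name(self.person_given_name):
            raise ValueError(f"invalid given name: {self.person_given_name!r}")


patient = Person("Mary-Ann", PersonRoleCode.PATIENT)
physician = Person("Luis", PersonRoleCode.PHYSICIAN)
print(patient.person_role_code.value)  # prints Patient
```

Because the role lives in PersonRoleCode and not in the property name, adding a new role means adding one code value, and the name validation does not change.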