My previous article showed you how to improve your applications' sophistication by using the power of ontologies to tap into conceptual information about a domain. This article discusses how to supplement applications by providing them with a rudimentary understanding of English vocabulary.
Software applications have many uses for recognizing vocabulary, ranging from spell-checking to providing alternative suggestions for search criteria, the scenario explored in this article. You will see how to create a storefront application that provides free-form search access into the store's inventory. Because the goal of any store is to make money, this store leverages a lexical understanding of the user's search to return the greatest number of relevant results. This storefront will, for example, try to provide relevant results for the intended search even when users inadvertently mistype their search criteria. Figure 1 shows a simple example.
|Figure 1. Storefront Responding to a Misspelled Word: Even though the user misspelled "pants" in the search box, the storefront was still able to present meaningful results.|
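The behavior in Figure 1 can be approximated, in a very reduced form, by matching the typed term against a list of known inventory words. The sketch below is only an illustration and not the storefront's actual implementation: it uses Python's standard `difflib` module rather than WordNet, and the `INVENTORY_TERMS` list, the `suggest_term` function, and the misspelling `"pnats"` are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical inventory vocabulary; a real store would derive it from its catalog.
INVENTORY_TERMS = ["pants", "shirts", "shoes", "socks", "jackets"]


def suggest_term(query, vocabulary=INVENTORY_TERMS, cutoff=0.6):
    """Return the vocabulary term closest to the query, or None if nothing is close."""
    normalized = query.strip().lower()
    if normalized in vocabulary:
        # Exact hit: no correction needed.
        return normalized
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1
    # and drops every candidate that scores below the cutoff.
    matches = get_close_matches(normalized, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(suggest_term("pnats"))
    print(suggest_term("xyzzy"))
```

String similarity alone only handles typos. It knows nothing about meaning, which is the gap a lexicon is meant to fill.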
One way to create this type of behavior is to use a lexicon such as WordNet to search and navigate words and their meanings. WordNet is an English-language lexicon (you can think of it as a dictionary) developed at Princeton University and funded largely by government grants. Words in WordNet that share a common meaning (synonyms) are organized into groups called synsets. Additionally, WordNet defines relationships between synsets to capture the semantic relationships between words. One example of a relationship in WordNet is that of antonyms, words that have opposite meanings. Later in this article you will see other types of relationships supported in WordNet and a couple of techniques for getting programmatic access to WordNet's relationship information.
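To make the idea of synsets and relationships concrete, here is a minimal in-memory sketch. It follows the description above by attaching the antonym relationship to synsets. The data, the synset identifiers, and the function names are invented for illustration and are not taken from the WordNet database or any WordNet API.

```python
# Toy data, invented for illustration; not taken from the WordNet database.
SYNSETS = {
    "trousers.n": {"pants", "trousers", "slacks"},
    "large.a": {"large", "big"},
    "small.a": {"small", "little"},
}

# Antonym relationship between synsets, stored in both directions.
ANTONYMS = {"large.a": "small.a", "small.a": "large.a"}


def synsets_for(word):
    """Return the ids of every synset that contains the word."""
    return sorted(sid for sid, words in SYNSETS.items() if word in words)


def synonyms(word):
    """Return the other words that share a synset with the word."""
    result = set()
    for sid in synsets_for(word):
        result |= SYNSETS[sid]
    result.discard(word)
    return sorted(result)


def antonyms(word):
    """Return the words in synsets related to the word's synsets as antonyms."""
    result = set()
    for sid in synsets_for(word):
        opposite = ANTONYMS.get(sid)
        if opposite is not None:
            result |= SYNSETS[opposite]
    return sorted(result)


if __name__ == "__main__":
    print(synonyms("pants"))
    print(antonyms("large"))
```

A search for "pants" could then be widened to include "trousers" and "slacks" before the inventory is queried.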
|Figure 2. OWL Representation of WordNet: This figure illustrates the concepts in WordNet and their relationships to each other (Source: Wordnet in RDFS and OWL)|
In a way, you can think of WordNet as a lexical ontology, a conceptualization of the entities and relationships of things in the domain of language. In fact, the W3C has provided a representation of WordNet in the Web Ontology Language (OWL). The ontological perspective on WordNet is interesting for many reasons, not least because it helps describe the structure of WordNet and simplifies visualization of its concepts and their relationships. Figure 2 illustrates these ideas from the W3C document that describes the OWL representation of WordNet.
As Figure 2 shows, each lexical expression of a concept will likely map to different words in different languages. You can think of the lexical form of a word as the series of letters that represents the word in a particular language. Each word might have several possible meanings; for example, "pants" might mean an article of clothing, or it could refer to the heavy breathing of a dog. The WordSense concept expresses such multiple meanings. Lastly, related WordSenses are grouped into synsets, as defined earlier.
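The three concepts just described (a word with its lexical form, a word sense, and a synset) can be sketched as a small object model. This is one possible reading of Figure 2, not the W3C schema itself: the class layout, the `add_sense` and `synonyms_of` helpers, and the gloss strings are assumptions made for illustration.

```python
from dataclasses import dataclass, field


# eq=False keeps identity-based comparison, which avoids recursive equality
# checks across the circular references between senses and synsets.
@dataclass(eq=False)
class Synset:
    gloss: str  # short description of the shared meaning (invented here)
    senses: list = field(default_factory=list)


@dataclass(eq=False)
class WordSense:
    lexical_form: str  # the letters of the word this sense belongs to
    synset: Synset     # the meaning this sense expresses


@dataclass(eq=False)
class Word:
    lexical_form: str
    senses: list = field(default_factory=list)


def add_sense(word, synset):
    """Link a word to a synset through a new WordSense."""
    sense = WordSense(word.lexical_form, synset)
    word.senses.append(sense)
    synset.senses.append(sense)
    return sense


def synonyms_of(word):
    """Collect the lexical forms that share any synset with the word."""
    forms = {s.lexical_form for sense in word.senses for s in sense.synset.senses}
    forms.discard(word.lexical_form)
    return sorted(forms)


# "pants" has two senses, only one of which it shares with "trousers".
clothing = Synset("a garment covering each leg separately")
breathing = Synset("short, heavy breaths")
pants = Word("pants")
trousers = Word("trousers")
add_sense(pants, clothing)
add_sense(pants, breathing)
add_sense(trousers, clothing)

if __name__ == "__main__":
    print([sense.synset.gloss for sense in pants.senses])
    print(synonyms_of(pants))
```

Keeping the sense as its own object, rather than linking words directly to synsets, is what lets one lexical form carry several unrelated meanings.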