Using Smart Tags in Office 2003 (cont'd)
MOSTL and Regular Expressions
Regular expressions are a more powerful relative of wildcard searches. Most developers (and even power users) are familiar with wildcard patterns used to search for files, such as *.doc. Regular expression pattern matching calls for more sophisticated patterns. The following pattern matches US phone numbers within a text string.
   ((\(\d{3}\))|(\d{3}-))?\d{3}-\d{4}
If you are not familiar with regular expressions, you might find this bit of code a little difficult to read. Let's take a brief look at the syntax of regular expressions. Table 1 lists the most common meta-characters that are used in regular expression patterns.

Table 1: This Smart Tag Meta-character listing will make it easy to follow the exploration.
Meta-Character Meaning
Character Matching  
. (period) Matches any single character except the newline character
[] Matches any one of the enclosed characters. Ranges can be specified by using a hyphen, such as [0-9]
| The OR operator. For example, x|y will match either x or y
Position Matching  
^ Matches the beginning of string
$ Matches the end of string
Repetition Matching  
? Matches 0 or 1 instances of the preceding character
+ Matches 1 or more instances of the preceding character
\ Indicates that the next character should not be interpreted as a regular expression special character
* Matches 0 or more instances of the preceding character
() Groups a series of characters together
{n} Matches exactly n instances of the preceding character
{n,} Matches at least n instances of the preceding character
{n,m} Matches at least n but no more than m instances of the preceding character
Special Characters  
\s Matches any single white space character, including space, tab, line feed, and form feed
\w Matches any alphanumeric character, including the underscore character
\d Matches a single digit character
\f Matches a form feed
\n Matches a line feed
\r Matches a carriage return
\t Matches a tab
\v Matches a vertical tab
\b Matches a word boundary, that is, the position between a word character and a non-word character such as a space
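
As a quick check of a few Table 1 entries, here is a minimal sketch using Python's re module. Python is an assumption of this example, not something the smart tag engine uses, and regex dialects can differ in small details. The sample strings are made up for illustration.

```python
import re

# Each entry: (pattern, sample text). The samples are invented for this sketch.
examples = [
    (r"[0-9]+", "room 42, floor 7"),   # character range combined with +
    (r"colou?r", "color colour"),      # ? makes the preceding u optional
    (r"\bcat\b", "cat concat cats"),   # \b requires a word boundary on each side
    (r"^\d{3}$", "123"),               # ^ and $ pin the match to the whole string
    (r"a\.b", "a.b axb"),              # \ turns the period into a literal
]

# Map each pattern to the list of substrings it matches in its sample text.
results = {pattern: re.findall(pattern, text) for pattern, text in examples}

for pattern, text in examples:
    print(f"{pattern!r:12} on {text!r:22} -> {results[pattern]}")
```

Note that \bcat\b finds only the standalone word: in "concat" and "cats" a word character sits directly next to "cat", so there is no boundary on that side.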


To see how these meta-characters work together, break down the phone number pattern shown above. Start at the end of the pattern:
   \d{3}-\d{4}
This means that the expression must contain exactly three digits, followed by a hyphen, and then exactly four digits. Thus, 555-1234 is a match. Now look at the first part:
   ((\(\d{3}\))|(\d{3}-))?
Notice that there is a pipe (|) in the middle, indicating an or operator. On the right side of that you can see:
   (\d{3}-)
This will match exactly three digits followed by a hyphen. On the left side of the or (|) you see:
   (\(\d{3}\))
Notice that \( and \) are used to indicate that the parentheses are literal characters in the pattern. In between these is the \d{3} indicating exactly three digits. So the pattern matches phone numbers in these formats:
   (281)866-7444
   281-866-7444 
This portion of the pattern ends with a question mark, indicating that the pattern immediately prior to the question mark is optional. Because the area code pattern is grouped in parentheses and the question mark is outside the parentheses, the entire pattern inside the parentheses is optional. A number without an area code is therefore also a match:
   866-7444
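
The three formats above can be checked with a short script. This is a minimal sketch in Python (an assumption of this example; the pattern itself is copied unchanged from the article), and the helper name find_phone_numbers is hypothetical.

```python
import re

# The phone number pattern from the article, unchanged.
PHONE = re.compile(r"((\(\d{3}\))|(\d{3}-))?\d{3}-\d{4}")

def find_phone_numbers(text):
    """Return every substring of text that the pattern matches."""
    return [m.group(0) for m in PHONE.finditer(text)]

text = "Call (281)866-7444, 281-866-7444, or just 866-7444."
print(find_phone_numbers(text))
```

Because the pattern is not anchored, it also matches inside longer runs of digits. Wrapping it in word boundaries or other delimiters is one way to tighten it if that matters for your documents.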
You could get even more sophisticated and extend the pattern to match 281.866.7444, to allow a space between the parts of the phone number, and to match foreign phone numbers.
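
One possible way to loosen the pattern along these lines is sketched below in Python. It is an illustration under stated assumptions, not the article's pattern: the separator set (hyphen, period, or a single space) and the name FLEXIBLE_PHONE are choices made for this example, and foreign number formats are not covered.

```python
import re

# Assumed separators: hyphen, period, or one space. re.VERBOSE lets the
# pattern carry comments; whitespace inside [...] is still significant.
FLEXIBLE_PHONE = re.compile(r"""
    (?: \(\d{3}\)[ ]?      # area code in parentheses, optional space after it
      | \d{3}[-. ]         # or a bare area code followed by a separator
    )?                     # the whole area code remains optional
    \d{3}[-. ]\d{4}        # three digits, a separator, four digits
""", re.VERBOSE)

def is_phone_number(text):
    """True when the entire string is a phone number in an accepted format."""
    return FLEXIBLE_PHONE.fullmatch(text) is not None

for candidate in ["281.866.7444", "281 866 7444", "(281) 866-7444",
                  "866-7444", "2818667444"]:
    print(candidate, is_phone_number(candidate))
```

This version accepts mixed separators such as 281.866-7444. Requiring the same separator throughout would need a backreference, which makes the pattern harder to read.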

Here's another example. Let's say you use a bug-tracking Web site to track issues that need to be addressed in your current project. You can define a smart tag that recognizes "Bug #123456" or "bug 123456" and link directly to that bug's details on the Web site. Here is the regular expression that matches these phrases:
   [Bb]ug\s(#){0,1}\d{6}
You can create a smart tag definition to recognize this pattern and link to the Web site. It looks like Listing 2. Note that the URL in this example is a placeholder that should be replaced with the URL to your bug-tracking Web site. The resulting smart tag menu is shown in Figure 4.
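
To see the recognize-then-link idea outside of Office, here is a minimal sketch in Python. It is not the smart tag definition itself. It uses a variant of the bug pattern with a capturing group around the six digits, and both the URL template and the function name bug_links are hypothetical placeholders.

```python
import re

# Variant of the bug pattern; the parentheses capture the six-digit number.
BUG_PATTERN = re.compile(r"[Bb]ug\s#?(\d{6})")

# Hypothetical placeholder; replace with your bug-tracking site's URL.
BUG_URL_TEMPLATE = "http://bugs.example.com/view?id={bug_id}"

def bug_links(text):
    """Return (recognized phrase, link) pairs for each bug reference in text."""
    return [
        (m.group(0), BUG_URL_TEMPLATE.format(bug_id=m.group(1)))
        for m in BUG_PATTERN.finditer(text)
    ]

for phrase, url in bug_links("See Bug #123456 and bug 654321."):
    print(phrase, "->", url)
```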

 
Figure 4: The result of our example is this customized IP Address smart tag.
For a great in-depth look at regular expressions, look at the May/June 2003 issue of CoDe for an article by Jim Duffey called "Getting Started With Regular Expressions." Also, you can read Markus Egger's white paper on MSDN, entitled "Regular Expression Support in Microsoft Office System Smart Tags," posted in August 2003.

Developing Custom Smart Tags in .NET
Since the release of Office XP, developers have been able to build custom smart tags. You can get the Office XP Smart Tag SDK Version 1.1 at: http://www.microsoft.com/downloads .

The Office XP SDK contains the documentation and samples for creating smart tags in Office XP and Office 2003. There is also a new type library (Microsoft Smart Tags 2.0 Type Library), which adds new functionality for the new smart tag features in Office 2003. This type library will be available on MSDN when Office 2003 is released.

 
Figure 5: A little bit of code leads to the custom bug tracking smart tag.
To implement smart tags, you'll typically create two classes: a Recognizer class and an Action class. (It is possible to create smart tags for special purposes that only have a Recognizer or an Action class, but that is beyond the scope of this article.) You can place both classes in the same DLL or create them in separate DLLs. In the example implemented later in this article, they will be in the same DLL.
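
The division of labor between the two classes can be sketched in a few lines. The following is a conceptual illustration in plain Python, not the COM interfaces that Office actually calls. The class names, method names, and URL are all hypothetical.

```python
import re

class BugRecognizer:
    """Recognizer role: scans text and reports where terms occur."""
    pattern = re.compile(r"[Bb]ug\s#?(\d{6})")

    def recognize(self, text):
        # Report (start offset, length, captured bug number) for each match.
        return [(m.start(), len(m.group(0)), m.group(1))
                for m in self.pattern.finditer(text)]

class BugAction:
    """Action role: does something with a term that was recognized."""
    url_template = "http://bugs.example.com/view?id={}"  # hypothetical URL

    def invoke(self, bug_id):
        # A real action might open a browser; this sketch only builds the link.
        return self.url_template.format(bug_id)

text = "Fixed in bug 123456."
recognizer, action = BugRecognizer(), BugAction()
for start, length, bug_id in recognizer.recognize(text):
    print(text[start:start + length], "->", action.invoke(bug_id))
```

Keeping the two roles separate means a recognizer can be paired with different actions, which is also why the classes may live in separate DLLs.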

The Recognizer class must implement the ISmartTagRecognizer interface. Table 2 lists the properties and methods in the ISmartTagRecognizer interface.

Table 2: ISmartTagRecognizer properties and methods
Property/Method Name Type Description
ProgID String The programmatic identifier of the recognizer interface
Name String A short title reflecting what the recognizer does
Desc String A longer description of what the recognizer does
SmartTagCount Integer The number of smart tag types that the recognizer supports
SmartTagDownloadURL String The URL embedded in documents to let users download new or updated actions
SmartTagName String The unique identifiers of smart tag types that the recognizer supports
Recognize N/A The method that recognizes the terms


If you want to take advantage of the new smart tag features in Office 2003, you must also implement the ISmartTagRecognizer2 interface. One of the new features is a new Recognize method (Recognize2) that simplifies identifying recognized text and allows you to identify the application being used. Table 3 shows the properties and methods for the ISmartTagRecognizer2 interface.

Table 3: ISmartTagRecognizer2 properties and methods
Property/Method Name Type Description
SmartTagInitialize N/A The initialize method that is fired before any other event
Recognize2 N/A A new Recognize method that simplifies identifying recognized text
PropertyPage String Displays a custom dialog box for configuring a smart tag through the smart tag's dialog box. This property tells the calling application that the smart tag supports customization
DisplayPropertyPage N/A Called when the user requests configuring the smart tag


The Action class must implement the ISmartTagAction interface. Table 4 displays the properties and methods for this interface.

Table 4: ISmartTagAction properties and methods
Property/Method Name Type Description
ProgID String The programmatic identifier of the DLL action interface
Name String A short title reflecting what the action does
Desc String A longer description of what the action does
SmartTagCount Integer The number of smart tag types that the DLL supports
SmartTagName String The unique string identifiers of smart tag types that the DLL supports
SmartTagCaption String A caption for a smart tag type for use in on-object menus
VerbCount Integer The total number of verbs supported by the DLL for a given smart tag type
VerbID Integer A unique integer identifier for a verb
VerbCaptionFromID String A caption for a smart tag type action for use in on-object menus
VerbNameFromID String A language-agnostic identifier string for a verb
InvokeVerb N/A The method that is called to invoke a verb
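
The verb-related members in Table 4 suggest a simple flow: the host asks how many verbs exist, looks up each verb's ID and caption to build the menu, and later invokes the verb the user picks. The sketch below models that flow in plain Python with simplified, hypothetical signatures. The real interface members take additional arguments that are omitted here, and the 1-based index, verb names, captions, and URL are assumptions of this example.

```python
class BugVerbs:
    """Simplified stand-in for the verb members of an action class."""

    # Hypothetical verb table: ID -> (language-agnostic name, menu caption).
    _verbs = {
        1: ("viewBug", "View bug details"),
        2: ("copyLink", "Copy link to bug"),
    }
    _url = "http://bugs.example.com/view?id={}"  # hypothetical URL

    def verb_count(self):
        return len(self._verbs)

    def verb_id(self, index):
        # Assumption of this sketch: index is the 1-based menu position.
        return sorted(self._verbs)[index - 1]

    def verb_name_from_id(self, verb_id):
        return self._verbs[verb_id][0]

    def verb_caption_from_id(self, verb_id):
        return self._verbs[verb_id][1]

    def invoke_verb(self, verb_id, bug_id):
        # Return a description of the action instead of performing it.
        link = self._url.format(bug_id)
        return ("open " if verb_id == 1 else "copy ") + link

# How a host might build the menu and then invoke the first entry.
action = BugVerbs()
menu = [action.verb_caption_from_id(action.verb_id(i))
        for i in range(1, action.verb_count() + 1)]
print(menu)
print(action.invoke_verb(action.verb_id(1), "123456"))
```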


If you want to take advantage of the new smart tag features in Office 2003, you must also implement the ISmartTagAction2 interface. This allows you to create cascading menus, use dynamic captions, and so forth. Table 5 displays the properties and methods for this interface.

Table 5: ISmartTagAction2 properties and methods
Property/Method Name Type Description
SmartTagInitialize N/A Fires before any other event
ShowSmartTagIndicator Boolean Specifies if the smart tag underline indicator is displayed
IsCaptionDynamic Boolean Indicates if the caption for the action in the smart tag menu is dynamic. If so, the SmartTagInitialize method is called every time the menu is displayed.
VerbCaptionFromID2 String The caption for an action menu item. Allows for cascading menus.
InvokeVerb2 N/A Similar to the InvokeVerb method; however, this new method passes in a locale ID (LCID) for locale identification.

