If you are not familiar with regular expressions, you might find this bit of code a little difficult to read. Let's take a brief look at the syntax of regular expressions.
Table 1 lists the most common meta-characters that are used in regular expression patterns.
Table 1: This Smart Tag Meta-character listing will make it easy to follow the exploration.
| Meta-Character | Meaning |
| --- | --- |
| Character Matching | |
| . (period) | Matches any single character except the newline character |
| [] | Matches any one of the enclosed characters. Ranges can be specified by using a hyphen, such as [0-9] |
| \| (pipe) | The OR operator. For example, placing it between x and y matches either x or y |
| Position Matching | |
| ^ | Matches the beginning of the string |
| $ | Matches the end of the string |
| Repetition Matching | |
| ? | Matches 0 or 1 instances of the preceding character or group |
| + | Matches 1 or more instances of the preceding character or group |
| \ | Indicates that the next character should not be interpreted as a regular expression special character |
| * | Matches 0 or more instances of the preceding character or group |
| () | Groups a series of characters together |
| {n} | Matches exactly n instances of the preceding character or group |
| {n,} | Matches at least n instances of the preceding character or group |
| {n,m} | Matches at least n but no more than m instances of the preceding character or group |
| Special Characters | |
| \s | Matches any single white space character, including space, tab, line feed, and form feed |
| \w | Matches any alphanumeric character, including the underscore character |
| \d | Matches a single digit character |
| \f | Matches a form feed |
| \n | Matches a line feed |
| \r | Matches a carriage return |
| \t | Matches a tab |
| \v | Matches a vertical tab |
| \b | Matches a word boundary, such as the position between a word and a space |
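To get a feel for these meta-characters, it can help to try a few of them in isolation. The following is a minimal sketch that assumes Python's standard `re` module rather than the engine used by smart tags; the basic meta-characters listed in Table 1 behave the same way there. The sample patterns, the sample text, and the `check` helper are made up for illustration.

```python
import re

# Each entry: (pattern, text, whether the pattern should be found in the text)
examples = [
    (r"b.g", "bug", True),          # . matches any single character
    (r"[0-9]", "room 7", True),     # [] matches one of the enclosed characters
    (r"cat|dog", "hot dog", True),  # | matches either alternative
    (r"^Bug", "Bug 12", True),      # ^ anchors at the start of the string
    (r"12$", "Bug 12", True),       # $ anchors at the end of the string
    (r"colou?r", "color", True),    # ? makes the preceding u optional
    (r"\d{3}", "12", False),        # {3} needs exactly three digits in a row
    (r"\(\d+\)", "(281)", True),    # \ makes the parentheses literal
    (r"\bbug\b", "debug", False),   # \b requires a word boundary before the b
]


def check(pattern: str, text: str) -> bool:
    """Return True when the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


for pattern, text, expected in examples:
    assert check(pattern, text) == expected
    print(f"{pattern!r:12} on {text!r:10} -> {check(pattern, text)}")
```

Each row pairs one meta-character with a string that either should or should not match, so changing a single pattern and rerunning shows the effect of that meta-character alone.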
Examine this by breaking down the phone number pattern shown above. Start at the end of the pattern:
\d{3}-\d{4}
This means that the expression must contain exactly three digits, followed by a hyphen, and then exactly four digits. Thus, 555-1234 is a match. Now look at the first part:
((\(\d{3}\))|(\d{3}-))?
Notice the pipe (|) in the middle, which is the OR operator. On the right side of the pipe you can see:
(\d{3}-)
This will match exactly three digits followed by a hyphen. On the left side of the OR operator (|) you see:
(\(\d{3}\))
Notice that \( and \) are used to indicate that the parentheses are part of the pattern. In between these is \d{3}, indicating exactly three digits. So the pattern matches phone numbers in these formats:
(281)866-7444
281-866-7444
This portion of the pattern ends with a question mark, indicating that the pattern immediately prior to the question mark is optional. Because the area code pattern is grouped in parentheses and the question mark is outside the parentheses, the entire pattern inside the parentheses is optional. This means that the following is also a match:
866-7444
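The breakdown above can be checked with a few lines of code. This is a minimal sketch, assuming Python's `re` module and assuming that the full pattern is the two parts discussed above joined together. The `is_phone` helper is hypothetical, and it uses `fullmatch` so that the whole string must be a phone number, which is an assumption made for testing rather than how a smart tag scans document text.

```python
import re

# Optional area code, in parentheses or followed by a hyphen,
# then three digits, a hyphen, and four digits.
PHONE = re.compile(r"((\(\d{3}\))|(\d{3}-))?\d{3}-\d{4}")


def is_phone(text: str) -> bool:
    """True when the whole string is a phone number in one of the formats."""
    return PHONE.fullmatch(text) is not None


for sample in ["(281)866-7444", "281-866-7444", "866-7444", "281.866.7444"]:
    print(sample, is_phone(sample))
```

The first three samples match and the dotted form does not, because the pattern only accepts hyphens as separators.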
You could make the pattern more sophisticated so that it also matches 281.866.7444, allows a space between the various parts of the phone number, and matches foreign phone numbers.
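One possible way to loosen the separators is sketched below, again assuming Python's `re` module. The pattern and the `is_flexible_phone` helper are illustrative assumptions, not part of the original smart tag definition, and the sketch makes no attempt to cover foreign phone numbers.

```python
import re

# Area code: "(281)" with an optional space, or "281" followed by a
# hyphen, period, or space. The same three separators are allowed
# between the last two groups of digits.
FLEXIBLE_PHONE = re.compile(r"(\(\d{3}\) ?|\d{3}[-. ])?\d{3}[-. ]\d{4}")


def is_flexible_phone(text: str) -> bool:
    """True when the whole string matches the loosened phone pattern."""
    return FLEXIBLE_PHONE.fullmatch(text) is not None


for sample in ["281.866.7444", "(281) 866-7444", "281 866 7444", "2818667444"]:
    print(sample, is_flexible_phone(sample))
```

The character class [-. ] is what admits the extra separators; a run of ten digits with no separators is still rejected.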
Here's another example. Let's say you use a bug-tracking Web site to track issues that need to be addressed in your current project. You can define a smart tag that recognizes "Bug #123456" or "bug 123456" and link directly to that bug's details on the Web site. Here is the regular expression that matches these phrases:
[Bb]ug\s(#){0,1}\d{6}
You can create a smart tag definition to recognize this pattern and link to the Web site. The definition looks like Listing 2. Note that the URL in this example is a placeholder that should be replaced with the URL to your bug-tracking Web site. The resulting smart tag menu is shown in Figure 4.
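The recognize-and-link idea can be tried outside of Office as well. The following is a minimal sketch assuming Python's `re` module; the `BUG_URL` address and the `bug_links` helper are hypothetical placeholders. The sketch writes the first character class as [Bb] (inside brackets a pipe would be matched literally), shortens (#){0,1} to the equivalent #?, and adds a capturing group around the digits so that the bug number can be placed in the URL.

```python
import re

# "Bug" or "bug", one white space character, an optional "#", six digits.
BUG = re.compile(r"[Bb]ug\s#?(\d{6})")

# Hypothetical placeholder; replace with your bug-tracking site's URL.
BUG_URL = "http://example.com/bugs?id={bug_id}"


def bug_links(text: str) -> list[tuple[str, str]]:
    """Return (recognized phrase, link) pairs for every bug reference."""
    return [
        (match.group(0), BUG_URL.format(bug_id=match.group(1)))
        for match in BUG.finditer(text)
    ]


for phrase, link in bug_links("See Bug #123456 and bug 654321, but not bug 42."):
    print(phrase, "->", link)
```

Only the two six-digit references are reported; "bug 42" is skipped because \d{6} requires six digits.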
Figure 4: The result of our example is this customized IP Address smart tag.
For an in-depth look at regular expressions, see the May/June 2003 issue of CoDe for an article by Jim Duffey called "Getting Started With Regular Expressions." You can also read Markus Egger's white paper on MSDN, entitled "Regular Expression Support in Microsoft Office System Smart Tags," posted in August 2003.
Developing Custom Smart Tags in .NET
Since the release of Office XP, developers have been able to build custom smart tags. You can get the Office XP Smart Tag SDK Version 1.1 at http://www.microsoft.com/downloads.
The Office XP SDK contains the documentation and samples for creating smart tags in Office XP and Office 2003. There is also a new type library (Microsoft Smart Tags 2.0 Type Library), which adds functionality for the new smart tag features in Office 2003. This type library will be available on MSDN when Office 2003 is released.
Figure 5: A little bit of code leads to the custom bug tracking smart tag.
To implement smart tags, you'll typically create two classes: a Recognizer class and an Action class. (It is possible to create smart tags for special purposes that only have a Recognizer or an Action class, but that is beyond the scope of this article.) You can place both classes in the same DLL or create them in separate DLLs. In the example implemented later in this article, they will be in the same DLL.
The Recognizer class must implement the ISmartTagRecognizer interface.
Table 2 lists the properties and methods in the ISmartTagRecognizer interface.
Table 2: ISmartTagRecognizer properties and methods
| Property/Method Name | Type | Description |
| --- | --- | --- |
| ProgID | String | The programmatic identifier of the recognizer interface |
| Name | String | A short title reflecting what the recognizer does |
| Desc | String | A longer description of what the recognizer does |
| SmartTagCount | Integer | The number of smart tag types that the recognizer supports |
| SmartTagDownloadURL | String | The URL embedded in documents to let users download new or updated actions |
| SmartTagName | String | The unique identifiers of smart tag types that the recognizer supports |
| Recognize | N/A | The method that recognizes the terms |
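To make the role of these members more concrete, here is a plain-Python analogue of a recognizer. It is only a sketch of the shape of the class: it borrows the member names from Table 2 but is not the COM interface, it omits parameters such as locale IDs, and the `FakeRecognizerSite` class, its `commit` method, the ProgID, and the smart tag type name are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FakeRecognizerSite:
    """Hypothetical stand-in for the host object that collects recognized terms."""
    hits: list = field(default_factory=list)

    def commit(self, smart_tag_name: str, start: int, length: int) -> None:
        self.hits.append((smart_tag_name, start, length))


class BugRecognizer:
    """Plain-Python analogue of a recognizer; member names follow Table 2."""

    _PATTERN = re.compile(r"[Bb]ug\s#?\d{6}")
    _TAG_NAMES = ["urn:example-com:bugs#bug"]  # hypothetical smart tag type name

    ProgID = "Example.BugRecognizer"           # hypothetical identifier
    Name = "Bug recognizer"
    Desc = "Recognizes references to bug numbers such as 'Bug #123456'"

    @property
    def SmartTagCount(self) -> int:
        return len(self._TAG_NAMES)

    def SmartTagName(self, smart_tag_id: int) -> str:
        # Smart tag IDs are assumed to be 1-based in this sketch.
        return self._TAG_NAMES[smart_tag_id - 1]

    def SmartTagDownloadURL(self, smart_tag_id: int) -> str:
        return ""  # this sketch offers no download location

    def Recognize(self, text: str, site: FakeRecognizerSite) -> None:
        # Report every match to the site, using 0-based Python offsets.
        for match in self._PATTERN.finditer(text):
            site.commit(self.SmartTagName(1), match.start(), len(match.group(0)))


site = FakeRecognizerSite()
BugRecognizer().Recognize("Fix Bug #123456 today", site)
print(site.hits)
```

The descriptive members only report what the recognizer is; the actual work happens in Recognize, which scans the text it is handed and reports the position of each term it recognizes.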
If you want to take advantage of the new smart tag features in Office 2003, you must also implement the ISmartTagRecognizer2 interface. One of the new features is a new Recognize method (Recognize2) that simplifies identifying recognized text and allows you to identify the application being used.
Table 3 shows the properties and methods for the ISmartTagRecognizer2 interface.
Table 3: ISmartTagRecognizer2 properties and methods
| Property/Method Name | Type | Description |
| --- | --- | --- |
| SmartTagInitialize | N/A | The initialize method that is fired before any other event |
| Recognize2 | N/A | A new Recognize method that simplifies identifying recognized text |
| PropertyPage | String | Displays a custom dialog box for configuring a smart tag through the smart tag's dialog box. This property tells the calling application that the smart tag supports customization |
| DisplayPropertyPage | N/A | Called when the user requests configuring the smart tag |
The Action class must implement the ISmartTagAction interface.
Table 4 displays the properties and methods for this interface.
Table 4: ISmartTagAction properties and methods
| Property/Method Name | Type | Description |
| --- | --- | --- |
| ProgID | String | The programmatic identifier of the DLL action interface |
| Name | String | A short title reflecting what the action does |
| Desc | String | A longer description of what the action does |
| SmartTagCount | Integer | The number of smart tag types that the DLL supports |
| SmartTagName | String | The unique string identifiers of smart tag types that the DLL supports |
| SmartTagCaption | String | A caption for a smart tag type for use in on-object menus |
| VerbCount | Integer | The total number of verbs supported by the DLL for a given smart tag type |
| VerbID | Integer | A unique integer identifier for a verb |
| VerbCaptionFromID | String | A caption for a smart tag type action for use in on-object menus |
| VerbNameFromID | String | A language-agnostic identifier string for a verb |
| InvokeVerb | N/A | The method that is called to invoke a verb |
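The way these members fit together (smart tag types own verbs, verbs have IDs, names, and captions, and InvokeVerb carries out the chosen verb) can be sketched in plain Python. As before, this is an analogue that borrows the member names from Table 4 and is not the COM interface: it omits parameters such as locale IDs and application names, it returns the URL instead of opening it, and the ProgID, smart tag type name, verb, and URL are hypothetical.

```python
import re


class BugAction:
    """Plain-Python analogue of an action class; member names follow Table 4."""

    _TAG_NAME = "urn:example-com:bugs#bug"        # hypothetical smart tag type
    _URL = "http://example.com/bugs?id={bug_id}"  # hypothetical placeholder URL
    # Verb ID -> (language-agnostic name, menu caption)
    _VERBS = {1: ("viewBug", "View bug details")}

    ProgID = "Example.BugAction"                  # hypothetical identifier
    Name = "Bug actions"
    Desc = "Opens the details page for a recognized bug number"
    SmartTagCount = 1

    def SmartTagName(self, smart_tag_id: int) -> str:
        return self._TAG_NAME

    def SmartTagCaption(self, smart_tag_id: int) -> str:
        return "Bug tracking"

    def VerbCount(self, smart_tag_name: str) -> int:
        return len(self._VERBS) if smart_tag_name == self._TAG_NAME else 0

    def VerbID(self, smart_tag_name: str, verb_index: int) -> int:
        # Verb indexes are assumed to be 1-based in this sketch.
        return sorted(self._VERBS)[verb_index - 1]

    def VerbNameFromID(self, verb_id: int) -> str:
        return self._VERBS[verb_id][0]

    def VerbCaptionFromID(self, verb_id: int) -> str:
        return self._VERBS[verb_id][1]

    def InvokeVerb(self, verb_id: int, text: str) -> str:
        """Return the URL that the verb would open for the recognized text."""
        if verb_id not in self._VERBS:
            raise ValueError(f"unknown verb {verb_id}")
        match = re.search(r"\d{6}", text)
        if match is None:
            raise ValueError("recognized text contains no six-digit bug number")
        return self._URL.format(bug_id=match.group(0))


action = BugAction()
verb_id = action.VerbID(action.SmartTagName(1), 1)
print(action.VerbCaptionFromID(verb_id), "->", action.InvokeVerb(verb_id, "Bug #123456"))
```

The host would build the smart tag menu by asking for the verb count, then each verb's ID and caption, and would call InvokeVerb only when the user picks a menu item.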
If you want to take advantage of the new smart tag features in Office 2003, you must also implement the ISmartTagAction2 interface. This allows you to create cascading menus, use dynamic captions, and so forth.
Table 5 displays the properties and methods for this interface.
Table 5: ISmartTagAction2 properties and methods
| Property/Method Name | Type | Description |
| --- | --- | --- |
| SmartTagInitialize | N/A | Fires before any other event |
| ShowSmartTagIndicator | Boolean | Specifies if the smart tag underline indicator is displayed |
| IsCaptionDynamic | Boolean | Indicates if the caption for the action in the smart tag menu is dynamic. If so, the SmartTagInitialize method is called every time the menu is displayed. |
| VerbCaptionFromID2 | String | The caption for an action menu item. Allows for cascading menus. |
| InvokeVerb2 | N/A | Similar to the InvokeVerb method; however, this new method passes in a locale ID (LCID) for locale identification. |