Tip of the Day
Language: VB5,VB6,VBS
Expertise: Intermediate
Aug 16, 2001

Search multiple substrings with the RegExp object

The RegExp object in the Microsoft VBScript Regular Expression type library supports regular expression patterns containing the | (or) operator, which lets you search for multiple substrings at the same time. For example, the following piece of code lets you search for a month name in a source text:

' NOTE: this code requires a reference to the
'       Microsoft VBScript Regular Expression type library

Dim re As New RegExp
Dim ma As Match

re.Pattern = "january|february|march|april|may|june|july|august|september|" _
    & "october|november|december"

' case isn't significant
re.IgnoreCase = True
' we want all occurrences
re.Global = True

' we assume that the string to be parsed is in the sourceText variable
For Each ma In re.Execute(sourceText)
    ' FirstIndex is zero-based: 0 means the first character of sourceText
    Print "Found '" & ma.Value & "' at index " & ma.FirstIndex
Next
The code above doesn't search for whole words, though, so it also finds false matches such as the "march" inside "marches". To force the Execute method to match only whole words, enclose the list of words in parentheses and add the \b sequence on both sides, which requires each match to start and end on a word boundary:

re.Pattern = "\b(january|february|march|april|may|june|july|august|september|" _
    & "october|november|december)\b"
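To see the effect of the word boundaries, the sketch below runs both forms of the pattern against a sample string and prints the number of matches. The procedure name CompareBoundaries, the shortened word list, and the sample text are assumptions made for this illustration; they are not part of the original tip.

```vb
' NOTE: requires a reference to the
'       Microsoft VBScript Regular Expression type library

Sub CompareBoundaries()
    Dim re As New RegExp
    Dim sample As String

    ' hypothetical sample text, used only for this illustration
    sample = "The band marches in March"

    re.IgnoreCase = True
    re.Global = True

    ' without \b the pattern also matches the "march" inside "marches"
    re.Pattern = "march|april"
    Debug.Print re.Execute(sample).Count    ' prints 2

    ' with \b only the standalone word "March" is matched
    re.Pattern = "\b(march|april)\b"
    Debug.Print re.Execute(sample).Count    ' prints 1
End Sub
```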
Thanks to the Join function, it is easy to create a generic function that searches for any word in an array:

' Searches Text for all the words in the array passed as the second argument.
' Returns a two-dimensional array of Variants, where arr(0, n) is the n-th
' matched word and arr(1, n) is the zero-based index where it was found.
' Matches are stored starting at n = 1; element 0 is left unused.

' NOTE: requires a reference to the
'       Microsoft VBScript Regular Expression type library


Function InstrAllWords(ByVal Text As String, words() As String, _
    Optional IgnoreCase As Boolean) As Variant
    Dim re As New RegExp
    Dim ma As Match
    Dim maCol As MatchCollection
    Dim index As Long
    
    ' create the pattern in the form "\b(word1|word2|....|wordN)\b"
    re.Pattern = "\b(" & Join(words, "|") & ")\b"
    ' we want all occurrences
    re.Global = True
    ' case insensitive?
    re.IgnoreCase = IgnoreCase
    
    ' get the result
    Set maCol = re.Execute(Text)
    
    ' now we can DIMension the result array
    ReDim res(1, maCol.Count) As Variant
    
    ' move results into the array
    For Each ma In maCol
        index = index + 1
        res(0, index) = ma.Value
        res(1, index) = ma.FirstIndex
    Next
    
    ' return to caller
    InstrAllWords = res
End Function
Here's an example of how you can use the function above:

' fill an array with desired words
Dim words(2) As String
words(0) = "Visual": words(1) = "Basic": words(2) = "Windows"

Dim arr() As Variant
Dim i As Long
arr = InstrAllWords(txtSource.Text, words())
For i = 1 To UBound(arr, 2)
    Print "'" & arr(0, i) & "' at index " & arr(1, i)
Next
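
One caveat with a pattern built by Join: a word that contains a regular expression metacharacter, such as the dot in "VB.NET", is interpreted as pattern syntax rather than as literal text. The sketch below shows one possible way to guard against this. EscapeRegExp and EscapeDemo are hypothetical names that are not part of the type library or of the original tip, and the list of metacharacters is an assumption meant to cover the common cases.

```vb
' Returns a copy of word in which every regular expression
' metacharacter is preceded by a backslash.
' NOTE: hypothetical helper, not part of the RegExp type library
Function EscapeRegExp(ByVal word As String) As String
    ' assumed list of characters that have a special meaning in a pattern
    Const META As String = "\^$.|?*+()[]{}"
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(word)
        ch = Mid$(word, i, 1)
        ' prefix metacharacters with a backslash so they match literally
        If InStr(1, META, ch, vbBinaryCompare) > 0 Then result = result & "\"
        result = result & ch
    Next

    EscapeRegExp = result
End Function

Sub EscapeDemo()
    Debug.Print EscapeRegExp("VB.NET")    ' prints VB\.NET
    Debug.Print EscapeRegExp("Basic")     ' prints Basic
End Sub
```

Each element of the words array could be passed through such a helper before calling InstrAllWords. Keep in mind that \b only matches next to a letter, digit, or underscore, so a word that begins or ends with a punctuation character would still not be found by the whole-word pattern.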
Francesco Balena
 