devxlogo

UniqueWords – Extract all individual words in a string

UniqueWords – Extract all individual words in a string

' build a list of all the individual words in a string'' returns a collection that contains all the unique words.' The key for each item is the word itself' so you can easily use the result collection to both ' enumerate the words and test whether a given word appears' in the text. Words are inserted in the order they appear' and are stored as lowercase strings.'' Numbers are ignored, but digit characters are preserved' if they appear in the middle or at the end of a word.Function UniqueWords(ByVal Text As String) As Collection    Dim thisWord As String    Dim i As Long    Dim wordStart As Long    Dim varWord As Variant    Dim res As String    ' prepare the result collection    Set UniqueWords = New Collection        ' ignore duplicate words    On Error Resume Next        ' extract all words from the text    For i = 1 To Len(Text)        Select Case Asc(Mid$(Text, i, 1))            Case 65 To 90, 97 To 122                ' an alpha char                If wordStart = 0 Then wordStart = i            Case 48 To 57                ' include digits only if suffix of a word (as in "ABCD23")            Case Else                If wordStart Then                    ' extract the word                    thisWord = LCase$(Mid$(Text, wordStart, i - wordStart))                    ' add to the collection, but ignore if already there                    UniqueWords.Add thisWord, thisWord                    ' reset the flag/pointer                    wordStart = 0                End If        End Select    Next        ' account for the last word    If wordStart Then        ' extract the word        thisWord = LCase$(Mid$(Text, wordStart, i - wordStart))        ' add to the collection, but ignore if already there        UniqueWords.Add thisWord, thisWord    End If    End Function

devx-admin

Share the Post: