
Tip of the Day
Language: Web Development
Expertise: Beginner
Oct 1, 1996




Parsing an HTML File

I am having problems parsing an HTML tag. This entry repeats hundreds of times in this file, but with different names and addresses, etc.

I need a routine that will search for the first occurrence of a particular HTML tag (so that I can return the entire line with an input command and capture the name), return it to a variable, then continue to search for the remainder of the names (it happens that this tag is used directly in front of every name in the file and nowhere else), making sure to skip the ones that have already been returned. I've tried several things, but none have produced the desired results.

The first thing you'll need to do is read the file character by character with the Input$ function. If you just use Input #n, it treats commas and quotation marks as delimiters, so most of the punctuation will be left out. Here is a section from a program I wrote to parse my bookmark file:

   Dim sTemp As String
   Dim sChar As String

   Open "filename.ext" For Input As #1
   Do While Not EOF(1)
      sChar = Input$(1, 1)
      If Asc(sChar) = 13 Then
         ' Don't include it in string
      ElseIf Asc(sChar) = 10 Then
         ' At this point, a full line has been read
         ' and should be processed with whatever method
         ' you choose.
         sTemp = ""   ' reset for the next line
      Else
         ' Any other character is part of the current line
         sTemp = sTemp & sChar
      End If
   Loop
   Close #1

As far as parsing the string, here is a sample loop of how to do it, assuming sTemp is your full line:
   sSearch = ""   ' set this to the HTML tag that precedes each name
   For i = 1 To Len(sTemp) - Len(sSearch) + 1
      If Mid$(sTemp, i, Len(sSearch)) = sSearch Then
         ' found string...do whatever
         Exit For
      End If
   Next i
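
The same search can also be written with the built-in InStr function, which returns the position of the first match or 0 if there is none. The sketch below is only an illustration: TextAfterTag is a hypothetical helper name that does not appear in the original program, and it assumes the name is everything that follows the tag on the line.

```vb
' Hypothetical helper (not part of the original program):
' returns the text that follows sSearch in sLine, or an
' empty string if sSearch does not occur in the line.
Function TextAfterTag(sLine As String, sSearch As String) As String
   Dim lPos As Long

   ' InStr gives the 1-based position of the first match, 0 if none
   lPos = InStr(sLine, sSearch)
   If lPos > 0 Then
      ' Skip past the tag and keep the rest of the line
      TextAfterTag = Mid$(sLine, lPos + Len(sSearch))
   Else
      TextAfterTag = ""
   End If
End Function
```

One way to use it is to call TextAfterTag(sTemp, sSearch) at the point where a full line has been read, before sTemp is cleared, and store the result whenever it is not empty. Because the file is read from top to bottom, each name is seen only once, so names that were already returned are skipped automatically. The returned text may still contain other tags after the name, which would need to be trimmed separately.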
