Capturing Search Engine Results

Question:
I would like to have a page that allows a user to conduct a search. I would like the action of the search to be an ASP page that in turn queries yahoo.com, altavista.digital.com, and excite.com, and returns the results back to a string variable in my ASP page.

I would then like to parse this string, pull out the relevant links from each search engine, and create a new HTML stream that combines the links from all of the search engines, indicating the original source after each link.

What is the best way to capture content from another site for parsing and reformatting?

Thanks!

Answer:
The Microsoft Internet Transfer control makes it a snap to capture HTML from any Web site. The control has a number of properties and methods that are useful, but one of the most important is the OpenURL method. Using a single call to this method, you can retrieve HTML into a variable. You can then parse this variable in any way that you like.

The following excerpt is from a Visual Basic application that uses an Internet Transfer Control named Inet1. This procedure will call the OpenURL method and will save the HTML from the page in a variable. The URL that is to be retrieved is defined in a textbox on the form. The textbox is named txtURL. The HTML is parsed and each hyperlink that is found in the HTML is displayed in a list box on the form. The list box is named List1.

Private Sub cmdGetLinks_Click()
    On Error GoTo ErrorHandler

    Const DOUBLE_QUOTE = 34          ' ASCII code of the double-quote character
    Const REQUEST_TIMED_OUT = 35761

    Dim strTemp, strHTML, strTheURL, strNextChar
    Dim strNextHref, intLookForNextHREFHere, intWhereIsTheEQ
    Dim intLookHere, intNextDelimiter, intSpace, intClose

    List1.Clear
    List1.AddItem "Finding Links..."
    cmdGetLinks.Enabled = False

    ' Retrieve the page; strTemp keeps the original text
    strTemp = Inet1.OpenURL(txtURL.Text)

    List1.Clear
    ' Search a lowercase copy so HREF and href both match,
    ' but read each URL from strTemp so its case is preserved
    strHTML = LCase(strTemp)
    cmdGetLinks.Enabled = True
    intLookForNextHREFHere = 1

    Do
        strNextHref = InStr(intLookForNextHREFHere, strHTML, "href")
        If strNextHref < 1 Then Exit Do

        intWhereIsTheEQ = InStr(strNextHref, strHTML, "=")
        If intWhereIsTheEQ < 1 Then Exit Do
        intLookHere = intWhereIsTheEQ + 1
        strNextChar = Mid(strHTML, intLookHere, 1)

        If strNextChar = Chr(DOUBLE_QUOTE) Then
            ' The delimiter is a double-quote, look for the next one
            intNextDelimiter = InStr(intLookHere + 1, strHTML, Chr(DOUBLE_QUOTE))
            If intNextDelimiter < 1 Then Exit Do
            strTheURL = Trim(Mid(strTemp, intLookHere + 1, intNextDelimiter - intLookHere - 1))
            List1.AddItem strTheURL
        Else
            ' Double-quote character not used, look for the start of the URL
            Do Until Len(Trim(strNextChar)) > 0 Or intLookHere > Len(strHTML)
                intLookHere = intLookHere + 1
                strNextChar = Mid(strHTML, intLookHere, 1)
            Loop

            ' Now look for the end of the URL: the next space character
            ' or the next HTML close-tag character (>), whichever is nearest
            intSpace = InStr(intLookHere, strHTML, " ")
            intClose = InStr(intLookHere, strHTML, ">")
            If intSpace > 0 And (intClose = 0 Or intSpace < intClose) Then
                intNextDelimiter = intSpace
            Else
                intNextDelimiter = intClose
            End If
            If intNextDelimiter < 1 Then Exit Do

            strTheURL = Trim(Mid(strTemp, intLookHere, intNextDelimiter - intLookHere))
            List1.AddItem strTheURL
        End If

        ' Find the next URL
        intLookForNextHREFHere = intNextDelimiter
    Loop

    Exit Sub

ErrorHandler:
    ' Re-enable the button in case the error occurred during the download
    cmdGetLinks.Enabled = True
    Select Case Err.Number
        Case REQUEST_TIMED_OUT
            MsgBox Err.Number & " - " & Err.Description
        Case Else
            ' Other errors are ignored
    End Select
End Sub
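
The question also asks for each link to be labeled with the search engine it came from. One way to approach that is sketched below. This is only an illustration under stated assumptions: GetLabeledLinks and cmdGetLabeledLinks are hypothetical names that do not appear in the original answer, the helper handles only double-quoted href values, and the engine name "Yahoo" is a placeholder label for whatever URL is typed into txtURL.

```vb
' Hypothetical helper: returns a Collection of "url (source)" strings,
' one for every double-quoted href value found in strHTML.
Public Function GetLabeledLinks(ByVal strHTML As String, _
                                ByVal strSource As String) As Collection
    Dim colLinks As New Collection
    Dim strLower As String
    Dim lngPos As Long, lngStart As Long, lngEnd As Long

    ' Search a lowercase copy so HREF and href both match;
    ' the URL itself is read from the original text.
    strLower = LCase$(strHTML)

    lngPos = InStr(1, strLower, "href=""")
    Do While lngPos > 0
        lngStart = lngPos + Len("href=""")       ' first character of the URL
        lngEnd = InStr(lngStart, strHTML, """")  ' closing double-quote
        If lngEnd = 0 Then Exit Do               ' unterminated attribute, stop

        If lngEnd > lngStart Then
            colLinks.Add Mid$(strHTML, lngStart, lngEnd - lngStart) & _
                         " (" & strSource & ")"
        End If

        lngPos = InStr(lngEnd + 1, strLower, "href=""")
    Loop

    Set GetLabeledLinks = colLinks
End Function

' Hypothetical button handler showing one way to call the helper.
Private Sub cmdGetLabeledLinks_Click()
    Dim colAll As Collection
    Dim varItem As Variant

    ' "Yahoo" is a placeholder label for the page named in txtURL
    Set colAll = GetLabeledLinks(Inet1.OpenURL(txtURL.Text), "Yahoo")

    List1.Clear
    For Each varItem In colAll
        List1.AddItem varItem
    Next
End Sub
```

Calling the helper once per search engine and adding every returned item to the same list would produce a combined set of links with the source shown after each one. Filtering out links that are not search results (navigation, advertisements, and so on) depends on each engine's page layout and is not covered by this sketch.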