“Googlize” Your Java Apps to Search Billions of Web Pages

Google has introduced a Web API service that enables developers to program search engine functionality into their applications. With the Google Web APIs service, a program can query more than 2 billion Web documents quickly and easily. Applications with this functionality allow users to schedule regular search requests that can help monitor the Web for new information on a subject or offer comparative analyses of the amount of information available on different subjects over time.

The Google Web APIs service provides a SOAP (Simple Object Access Protocol) interface for searching Google’s index, retrieving information and Web pages from its cache, and checking the spelling of words. Queries use Google’s standard search syntax. Because the service is built on the SOAP and WSDL standards, developers can program in environments such as Java, Perl, or Visual Studio .NET. In this article, I use a sample program I coded (GoogleSearchDemo.java) to demonstrate how to use the Google Web APIs service with Java code.

Get Started
First you need to download the Google kit from http://www.google.com/apis/download.html. The free downloadable kit contains:

  • A complete API reference describing the semantics of method calls and fields
  • Sample SOAP request and response messages
  • A Google Web API WSDL file
  • A Java library, example program, and Javadoc documentation
  • A sample .NET program

Create a Google Web APIs service account. Use your account username and password to log in and get an account key. Note that Google limits each developer who registers for the Web APIs service to 1,000 queries per day.
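Because the 1,000-query allowance is easy to use up during testing, a small client-side counter can help. The following is only a sketch of that idea: DailyQuota is a hypothetical helper of mine, not part of the Google kit, and it assumes the allowance resets at local midnight, which may not match how Google actually counts a day.

```java
import java.time.LocalDate;

// Hypothetical client-side guard (not part of the Google kit).
// Google enforces the real limit on its servers; this only helps a
// program avoid sending queries that would be rejected anyway.
class DailyQuota {
    private final int limit;
    private LocalDate day; // date the current count belongs to
    private int used;      // queries recorded for that date

    DailyQuota(int limit) {
        this.limit = limit;
    }

    // Records one query for the given date and returns true, or returns
    // false without recording if that date's allowance is already spent.
    synchronized boolean tryAcquire(LocalDate today) {
        if (!today.equals(day)) { // first query of a new day: reset the count
            day = today;
            used = 0;
        }
        if (used >= limit) {
            return false;
        }
        used++;
        return true;
    }

    // Number of queries still available for the given date.
    synchronized int remaining(LocalDate today) {
        return today.equals(day) ? limit - used : limit;
    }

    public static void main(String[] args) {
        DailyQuota quota = new DailyQuota(1000); // limit stated for the service
        LocalDate today = LocalDate.now();
        if (quota.tryAcquire(today)) {
            // A real program would run its Google search here.
            System.out.println("Query allowed; remaining today: " + quota.remaining(today));
        } else {
            System.out.println("Daily allowance used up; try again tomorrow.");
        }
    }
}
```

Passing the date in as a parameter, rather than reading the clock inside the class, keeps the counter easy to exercise with fixed dates.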

Now you’re ready to dive into the code. The following classes are included in the googleapi.jar file:

  • import com.google.soap.search.GoogleSearch; - The GoogleSearch class provides access to the Google Web APIs, as well as Google search functions and cached pages via SOAP.
  • import com.google.soap.search.GoogleSearchResult; - GoogleSearchResult encapsulates the complete results from each Google Web APIs search call. You should call only the get methods on this object; the fields are filled in when a search result is returned.
  • import com.google.soap.search.GoogleSearchResultElement; - GoogleSearchResultElement contains an individual search result component of a GoogleSearchResult.
  • import com.google.soap.search.GoogleSearchFault; - GoogleSearchFault is an exception that encapsulates various errors that can result from a Google API call.

Download my sample Google Web API program, GoogleSearchDemo.java. Create an instance of the GoogleSearch class and set the key that Google has provided. Keep in mind that Google won’t let you use its search functionality until you set the key:

GoogleSearch search = new GoogleSearch();
search.setKey("yourkey");

After setting the key, set the query string for search:

search.setQueryString("cross language barriers for SOAP");

Now you need to invoke the Google search and store the return results:

GoogleSearchResult result = search.doSearch();

Next, iterate through the results:

GoogleSearchResultElement[] re = result.getResultElements();
for (int i1 = 0; i1 < re.length; i1++) {
    System.out.println("" + re[i1].getTitle() + "\n");
}
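If you print titles to a console, you may want to clean them up first. The sketch below assumes that getTitle() returns HTML, for example with <b> tags around the matched query terms; check the API reference in the kit to confirm this for your version. TitleCleaner and toPlainText are hypothetical names of my own, not classes from googleapi.jar, and the regular expression is a rough tag stripper rather than a real HTML parser.

```java
// Hypothetical helper (not part of the Google kit): turns an HTML
// fragment such as a result title into plain text for console output.
class TitleCleaner {

    // Removes anything that looks like an HTML tag, then decodes a few
    // common character entities. "&amp;" is decoded last so that text
    // such as "&amp;lt;" is decoded only once, to "&lt;".
    static String toPlainText(String html) {
        String text = html.replaceAll("<[^>]*>", "");
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&#39;", "'")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Placeholder title standing in for re[i1].getTitle().
        String title = "Cross <b>language</b> barriers for <b>SOAP</b> &amp; Java";
        System.out.println(toPlainText(title));
    }
}
```

In the loop above, you would wrap the call as TitleCleaner.toPlainText(re[i1].getTitle()) before printing.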

Before compiling the code, you need to put the googleapi.jar file into your classpath.

What If I’m Behind a Firewall?
If you are running behind a firewall, Google search will return the following SOAP exception when you try to execute it:

com.google.soap.search.GoogleSearchFault: 
[SOAPException: faultCode=SOAP-ENV:Client;
msg=Error opening socket: api.google.com;
targetException=java.lang.IllegalArgumentException:
Error opening socket: api.google.com]

To get your code to work behind a firewall proxy, you’ll need to modify the GoogleSearch class and implement the following four methods of the org.apache.soap.transport.http.SOAPHTTPConnection class:

  • public void setProxyHost(String s){}
  • public void setProxyPort(int i){}
  • public void setProxyUserName(String s){}
  • public void setProxyPassword(String s){}
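One way to avoid hard-coding proxy details is to read them from JVM system properties and pass them to those four setters. The following is a minimal sketch under stated assumptions: ProxyConfigurable is a hypothetical stand-in interface that mirrors the four methods listed above so the example compiles without the Apache SOAP library, and ProxySettings is a helper name of my own. The http.proxyHost and http.proxyPort properties are standard Java networking properties; http.proxyUser and http.proxyPassword are assumed names chosen for this example.

```java
// Hypothetical helper (not part of the Google kit or Apache SOAP).
class ProxySettings {

    // Copies proxy settings from JVM system properties onto the target.
    // Returns true if a proxy host was configured and applied, false if
    // no http.proxyHost property is set.
    static boolean applyFromSystemProperties(ProxyConfigurable target) {
        String host = System.getProperty("http.proxyHost");
        if (host == null || host.isEmpty()) {
            return false; // no proxy configured: leave the connection direct
        }
        target.setProxyHost(host);
        target.setProxyPort(Integer.parseInt(System.getProperty("http.proxyPort", "80")));

        // Assumed property names for proxy authentication.
        String user = System.getProperty("http.proxyUser");
        if (user != null) {
            target.setProxyUserName(user);
            target.setProxyPassword(System.getProperty("http.proxyPassword", ""));
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in target that just prints what it receives.
        ProxyConfigurable printer = new ProxyConfigurable() {
            public void setProxyHost(String host) { System.out.println("host: " + host); }
            public void setProxyPort(int port) { System.out.println("port: " + port); }
            public void setProxyUserName(String userName) { System.out.println("user: " + userName); }
            public void setProxyPassword(String password) { System.out.println("password set"); }
        };
        if (!applyFromSystemProperties(printer)) {
            System.out.println("No http.proxyHost set; connecting directly.");
        }
    }
}

// Hypothetical interface mirroring the four setters listed above. In a
// real program the target would be the object that exposes those methods.
interface ProxyConfigurable {
    void setProxyHost(String host);
    void setProxyPort(int port);
    void setProxyUserName(String userName);
    void setProxyPassword(String password);
}
```

To try it, start the JVM with options such as -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080, where the host name is a placeholder.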

If you don’t want to modify your existing class, download patgoogle.jar, which is a patch for firewall proxies. It contains a GoogleSearch class with this modification already added. Be sure to place patgoogle.jar before googleapi.jar in the classpath so that the modified GoogleSearch class is loaded instead of the original. Hopefully, Google will include these changes in future releases of its kit so developers don’t have to add any patches. (In the sample program, I incorporated calls for firewall proxies too. If you are not running behind a firewall, just comment those calls out.)

As Easy As It Seems
Using the Google Web APIs service as I’ve demonstrated, your application can search billions of Web pages, and you don’t need to use any complicated code.
