dcsimg
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


advertisement
 

Create a LAMP Search Engine Using Multithreaded Perl

Explore the multithreaded capabilities of Perl while building a LAMP Web crawler with all the necessary components of a basic search engine.


advertisement
ntil recently, people argued that Perl did not have stable multithreaded capabilities. This article presents a Perl application for which using multithreading capabilities makes sense: a Web crawler with all the necessary components of a basic search engine. The downloadable code includes the MySQL database creation scripts, Perl code, and PHP interface files.

The application requirements for the example are:

  1. All open source
  2. Small footprint
  3. Ability to score content
  4. Multithreaded application

To exemplify the point of a small footprint, the crawler, search engine, and database run on a very old Pentium 166MHz with 32MB RAM. It's not fast by any means, but the amount of performance you can get running Linux on such old hardware is amazing.

Figure 1 depicts the architecture utilized in this example by showing the different components that make up the search engine (e.g., the dictionary hash, the multithreaded crawler, and the PHP front-end used to search the database).

Figure 1: Architecture Utilized for Search Engine

The article breaks out into the following steps:

  1. Preliminary Setup
  2. Code Snippets and Explanations
  3. Web Interface—Searching the content

The Code Snippets and Explanations section describes the components listed in Figure 1.



Thanks for your registration, follow us on our social networks to keep up-to-date