

Create a LAMP Search Engine Using Multithreaded Perl

Explore the multithreaded capabilities of Perl while building a LAMP Web crawler with all the necessary components of a basic search engine.





Until recently, people argued that Perl did not have stable multithreaded capabilities. This article presents a Perl application where multithreading makes sense: a Web crawler with all the necessary components of a basic search engine. The downloadable code includes the MySQL database creation scripts, Perl code, and PHP interface files.

The application requirements for the example are:

  1. All open source
  2. Small footprint
  3. Ability to score content
  4. Multithreaded application

To demonstrate the small footprint, the crawler, search engine, and database all run on a very old 166MHz Pentium with 32MB of RAM. It's not fast by any means, but the performance you can get from Linux on such old hardware is amazing.

Figure 1 shows the architecture used in this example and the components that make up the search engine (e.g., the dictionary hash, the multithreaded crawler, and the PHP front-end used to search the database).

Figure 1: Architecture Utilized for Search Engine
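
To make the architecture more concrete, here is a minimal sketch of how a dictionary hash and a pool of worker threads might fit together, using Perl's standard threads and Thread::Queue modules. It is not the article's downloadable code. The dictionary weights, the example URLs, the in-memory %pages hash (standing in for pages a real crawler would fetch), and the score_text and worker names are all assumptions made for illustration, and the MySQL and PHP pieces are left out.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical dictionary hash: term => weight used when scoring content.
my %dictionary = (perl => 3, thread => 2, crawler => 2, mysql => 1);

# Hypothetical stand-ins for fetched pages (URL => page text).
# A real crawler would download these instead of reading them from memory.
my %pages = (
    'http://example.com/a' => 'Perl thread basics: one thread per crawler task',
    'http://example.com/b' => 'Storing crawler results in MySQL',
    'http://example.com/c' => 'Nothing relevant here',
);

# Sum the dictionary weights of every word found in the text.
sub score_text {
    my ($text) = @_;
    my $score = 0;
    for my $word (map { lc } $text =~ /(\w+)/g) {
        $score += $dictionary{$word} // 0;
    }
    return $score;
}

# Work queue shared by all threads; it holds the URLs still to be processed.
my $queue = Thread::Queue->new();

# Each worker pulls URLs until it sees an undef stop marker,
# then returns its partial results to the thread that joins it.
sub worker {
    my %partial;
    while (defined(my $url = $queue->dequeue())) {
        $partial{$url} = score_text($pages{$url});
    }
    return \%partial;
}

my $num_workers = 2;
my @workers = map { threads->create(\&worker) } 1 .. $num_workers;

$queue->enqueue($_) for sort keys %pages;
$queue->enqueue(undef) for 1 .. $num_workers;    # one stop marker per worker

# Merge the per-thread results into a single score table.
my %scores;
for my $thr (@workers) {
    my ($partial) = $thr->join();
    %scores = (%scores, %$partial);
}

printf "%-22s %d\n", $_, $scores{$_} for sort keys %scores;
```

Having each worker return its partial results through join avoids shared variables and locks in this sketch. A crawler that writes to a database would more likely have each thread store its own results as it goes.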

The article is organized into the following steps:

  1. Preliminary Setup
  2. Code Snippets and Explanations
  3. Web Interface—Searching the content

The Code Snippets and Explanations section describes the components listed in Figure 1.
