

Create a LAMP Search Engine Using Multithreaded Perl : Page 4

Explore the multithreaded capabilities of Perl while building a LAMP Web crawler with all the necessary components of a basic search engine.





Step 3: Web Interface—Searching the Content
In this basic search engine example, the Perl crawler runs behind the scenes while the PHP Web application provides the interface for searching the MySQL database (populated by the crawler) and returning readable results to the user. Figure 2 shows a screenshot of a search for the word "database" and its results.

Figure 2: Search Results for the Word "Database"

The PHP source code is included in the code download.

Perl's Multithreaded Capabilities
You've seen how Perl's multithreaded capabilities allow for some exciting possibilities. A complete tutorial on threads could fill a book, but the examples provided should give you a good understanding and get you on your way. Along the way, you also assembled the basic pieces of a Web search engine.

By running the crawlThread.pl program on multiple machines that all feed from the same MySQL database, you could scale the search engine horizontally. For additional information and authoritative answers on how Perl threads behave, consult the documentation bundled with the Perl distribution. Before you unleash the code described in this article on the Internet, add some logic to the crawler so that it ignores pages you do not want crawled.
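One way to add that logic is a small filter that the crawler consults before it fetches a URL. The following is only a minimal sketch, not part of the article's download: the function name should_crawl, the blocked hosts, and the skipped file extensions are all hypothetical placeholders that you would replace with your own rules. It uses only core Perl, so it does not depend on how crawlThread.pl is written.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rules for illustration; none of these come from the article's code.
my %BLOCKED_HOST   = map { $_ => 1 } qw(ads.example.com tracker.example.net);
my @SKIP_EXTENSION = qw(jpg jpeg png gif pdf zip exe css js);

# Build one case-insensitive pattern that matches any skipped extension
# at the end of the URL path.
my $SKIP_EXT_RE = do {
    my $alternatives = join '|', map { quotemeta($_) } @SKIP_EXTENSION;
    qr/\.(?:$alternatives)$/i;
};

# Returns 1 if the crawler should fetch $url, 0 otherwise.
sub should_crawl {
    my ($url) = @_;
    return 0 unless defined $url;

    # Accept only absolute http/https URLs; capture the host and the path
    # (the path excludes any query string or fragment).
    return 0 unless $url =~ m{^https?://([^/:?#]+)(?::\d+)?([^?#]*)}i;
    my ($host, $path) = (lc $1, $2);

    return 0 if $BLOCKED_HOST{$host};     # host is on the block list
    return 0 if $path =~ $SKIP_EXT_RE;    # binary or static asset, not a page
    return 1;
}

# Usage: label a few sample URLs as "crawl" or "skip".
for my $url (
    'http://www.example.com/index.html',
    'http://ads.example.com/banner',
    'http://www.example.com/logo.PNG',
    'mailto:someone@example.com',
) {
    printf "%-5s %s\n", (should_crawl($url) ? 'crawl' : 'skip'), $url;
}
```

A crawler thread could call should_crawl on each link it extracts and queue only the links that pass. A well-behaved crawler would typically also honor each site's robots.txt file, which this sketch does not attempt.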

Brian Carr, founding member of Oracle Giants, is an Oracle Certified Professional and Oracle ACE.