Utilizing a Multi-Core System with the Actor Model : Page 2

Demand for multi-core/multi-processor applications is growing, but developing a multi-threaded application does not have to involve a steep learning curve or an understanding of complicated edge cases. Learn how to develop efficient multi-threaded applications without using synchronized blocks.

Included in this article is an implementation of the above actor model for retrieving remote recipes from multiple sites in multiple formats. Each recipe is listed in one or more index files on the web, and the recipe is in HTML. The program retrieves these silos of information, harvests meaningful data, indexes it, and makes it available in a graphical user interface.

Author's Note: Read the terms and conditions of any web site before harvesting its contents.

With the stage set, let's introduce the actors:

Actor                  Type               Description
RoundRobin             UrlConsumer        Distributes URLs to other actors
UrlResolver            UrlConsumer        Retrieves data streams for another actor
XhtmlTransformer       StreamTransformer  Formats HTML into XHTML for parsing
StyleSheetTransformer  StreamTransformer  Converts remote XML format into local data format
RdfParser              StreamConsumer     Parses data stream into data structure
SeeAlsoExtractor       RdfConsumer        Extracts URLs from index data
IngredientProcessor    RdfConsumer        Applies local processing rules on data
RDFInserter            RdfConsumer        Inserts data into a database
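
As a rough illustration of the first actor in the list, the following sketch shows one way a round-robin distributor could hand URLs to a set of delegates in turn. It is not the code from the download archive: the consume(String) signature on UrlConsumer and the class name RoundRobinSketch are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical message interface; the article's real UrlConsumer may declare something else. */
interface UrlConsumer {
    void consume(String url);
}

/** Forwards each URL to the next delegate in turn (illustrative, not the article's RoundRobin). */
class RoundRobinSketch implements UrlConsumer {
    private final List<UrlConsumer> delegates;
    private int next = 0; // assumed to be touched only by the one thread that drives this actor

    RoundRobinSketch(List<UrlConsumer> delegates) {
        if (delegates.isEmpty()) {
            throw new IllegalArgumentException("need at least one delegate");
        }
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    public void consume(String url) {
        delegates.get(next).consume(url);
        next = (next + 1) % delegates.size(); // wrap around to the first delegate
    }
}
```

Because the distributor itself is a UrlConsumer, it can stand in wherever a single resolver would otherwise be used, which is presumably why both share the same type in the list above.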

Listing 4 shows how these actors are connected to one another. The manage() methods are typed versions of the ActorManager#manage(Object) in Listing 3.
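
Listing 3 is not reproduced on this page, so the sketch below is only one plausible reading of how a typed manage() overload can sit on top of an untyped manage(Object). It assumes the manager returns a proxy that queues each call on a worker thread. The names ActorManagerSketch and shutdownAndWait, the consume(String) signature, and the proxy-based dispatch are all assumptions for illustration.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical actor interface; the article's own declaration may differ. */
interface UrlConsumer {
    void consume(String url);
}

/** A guess at the general shape of an actor manager; not the article's Listing 3. */
final class ActorManagerSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Untyped form: returns a proxy whose interface calls are queued on the worker thread. */
    Object manage(Object actor) {
        Class<?> type = actor.getClass();
        return Proxy.newProxyInstance(type.getClassLoader(), type.getInterfaces(),
            (proxy, method, args) -> {
                if (method.getDeclaringClass() == Object.class) {
                    return method.invoke(actor, args); // toString, equals, hashCode run inline
                }
                worker.execute(() -> {
                    try {
                        method.invoke(actor, args);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                });
                return null; // fire-and-forget: only suitable for void methods
            });
    }

    /** Typed form: same behavior, but the caller gets a UrlConsumer back without casting. */
    UrlConsumer manage(UrlConsumer actor) {
        return (UrlConsumer) manage((Object) actor); // the cast selects the untyped overload
    }

    /** Stops accepting calls and waits for queued ones; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long seconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The typed overload adds no behavior of its own. Its only job is to keep the wiring code free of casts, so that connecting one actor to the next reads like ordinary method calls.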

A ClusterMap and Main class are also provided in the download archive. To run the example, execute the Main class with the following two arguments: http://www.kraftcanada.com/en/search/SearchResults.aspx?gcatid=86 and http://www.cookingnook.com/free-online-recipes.html

Figure 1. ClusterMap: The tortilla soup recipe is revealed after clicking certain ingredients.

The Main class then opens the ClusterMap and begins harvesting the recipes. After a few recipes are harvested, select the check-box on the left to see how many recipes have been harvested, and click the clear button at the top to update the list of words extracted from the ingredients sections. In this way, you can index and search multiple distinct recipe sites. For example, to find recipes that include lemon, cheddar, and garlic (yum), click on these ingredients, and the Tortilla Soup recipe is revealed as a harvested recipe that includes all three (see Figure 1).

On a multi-core system, the program uses over 30 threads to orchestrate the retrieval and processing of the data, downloading and processing as quickly as the remote host provides it. Despite running on this many threads, the code does not need to address typical multi-threading challenges such as synchronized blocks, which frees the developer to concentrate on what each actor should do.
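
To make the "no synchronized blocks" point concrete, here is a minimal sketch of thread confinement: several sender threads message one actor, and the actor updates a plain int because only its own mailbox thread ever touches that field. CountingActor, ConfinementDemo, and the use of a single-thread executor as the mailbox are assumptions for this example, not classes from the article's archive.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Counts messages in a plain int; safe only because one thread ever reads or writes it. */
final class CountingActor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int harvested = 0; // confined to the mailbox thread

    /** Callable from any thread: enqueues the update instead of performing it inline. */
    void recipeHarvested() {
        mailbox.execute(() -> harvested++);
    }

    /** Reads the count on the mailbox thread, after all earlier messages, then shuts down. */
    int stopAndCount() throws InterruptedException {
        try {
            return mailbox.submit(() -> harvested).get();
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            mailbox.shutdown();
        }
    }
}

final class ConfinementDemo {
    /** Starts the given number of sender threads and returns the actor's final count. */
    static int run(int senders, int messagesEach) {
        CountingActor actor = new CountingActor();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int m = 0; m < messagesEach; m++) {
                    actor.recipeHarvested();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // every message is queued before the count is requested
            }
            return actor.stopAndCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ConfinementDemo.run(4, 1000) returns 4000 even though four threads send concurrently, since the increments themselves are serialized by the mailbox rather than by locks in the actor's code.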

The actor model is a powerful metaphor to assist in creating multi-threaded applications, and by assigning remote addresses and enabling remote communication between actors, you can extend the model to assist in distributed challenges as well. By including life-cycle and dependency management and making actors aware of their environment, they can become agents, participating in a self-organizing system. This architecture has worked well for many distributed problems such as on-line trading, disaster response, and modelling social structure. It has also been the source of inspiration for many service-oriented architectures.

In essence, the actor model abstracts the nitty-gritty of multi-processor programming away from the developer. This reduces concurrency issues and improves the flexibility of the system. The model is simple and has a gentle learning curve, so new developers can quickly see how actors are implemented and understand how they fit together. By managing the actors properly, you can carry the same implementations over from multi-processor systems to distributed networked systems gradually, at a pace that scales with development demands.

James Leigh is an independent software consultant based in Toronto. He has experience modeling business problems and concepts in software, and he specializes in performance and technology integration. James has a background in semantic web technologies and decentralized networks. He is an active member of the OpenRDF community and a developer of Sesame and Elmo.