RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Writing Parallel Programs with Erlang : Page 4

Erlang, a language for concurrency, is a good choice for writing parallel programs to fully exploit current and future multicore CPUs.


Creating Parallel Processes

Erlang's support for concurrency enables you to exploit symmetric multiprocessor (SMP) architectures and multicore CPUs. In fact, processes should be the main driver when you design an Erlang application. The parallelism is provided by Erlang and not by the host operating system. Nevertheless, an Erlang process is halfway between a native thread and a green one: it originates within the Erlang Run Time System (ERTS) but doesn't share any state with other processes.

The following line in the compute function shows how you can create parallel processes in Erlang.

ProductsCollector_PID = spawn(fact, products_collector, [[], self()]),

The function spawn/3 allows you to create another process parallel to the one in which the function is launched. The first parameter is the module from which the process is spawned, the second one is the function to run in parallel and its list of parameters. The whole line returns a special variable, a PID variable, that is bound to the PID of the new spawned process. This value is precious because it is essential for talking to the new process.

Using the flowchart in Figure 1, you can figure out that the function compute spawns the function products_collector and multiple instances of the function partial_compute in parallel. Each instance of partial_compute will calculate the products of a single interval of numbers, in which the initial interval is divided.

Obviously, you cannot predict the time each instance of partial_compute requires to multiply the numbers in its interval. For this reason, I needed an additional process, products_collector, which receives the results from various instances of partial_compute as soon as they finish and can determine when all partial_compute functions are exited.

For more detail, here is how multiple instances of partial_compute are spawned:

lists:map(fun(X) -> 
            spawn(fact, partial_compute, [X,Numbers,ProductsCollector_PID])

The function lists:map takes two arguments. The second is a list, while the first is the inline definition of a function that has to be applied to the elements of the list. The lists:map function returns another list with the results. This is an example of how in Erlang you can pass names of functions as arguments to other functions.

Talking Between Processes

In Erlang, processes don't share anything and are completely separate from each other. Their only means of communication are messages. As previously stated, the products_collector processes receive messages from (1) partial_compute functions when they finish their calculations and from (2) the function compute when all the partial_compute functions have exited and are asking for the resulting list. Here is the syntax for sending messages in both cases:

Scenario 1
ProductsCollector_PID ! {get_products,self()},

Scenario 2
ProductsCollector_PID ! {add_product,multiplication(FactorsInTheInterval)}.

The schema in both lines sends a message composed of the PID of the process to which the message is sent, the operator !, and the message itself in the form of a tuple containing the information. In the first scenario, for example, the message contains the atom get_product and the PID of the process that is sending the message. This is necessary if you want the contacted process to answer back to the message.

All the sent messages are asynchronous, which means that the program doesn't wait for an answer (if one is due). It immediately executes the next instruction.

Now take a closer look to the function products_collector and the way the processes wait for a message to arrive and perform actions.

products_collector(L, Loop_PID) ->
        {add_product, Product} -> 
            Loop_PID ! {done},
            products_collector([Product|L], Loop_PID);
        {get_products, Client_PID} -> 
            Client_PID ! {products, L},
            products_collector(L, Loop_PID);
        {done} -> 

Using email as a metaphor, the processes have an inbox where all the arriving messages are queued and dealt with using the FIFO schema and the pattern-matching mechanism.

To truly grasp how all this works, it is very important to understand how the receive ... end construct works. Say you have a queue of messages that have arrived and a set of tuples representing different patterns. This is what happens when—and only when—a new message arrives:

  • The first message in the queue (i.e., the first arrived) is matched against the first pattern in the set:
    • For a match:
      • The statements that follow are executed.
      • The message is removed from the queue.
    • For no match:
      • The message is matched against the next pattern in the set.
  • If the first message doesn't match any pattern in the list, it is set apart in a so-called save queue and the second message is handled. All the messages in the save queue will no longer be matched, even after the arrival of a new message.

One of the receive ... end construct's additional features is a timer that stops the waiting for a new message after a time, performs some actions, and then takes messages from the save queue and puts them back into the mailbox in their original order.

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date