Writing Parallel Programs with Erlang

Writing Parallel Programs with Erlang

or years, heating problems have prevented CPU manufacturers from building viable CPUs with clock rates higher than 4.0GHz. This single-core processor speed limitation means that the only viable way to make programs run faster is to run them on more than one processor. Consequently, manufacturers started making processors with multiple cores (i.e., multiprocessor CPUs). In fact, you can expect to see processors with more than 100 cores in the near future. In order to fully exploit the potential of these new processors, software developers will be forced to write parallel programs. As Herb Sutter wrote regarding Moore’s law: “The Free Lunch Is Over.” The transition won’t be easy, though, because writing concurrent programs is hard work.

The main framework for parallel computing in the past 30 years has been single processes, threads, or programs that communicate with one another by reading and writing in a shared memory area and using locks and semaphores. In multiprocessor CPUs, the individual cores talk to each other mainly by sending asynchronous messages. This framework requires you to pay a lot of attention to mutually exclusive algorithms to avoid the simultaneous use of common resources or side effects such as deadlocks from using semaphores.

This article introduces the functional language Erlang, a good choice for writing parallel programs, and explains how you can use it to fully exploit current and future multicore CPUs.

Meet Erlang

Designed at Ericsson laboratories in 1986, Erlang (named for Danish mathematician Agner Krarup Erlang) is used largely in the telecommunications industry. Nevertheless, it is a general-purpose, functional programming language that belongs to a more comprehensive rather than imperative class of declarative languages. One of its most important features is its implementation of concurrent computation according to the Actor model, a mathematical schema that implies that everything is an actor (much like the object-oriented philosophy of everything is an object). Because an actor sends and receives messages and accordingly acts and creates other actors, the Actor model is inherently concurrent, whereas the object-oriented paradigm is practically sequential.

The Erlang software concurrency paradigm leaves no space for state. You can assign variables only once and must perform iteration through recursion. No state means you do not need memory protection, which minimizes side effects as well as the amount of data shared between processes.

Erlang functions are similar to mathematical functions: with the same input they return the same value, regardless of the context of the call. Erlang functions pass each parameter by value, so anything that you can reference is a value. This feature is called reference transparency.

Although variable assignments are subject to strict rules, Erlang partially makes up for that restriction using a powerful mechanism called pattern matching. This feature is similar to the way human beings interpret reality?by determining similarities between real objects and the conceptual schemas in their minds.

A Mathematical Breakdown of Erlang

Number theory contains numerous basic calculations that can benefit from parallelism. For example, consider the calculation of the factorial of a number and the associative property of multiplication. If the factorial of n is:

 
[1] n! = 1 * 2 * 3 *...* n

And if q is an integer such as:

q < nt = n/q

You can write [1] as:

n! = a * b * c * ...* j

Where:

a = 1 * 2 * 3 *...* qb = [(q+1) * (q+2) * ... * 2q)c = [(2q+1) * (2q+2) * ... * 3q)...j = [((t-1)q+1) * ((t-1)q+2) * ... * tq]

If n/q results in a remainder:

n! = a * b * c * ...* j * k

Where:

k = [(tq + 1) * (tq + 2) * ... * n]
Figure 1. Simple Flowchart for the Code: You can see the interactions among the main functions in the code here.

That is, you can divide the whole interval of numbers in t+1 smaller intervals, calculate the partial products, and then multiply them again to obtain the final result.

With a multiprocessor machine, I wondered if by calculating a, b, ..., k simultaneously and then multiplying the results together I could shorten the total elapsed time. In an attempt to solve this problem, I wrote a small piece of Erlang code. While not a usable implementation of the factorial function, the code does serve to explore Erlang syntax and its potential for concurrency computation (see Figure 1).

The remainder of this article shows you how you can use Erlang's message-passing method to enable parallel processes to work together concurrently. During the analysis of the code, I explain some syntax features and show you how recursion, pattern matching, and concurrency work in Erlang. Covering all Erlang syntax and features is beyond the scope of this article, but you can find a lot of documentation on the Internet.

Anatomy of an Erlang Program

Erlang programs start with declarations that contain the dash character (-) in position 0 and that make assertions about the naming, availability, and definitions of functions. Take a look at a couple of declarations:

-module(fact). 

Modules

Erlang uses modules. The module fact resides in a file named fact.erl, and it requires the module declaration at the top (click here to download the file). As you likely noticed, all statements have to be terminated with a dot:

-export([start/2]).

In each module, you can write a lot of functions, but only the ones exported can be seen and called outside the module itself. The notation start/2 means that the exported function named start accepts two arguments:

start(ProcNum, Number) ->    ComputationStart = erlang:now(),    ...

This is the way you define a function. Again, notice that the whole definition is a statement and that the single instructions that compose it have to be separated by a comma.

Variables

Variables start with a capital letter. To avoid hidden effects, you can bind them to a constant value only once. A variable can be bound or not but the compiler will issue a warning if you use an unbound variable!

Erlang provides a few types of variables. The most used are numbers, atoms, lists, and tuples. Atoms are constant values with names that start with small letters. Otherwise, you have to enclose them in quotation marks. Lists and tuples respectively store a variable number and a fixed number of numbers, atoms, lists, and tuples. You enclose lists in brackets and tuples in curly brackets. Erlang is a dynamically typed language, so it performs variable type checking at run time.

The following line is the Erlang syntax to call a function that is exported by a module. In this case, the function seq of the module lists returns a list containing all integers from 1 to Number. The basic installation of Erlang provides a set of utility modules such as lists:

compute(ProcNum, lists:seq(1,Number)),

Note the call to the function compute. Here is the definition of that function:

compute(_, Numbers) when length(Numbers) == 1 ->    ...compute(ProcNum, Numbers) when ProcNum == 1 ->    ...compute(ProcNum, Numbers) ->    ...

You can define a function in more than one way. In this case, the three variants of the function differ by the presence of what is called a guard clause. But, as you will see further in the code, the same function can also have different definitions based on the number and the kind of input parameters.

Recursion and Pattern Matching

The following code contains the definitions of the function create_intervals, which divides the total interval of numbers between 1 and n into a number of smaller intervals that can be further treated by parallel processes.

create_intervals(ListLen, Delta) when ListLen =< Delta ->    ...create_intervals(ListLen, Delta) ->    create_intervals(Delta + 1, ListLen, Delta, [[1, Delta]]).create_intervals(StartIndex, ListLen, Delta, L) when ... ->    ...create_intervals(StartIndex, ListLen, Delta, L) ->    create_intervals(...).

A for loop could be a natural choice here, but Erlang substitutes iteration with recursion. Therefore, create_intervals calls itself and recursively builds a list of lists, with two atoms representing the extreme indexes of smaller intervals.

If you look at the same compute function, which divides a list of numbers into groups, calculates the single products for each group, and returns them in lists shorter than the initial one, you realize that applying it recursively at the end of the process results in a number that is the product of the numbers in the initial list. For this reason, the last line in the compute function is:

compute(ProcNum, Products).

Coming back to the function create_interval, note that it has multiple definitions that differ for guard clauses as well as for different input patterns.

Erlang uses pattern matching intensively both in choosing the right function to call and in assigning variables. In fact, you can read the operator = not as "equal to" but as "match to," an assertion that two items are equal also with regard to their shape. It is a subtle distinction but an important concept to keep in mind when reading or writing Erlang code. For example, when you enter the following code:

A = 1.

The system tries to match variable A with 1. If the system uses A for the first time the match succeeds and binds the variable A to 1, from then on A will match only 1 and any other match/assignment will fail.

You can foresee the potential of pattern matching with this example:

L = [a, b, c, d].[A | B] = L.

The notation | separates the head of a list (first item) from the tail (the remaining items). So, in this statement, pattern matching binds the variable A to atom a and B to the list [b, c, d].

Also, if you write:

[C, D | E] = L.

C will be bound to a, D to b, and E to the list [c,d].

Creating Parallel Processes

Erlang's support for concurrency enables you to exploit symmetric multiprocessor (SMP) architectures and multicore CPUs. In fact, processes should be the main driver when you design an Erlang application. The parallelism is provided by Erlang and not by the host operating system. Nevertheless, an Erlang process is halfway between a native thread and a green one: it originates within the Erlang Run Time System (ERTS) but doesn't share any state with other processes.

The following line in the compute function shows how you can create parallel processes in Erlang.

ProductsCollector_PID = spawn(fact, products_collector, [[], self()]),

The function spawn/3 allows you to create another process parallel to the one in which the function is launched. The first parameter is the module from which the process is spawned, the second one is the function to run in parallel and its list of parameters. The whole line returns a special variable, a PID variable, that is bound to the PID of the new spawned process. This value is precious because it is essential for talking to the new process.

Using the flowchart in Figure 1, you can figure out that the function compute spawns the function products_collector and multiple instances of the function partial_compute in parallel. Each instance of partial_compute will calculate the products of a single interval of numbers, in which the initial interval is divided.

Obviously, you cannot predict the time each instance of partial_compute requires to multiply the numbers in its interval. For this reason, I needed an additional process, products_collector, which receives the results from various instances of partial_compute as soon as they finish and can determine when all partial_compute functions are exited.

For more detail, here is how multiple instances of partial_compute are spawned:

lists:map(fun(X) ->             spawn(fact, partial_compute, [X,Numbers,ProductsCollector_PID])          end,           Intervals),

The function lists:map takes two arguments. The second is a list, while the first is the inline definition of a function that has to be applied to the elements of the list. The lists:map function returns another list with the results. This is an example of how in Erlang you can pass names of functions as arguments to other functions.

Talking Between Processes

In Erlang, processes don't share anything and are completely separate from each other. Their only means of communication are messages. As previously stated, the products_collector processes receive messages from (1) partial_compute functions when they finish their calculations and from (2) the function compute when all the partial_compute functions have exited and are asking for the resulting list. Here is the syntax for sending messages in both cases:

Scenario 1ProductsCollector_PID ! {get_products,self()},Scenario 2ProductsCollector_PID ! {add_product,multiplication(FactorsInTheInterval)}.

The schema in both lines sends a message composed of the PID of the process to which the message is sent, the operator !, and the message itself in the form of a tuple containing the information. In the first scenario, for example, the message contains the atom get_product and the PID of the process that is sending the message. This is necessary if you want the contacted process to answer back to the message.

All the sent messages are asynchronous, which means that the program doesn't wait for an answer (if one is due). It immediately executes the next instruction.

Now take a closer look to the function products_collector and the way the processes wait for a message to arrive and perform actions.

products_collector(L, Loop_PID) ->    receive        {add_product, Product} ->             Loop_PID ! {done},            products_collector([Product|L], Loop_PID);        {get_products, Client_PID} ->             Client_PID ! {products, L},            products_collector(L, Loop_PID);        {done} ->             noop    end. 

Using email as a metaphor, the processes have an inbox where all the arriving messages are queued and dealt with using the FIFO schema and the pattern-matching mechanism.

To truly grasp how all this works, it is very important to understand how the receive ... end construct works. Say you have a queue of messages that have arrived and a set of tuples representing different patterns. This is what happens when?and only when?a new message arrives:

  • The first message in the queue (i.e., the first arrived) is matched against the first pattern in the set:
    • For a match:
      • The statements that follow are executed.
      • The message is removed from the queue.
    • For no match:
      • The message is matched against the next pattern in the set.
  • If the first message doesn't match any pattern in the list, it is set apart in a so-called save queue and the second message is handled. All the messages in the save queue will no longer be matched, even after the arrival of a new message.

One of the receive ... end construct's additional features is a timer that stops the waiting for a new message after a time, performs some actions, and then takes messages from the save queue and puts them back into the mailbox in their original order.

How to Run the Code in Debian Linux

In 1998, Ericsson released Erlang as an open-source product and created the web site Erlang.org, which provides the code as well as binaries of the Erlang interpreter for Linux, Mac, and Windows environments. The code download also includes a set of module libraries and a real-time distributed database called Mnesia. The entire framework is called the Erlang Open Telecom Platform (OTP).

In Debian Linux, you simply run the following command:

apt-get install erlang 

The complete Erlang OTP platform, with all its tools and utility functions, will be installed on your machine.

You have to compile Erlang code in order to obtain the pseudocode for the interpreter, which is a file with .beam extension.

erlc fact.erl

Although quite verbose, the compiler is clear and helpful in understanding the philosophy of Erlang. For example, it is able to recognize when a variable is unbound and yet is going to be used.

After compilation, you can run all the functions exported in the module fact. This is very useful while debugging, where you need to find out which function doesn't work properly. Run the Erlang emulator erl and enter the following in its environment:

Erlang R13B01 (erts-5.7.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]Eshell V5.7.2  (abort with ^G)1> 

You can recompile the code inside the emulator as follows:

c("fact.erl").

Typically, the function start represents the main to run the program:

2> fact:start(2,12).Time elapsed for factorial of 12 using 2 parallel processes: 1.45e-4 secondsok2> 

Try to calculate the factorial of a large integer with a number of parallel processes greater than 1 and you will see the differences in elapsed time between a traditional CPU and a multicore machine.

As previously stated, you can also test single functions if you export them and use them with the right input:

3> fact:compute(2,[123,234,456,345]).45279842403> 4> fact:multiplication([123,234,456,345]).45279842404>

Here is how you launch the debugger:

Erlang R13B01 (erts-5.7.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]Eshell V5.7.2  (abort with ^G)1> c("fact.erl",[debug_info]).{ok,fact}2> im().<0.46.0>3> ii(fact).{module,fact}4> iaa([init]).true5> fact:start(2,12).

The statements starting with i launch a graphical processes monitor and a typical debugging window showing the source code (see Figure 2 and Figure 3). Of course, the system will open one of these windows for each process spawned during execution.


Figure 2. Processes Monitor: You can track all the processes managed by the Erlang emulator, which is useful.
 
Figure 3. Debugging Window: Here is a typical debugging window. You can launch one for each process spawned by the Erlang emulator.

The Next Step: LYME

You now hopefully have enough Erlang knowledge to make parallel processes work together concurrently. Nowadays, you will find a lot of production applications developed in Erlang, including Yahoo! del.icio.us, the chat backend of Facebook, Twitterfall (Twitter search engine for trends), and Amazon SimpleDB, many of which take advantage of multicore machines. Why not add your applications to the list? If you want to really dive into the language, the LYME (Linux, Yaws, Mnesia, Erlang) solution stack offers a good environment for writing fully functional solutions for the parallel world.

devx-admin

devx-admin

Share the Post:
Malyasian Networks

Malaysia’s Dual 5G Network Growth

On Wednesday, Malaysia’s Prime Minister Anwar Ibrahim announced the country’s plan to implement a dual 5G network strategy. This move is designed to achieve a

Advanced Drones Race

Pentagon’s Bold Race for Advanced Drones

The Pentagon has recently unveiled its ambitious strategy to acquire thousands of sophisticated drones within the next two years. This decision comes in response to

Important Updates

You Need to See the New Microsoft Updates

Microsoft has recently announced a series of new features and updates across their applications, including Outlook, Microsoft Teams, and SharePoint. These new developments are centered

Price Wars

Inside Hyundai and Kia’s Price Wars

South Korean automakers Hyundai and Kia are cutting the prices on a number of their electric vehicles (EVs) in response to growing price competition within

Solar Frenzy Surprises

Solar Subsidy in Germany Causes Frenzy

In a shocking turn of events, the German national KfW bank was forced to discontinue its home solar power subsidy program for charging electric vehicles

Malyasian Networks

Malaysia’s Dual 5G Network Growth

On Wednesday, Malaysia’s Prime Minister Anwar Ibrahim announced the country’s plan to implement a dual 5G network strategy. This move is designed to achieve a more equitable incorporation of both

Advanced Drones Race

Pentagon’s Bold Race for Advanced Drones

The Pentagon has recently unveiled its ambitious strategy to acquire thousands of sophisticated drones within the next two years. This decision comes in response to Russia’s rapid utilization of airborne

Important Updates

You Need to See the New Microsoft Updates

Microsoft has recently announced a series of new features and updates across their applications, including Outlook, Microsoft Teams, and SharePoint. These new developments are centered around improving user experience, streamlining

Price Wars

Inside Hyundai and Kia’s Price Wars

South Korean automakers Hyundai and Kia are cutting the prices on a number of their electric vehicles (EVs) in response to growing price competition within the South Korean market. Many

Solar Frenzy Surprises

Solar Subsidy in Germany Causes Frenzy

In a shocking turn of events, the German national KfW bank was forced to discontinue its home solar power subsidy program for charging electric vehicles (EVs) after just one day,

Electric Spare

Electric Cars Ditch Spare Tires for Efficiency

Ira Newlander from West Los Angeles is thinking about trading in his old Ford Explorer for a contemporary hybrid or electric vehicle. However, he has observed that the majority of

Solar Geoengineering Impacts

Unraveling Solar Geoengineering’s Hidden Impacts

As we continue to face the repercussions of climate change, scientists and experts seek innovative ways to mitigate its impacts. Solar geoengineering (SG), a technique involving the distribution of aerosols

Razer Discount

Unbelievable Razer Blade 17 Discount

On September 24, 2023, it was reported that Razer, a popular brand in the premium gaming laptop industry, is offering an exceptional deal on their Razer Blade 17 model. Typically

Innovation Ignition

New Fintech Innovation Ignites Change

The fintech sector continues to attract substantial interest, as demonstrated by a dedicated fintech stage at a recent event featuring panel discussions and informal conversations with industry professionals. The gathering,

Import Easing

Easing Import Rules for Big Tech

India has chosen to ease its proposed restrictions on imports of laptops, tablets, and other IT hardware, allowing manufacturers like Apple Inc., HP Inc., and Dell Technologies Inc. more time

Semiconductor Stock Plummet

Dramatic Downturn in Semiconductor Stocks Looms

Recent events show that the S&P Semiconductors Select Industry Index seems to be experiencing a downturn, which could result in a decline in semiconductor stocks. Known as a key indicator

Anthropic Investment

Amazon’s Bold Anthropic Investment

On Monday, Amazon announced its plan to invest up to $4 billion in the AI firm Anthropic, acquiring a minority stake in the process. This decision demonstrates Amazon’s commitment to

AI Experts Get Hired

Tech Industry Rehiring Wave: AI Experts Wanted

A few months ago, Big Tech companies were downsizing their workforce, but currently, many are considering rehiring some of these employees, especially in popular fields such as artificial intelligence. The

Lagos Migration

Middle-Class Migration: Undermining Democracy?

As the middle class in Lagos, Nigeria, increasingly migrates to private communities, a PhD scholar from a leading technology institute has been investigating the impact of this development on democratic

AI Software Development

ChatGPT is Now Making Video Games

Pietro Schirano’s foray into using ChatGPT, an AI tool for programming, has opened up new vistas in game and software development. As design lead at business finance firm Brex, Schirano

Llama Codebot

Developers! Here’s Your Chatbot

Meta Platforms has recently unveiled Code Llama, a free chatbot designed to aid developers in crafting coding scripts. This large language model (LLM), developed using Meta’s Llama 2 model, serves

Tech Layoffs

Unraveling the Tech Sector’s Historic Job Losses

Throughout 2023, the tech sector has experienced a record-breaking number of job losses, impacting tens of thousands of workers across various companies, including well-established corporations and emerging startups in areas

Chinese 5G Limitation

Germany Considers Limiting Chinese 5G Tech

A recent report has put forth the possibility that Germany’s Federal Ministry of the Interior and Community may consider limiting the use of Chinese 5G technology by local network providers

Modern Warfare

The Barak Tank is Transforming Modern Warfare

The Barak tank is a groundbreaking addition to the Israeli Defense Forces’ arsenal, significantly enhancing their combat capabilities. This AI-powered military vehicle is expected to transform the way modern warfare

AI Cheating Growth

AI Plagiarism Challenges Shake Academic Integrity

As generative AI technologies like ChatGPT become increasingly prevalent among students and raise concerns about widespread cheating, prominent universities have halted their use of AI detection software, such as Turnitin’s

US Commitment

US Approves Sustainable Battery Research

The US Department of Energy has revealed a $325 million commitment in the research of innovative battery types, designed to enable solar and wind power as continuous, 24-hour energy sources.

Netanyahu Musk AI

Netanyahu and Musk Discuss AI Future

On September 22, 2023, Israeli Prime Minister Benjamin Netanyahu met with entrepreneur Elon Musk in San Francisco prior to attending the United Nations. In a live-streamed discussion, Netanyahu lauded Musk