A typical multi-threaded application in Java contains numerous synchronized methods and statements. It might also
contain calls to the methods wait(), notify(), and notifyAll()
that were introduced with Java 1.0, but these methods provide very primitive functionality and are easily misused. Java 5 introduced the java.util.concurrent package, which provides some higher-level abstractions away from wait() and notify().
However, it can still be a challenge to use the synchronized and volatile keywords appropriately. Even when they are used correctly, making them efficient can require complicated orchestration of locks.
The biggest criticism of Java's synchronization is performance. It is too easy for a synchronized block to encompass more code than it needs to. Although a synchronized block on its own is far from slow, an overly encompassing one becomes a contended block. Contended synchronized blocks, like other blocking operations, are slow: they require the OS to put threads to sleep and use interrupts to wake them. This puts pressure on the scheduler, resulting in significant performance degradation.
The actor model (native to some programming languages such as Scala) is a pattern for concurrent computation that enables applications to take full advantage of multi-core and multi-processor computing. The fundamental idea behind the actor model is that the application is broken up into "actors" that perform particular roles. Every method call (or message) to an actor is executed in that actor's own dedicated thread, so you avoid all of the contended locking issues typically found in concurrent applications. This allows for more efficient concurrent processing while keeping the complexity of actor implementations low, as there is no need to consider concurrent execution within each actor implementation.
The class in Listing 1 shows what an actor class might look like. This class takes a string of words and saves them to an XML file, and includes a calculated code for every character stored. The code might be used later as an index or to find similar text blocks. Notice that this class is not thread-safe: you can use each instance from only a single thread. This is normal, because each actor is used from only one thread. It is common for an actor class to contain no synchronized or volatile keywords at all, because they are not needed.
Long-lived objects that are normally synchronized and used by different threads are better off with a dedicated
thread, free from any synchronization issues. Each method call is placed in a queue (the order within the queue is not important), where it waits until the actor is available to process the call. Think of this queue as your email inbox: messages are received at any time and are acted on when time permits. Typically, calls are asynchronous and do not block, so the calling thread continues execution and avoids any need to rely on thread interrupts. When callers need a result, you can pass a callback object as one of the parameters to allow the actor to notify the caller. In some cases, it is desirable to block the caller until the actor processes the message.
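The queue-plus-dedicated-thread idea can be sketched with a single-threaded executor acting as the actor's inbox. This is a minimal illustration rather than one of the article's listings: the WordCountActor class and its method names are hypothetical, the lambdas assume Java 8 or later, and the executor happens to process messages in arrival order.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical actor for illustration; it is not one of the article's listings.
class WordCountActor {
    // The single-threaded executor serves as both the message queue and the dedicated thread.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // No synchronized or volatile needed: only the mailbox thread touches this field.
    private int words = 0;

    // Asynchronous message: the caller returns at once and the actor counts later.
    void add(String text) {
        mailbox.execute(() -> {
            String trimmed = text.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
        });
    }

    // Asynchronous message with a callback: the actor reports the total on its own thread.
    void total(Consumer<Integer> callback) {
        mailbox.execute(() -> callback.accept(words));
    }

    // Blocking message: the caller waits until the actor has reached and answered it.
    int totalBlocking() {
        try {
            return mailbox.submit(() -> words).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lets already queued messages finish, then stops the actor's thread.
    void shutdown() {
        mailbox.shutdown();
    }
}
```

A caller might send add("the quick brown fox") from any thread and later either pass a callback to total(...) or wait on totalBlocking(). The callback runs on the actor's thread, so it should do little work itself or forward the result to another actor.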
You can split the code calculation out of the storage actor in Listing 1 into a second actor,
as shown in Listing 2. In this way, the storage
actor calls an instance of HexCoderActor with itself as the callback. The storage actor does not
wait for the HexCoderActor to generate the hex code, but instead continues with other items in its queue.
This allows the storage actor's thread to specialize in writing the resulting XML file, while the text code is
calculated asynchronously in another thread. Notice how these classes can take advantage of concurrent threads without
any special keywords or deep knowledge of concurrent programming.
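As a rough sketch of this two-actor arrangement (not a reproduction of Listing 2), the example below has a storage actor pass itself as the callback to a hex coder actor. Several details are assumptions made for illustration: the CodeCallback interface, the method names, the use of an in-memory StringBuilder in place of the XML file, the choice of character values in hexadecimal as the "code", and the finish() method used to drain both queues. XML escaping is omitted.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical callback contract between the two actors.
interface CodeCallback {
    void coded(String word, String hexCode);
}

// Computes the hex code on its own thread and reports back through the callback.
class HexCoderActor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    void encode(String word, CodeCallback callback) {
        mailbox.execute(() -> {
            StringBuilder hex = new StringBuilder();
            for (char c : word.toCharArray()) {
                hex.append(Integer.toHexString(c));
            }
            callback.coded(word, hex.toString());
        });
    }

    // Stops accepting messages and waits for the queued ones to complete.
    void shutdownAndWait() throws InterruptedException {
        mailbox.shutdown();
        mailbox.awaitTermination(5, TimeUnit.SECONDS);
    }
}

// Owns the output; an in-memory StringBuilder stands in for the XML file.
class StorageActor implements CodeCallback {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private final HexCoderActor coder = new HexCoderActor();
    private final StringBuilder xml = new StringBuilder("<words>");

    // Asynchronous: hands each word to the coder and moves on without waiting.
    void store(String text) {
        mailbox.execute(() -> {
            for (String word : text.trim().split("\\s+")) {
                coder.encode(word, this);
            }
        });
    }

    // Runs on the coder's thread, so it only queues the write on this actor's own thread.
    @Override
    public void coded(String word, String hexCode) {
        mailbox.execute(() -> xml.append("<word code=\"").append(hexCode).append("\">")
                .append(word).append("</word>"));
    }

    // Blocking: drains both mailboxes in order and returns the finished document.
    String finish() {
        try {
            mailbox.submit(() -> { }).get();   // all store() messages have reached the coder
            coder.shutdownAndWait();           // all codes have been queued back to this actor
            Callable<String> seal = () -> xml.append("</words>").toString();
            String document = mailbox.submit(seal).get();
            mailbox.shutdown();
            return document;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The key design point is in coded(): the callback arrives on the coder's thread, so it does not touch the StringBuilder directly but re-queues the write onto the storage actor's own thread. That keeps each actor's state confined to a single thread without any locks.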
Every actor needs a manager to allocate and manage its thread. Each actor also needs a proxy to send messages to its
queue. Implementing a basic actor manager is straightforward.
Listing 3 shows such a manager written in Java 5. It uses Java's Proxy class to dynamically wrap an actor,
implementing all of the actor's interfaces. Every method call on the proxy is then queued in an
ExecutorService: void methods are asynchronous, and other method calls block until the executor has finished executing them and the result is available.
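The sketch below shows one way such a proxy-based manager could look. It is not the article's Listing 3: the ActorManager, Counter, and PlainCounter names are hypothetical, the code uses Java 8 lambdas rather than Java 5 syntax, and for brevity it wraps a single interface instead of all of the actor's interfaces. It relies only on java.lang.reflect.Proxy and a single-threaded ExecutorService.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical manager: one dedicated thread, and a proxy that queues every call on it.
class ActorManager {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> iface, T actor) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                (proxy, method, args) -> {
                    // Queue the real call so that it runs on the actor's thread.
                    Callable<Object> call = () -> method.invoke(actor, args);
                    Future<Object> pending = executor.submit(call);
                    if (method.getReturnType() == void.class) {
                        return null; // void methods are fire-and-forget
                    }
                    try {
                        return pending.get(); // other methods block for the result
                    } catch (ExecutionException e) {
                        // Unwrap so the caller sees the actor's own exception.
                        Throwable cause = e.getCause();
                        throw cause instanceof InvocationTargetException ? cause.getCause() : cause;
                    }
                });
    }

    void shutdown() {
        executor.shutdown();
    }
}

// Hypothetical actor interface and a deliberately non-thread-safe implementation.
interface Counter {
    void increment(); // asynchronous through the proxy
    int current();    // blocks through the proxy until the actor thread answers
}

class PlainCounter implements Counter {
    private int value;

    @Override
    public void increment() { value++; }

    @Override
    public int current() { return value; }
}
```

A caller would obtain the proxy with manager.wrap(Counter.class, new PlainCounter()) and use it like any Counter. Note that in this sketch an exception thrown by a void method is captured in the discarded Future and never reported, which is one reason the exception handling discussed next matters.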
Exception Handling and Worker Services
In every program, it is important to test and have proper exception handling. This becomes even more important with multi-threaded programming, because asynchronous execution quickly becomes difficult to debug. Because execution is not done sequentially, a sequential debugger is less useful. Similarly, stack traces are shorter and do not give caller details. In these situations, it is best to either have the actor handle exceptions itself or enable callbacks to handle both successful results and exceptions.
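One way to let callbacks handle both outcomes is to give the callback interface a success method and a failure method. The following is a minimal sketch under that assumption. ResultCallback, ParserActor, and RecordingCallback are hypothetical names chosen for illustration, and number parsing stands in for whatever work the actor really does.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical callback that receives either the result or the exception.
interface ResultCallback<T> {
    void onSuccess(T result);
    void onFailure(Exception error);
}

class ParserActor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Asynchronous: the outcome, good or bad, is delivered to the callback on the actor's thread.
    void parse(String text, ResultCallback<Integer> callback) {
        mailbox.execute(() -> {
            Integer value;
            try {
                value = Integer.valueOf(text.trim());
            } catch (RuntimeException e) {
                callback.onFailure(e); // report instead of losing the exception on this thread
                return;
            }
            callback.onSuccess(value);
        });
    }

    // Lets queued messages finish, then waits briefly for the thread to stop.
    void shutdownAndWait() {
        mailbox.shutdown();
        try {
            mailbox.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Example callback that records what happened, in order.
class RecordingCallback implements ResultCallback<Integer> {
    final List<String> events = new CopyOnWriteArrayList<>();

    @Override
    public void onSuccess(Integer result) {
        events.add("ok " + result);
    }

    @Override
    public void onFailure(Exception error) {
        events.add("failed " + error.getClass().getSimpleName());
    }
}
```

Because the failure travels through the callback, the caller learns about it even though the stack trace on the actor's thread no longer contains any of the caller's frames.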
You should also consider that calls to an actor carry some overhead compared to sequential calls. Each message must be queued and passed to a separate thread, and the compiler cannot optimize such calls in the same manner as sequential calls. This makes the actor model less applicable to smaller, faster objects that are better implemented as immutable or stateless. However, there are also advantages to running actors in a dedicated thread. Because an actor avoids the synchronized and volatile keywords, the processor's on-chip cache does not need to sync up with main memory as often, since the actor's thread is the only thread that can access its variables. Modern compilers can also observe that the head lock of the queue is only used from its actor thread and optimize it away, making it possible for actors to run without any interruption or mandatory memory flushing. Therefore, use actors for specialized worker services.
An example of worker services is an importing and indexing service. Consider the task of retrieving remote data,
processing it locally, and storing it into a local database. You might break this up into three steps:
- Retrieve data.
- Process data.
- Store result.
In this example, the remote data is not retrieved by a single connection, but rather in multiple files that are
listed in index files, mixed in with the data files. The remote data is in a format that you cannot process directly
and you need to pre-process or format it first. Furthermore, you need to convert the data because it uses a different
vocabulary. This creates six steps:
1. Retrieve index or data file.
2. Format the file for parsing.
3. Convert data.
4. If index, then list data files and go to step 1.
5. Process data files.
6. Insert data.
These six steps fit well into the actor model. Think of each of these steps as a job that one or more individuals (actors) need to perform.
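To make the mapping concrete, the six steps can be sketched as six small actors that pass messages down the line, with the index step feeding file names back to the first actor. Everything specific here is an assumption for illustration: the StepActor, RemoteFile, and ImportPipeline names, the in-memory map standing in for the remote source, the list standing in for the database, the ".idx" naming convention for index files, and the trivial formatting and vocabulary conversion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// One actor per step: a mailbox thread plus the job it performs on each message.
class StepActor<M> {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private final Consumer<M> job;
    StepActor(Consumer<M> job) { this.job = job; }
    void send(M message) { mailbox.execute(() -> job.accept(message)); }
    void shutdown() { mailbox.shutdown(); }
}

// Message handed from step to step: a file name and its content so far.
class RemoteFile {
    final String name;
    final String content;
    RemoteFile(String name, String content) { this.name = name; this.content = content; }
}

// Hypothetical pipeline wiring the six steps together.
class ImportPipeline {
    private final Map<String, String> remote; // stand-in for the remote source
    private final List<String> database = new CopyOnWriteArrayList<>(); // stand-in for the database
    private final AtomicInteger pending = new AtomicInteger(); // files still in the pipeline
    private final CountDownLatch done = new CountDownLatch(1);

    private final StepActor<String> retriever = new StepActor<>(this::retrieve);
    private final StepActor<RemoteFile> formatter = new StepActor<>(this::format);
    private final StepActor<RemoteFile> converter = new StepActor<>(this::convert);
    private final StepActor<RemoteFile> router = new StepActor<>(this::route);
    private final StepActor<RemoteFile> processor = new StepActor<>(this::process);
    private final StepActor<RemoteFile> inserter = new StepActor<>(this::insert);

    ImportPipeline(Map<String, String> remote) { this.remote = remote; }

    // Step 1: retrieve an index or data file.
    private void retrieve(String name) { formatter.send(new RemoteFile(name, remote.get(name))); }
    // Step 2: format the file for parsing (here: trim surrounding whitespace).
    private void format(RemoteFile f) { converter.send(new RemoteFile(f.name, f.content.trim())); }
    // Step 3: convert the vocabulary (here: a single word replacement).
    private void convert(RemoteFile f) {
        router.send(new RemoteFile(f.name, f.content.replace("colour", "color")));
    }
    // Step 4: an index lists more files, which go back to step 1; data files move on.
    private void route(RemoteFile f) {
        if (!f.name.endsWith(".idx")) { processor.send(f); return; }
        for (String listed : f.content.split("\\s+")) {
            pending.incrementAndGet();
            retriever.send(listed);
        }
        finished();
    }
    // Step 5: process the data file (here: tag the content with its file name).
    private void process(RemoteFile f) {
        inserter.send(new RemoteFile(f.name, f.name + ":" + f.content));
    }
    // Step 6: insert the result.
    private void insert(RemoteFile f) { database.add(f.content); finished(); }

    private void finished() { if (pending.decrementAndGet() == 0) done.countDown(); }

    // Starts from a root index, waits up to five seconds, then stops every actor.
    List<String> run(String rootIndex) {
        pending.incrementAndGet();
        retriever.send(rootIndex);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        retriever.shutdown();
        formatter.shutdown();
        converter.shutdown();
        router.shutdown();
        processor.shutdown();
        inserter.shutdown();
        return new ArrayList<>(database);
    }
}
```

Each step runs on its own thread, so a slow retrieval does not hold up formatting or inserting of files that have already arrived. The pending counter is only there so that run() knows when the last file has left the pipeline; a missing file in this sketch would simply leave run() waiting until its timeout.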