Use Callbacks to Isolate Concurrency Bugs

One of the most common errors Java programmers make when they first learn multi-threaded programming is to misunderstand locks. They believe that locking an object prevents access to its fields and methods, when in fact a lock on an object serves only to prevent other threads from gaining the same lock. This confusion, while understandable, can lead to vexing concurrency bugs.
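The misunderstanding described above can be made concrete with a small sketch. The class and method names here (LockMisconceptionDemo, Counter, writeWhileLocked) are hypothetical and chosen only for illustration; the code assumes Java 8 or later for the lambda syntax. One thread holds the lock on an object while a second thread, which never asks for that lock, writes to the object's field anyway.

```java
import java.util.concurrent.CountDownLatch;

class LockMisconceptionDemo {
    // Hypothetical data holder with an unprotected field, for illustration only.
    static class Counter {
        int value = 0;
    }

    // Returns the value that an unsynchronized thread managed to write
    // while another thread was holding the lock on the same object.
    static int writeWhileLocked() {
        Counter counter = new Counter();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch writeDone = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (counter) {
                lockHeld.countDown();   // announce that the lock is now held
                await(writeDone);       // keep holding it until the write is done
            }
        });
        Thread intruder = new Thread(() -> {
            await(lockHeld);            // wait until the holder owns the lock
            counter.value = 42;         // no synchronized block, so nothing stops this
            writeDone.countDown();
        });

        holder.start();
        intruder.start();
        join(holder);
        join(intruder);
        return counter.value;
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Written while locked: " + writeWhileLocked());
    }
}
```

The holder cannot leave its synchronized block until the intruder has finished writing, so the write necessarily happens while the lock is held. The lock only excludes other threads that try to acquire the same lock.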

In fact, many concurrency bugs come down to particular data being accessed at the wrong time, usually while someone else is changing it. Concurrency models in general rely on a set of interconnected synchronization zones, spread out over a number of source files. Such designs are vulnerable to “concurrency rot”, in which the delicate relationships between different elements become hard to manage as code changes. Wouldn’t it be nice if you really could lock an object and keep it all to yourself while you used it?

This article describes a method for constructing high-load concurrent servers, which prevents concurrency rot. By restricting all data access to a callback mechanism, the server can contain all concurrency issues in a single place, making it much easier to see if concurrency constraints have been violated.

Callbacks for Single-Threaded Access

Follow this tutorial to design and build a server that uses callbacks for single-threaded access. Objects containing sensitive data will be accessed linearly; that is, they will be available to only one thread at a time.

The way you will enforce this is opposite from the way it is usually done. Normally, multiple threads attempt to gain access to a resource by competing for a lock. The thread that gets the lock has access until it releases the lock, at which point the next thread gets access.

In this design, however, you use a callback. You pass a callback object with a method called access() to a gatekeeper. The gatekeeper passes the sensitive data to this method of the callback object. When the callback object is done using the data, the gatekeeper passes the data to the next callback object.

In a sense, this design isn’t really different from the typical one: each object or thread gets a turn, and during its turn it has exclusive access. But instead of having the threads contend for control, the control is entirely in the hands of the gatekeeper object, which decides who gets access when.
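As a rough sketch of this inversion of control, the following shows the smallest gatekeeper that could work. It is not the implementation built later in this article: the names CallbackGateSketch, Callback, and Gate are hypothetical, the locking policy is plain mutual exclusion via synchronized, and the lambdas assume Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

class CallbackGateSketch {
    // Hypothetical callback type with a single access() method.
    interface Callback<T> {
        void access(T data);
    }

    // Hypothetical minimal gatekeeper: it owns the data and decides when
    // each callback gets its turn. Here the policy is plain mutual exclusion.
    static class Gate<T> {
        private final T data;

        Gate(T data) {
            this.data = data;
        }

        synchronized void use(Callback<T> callback) {
            callback.access(data);   // the data is handed over only for this call
        }
    }

    public static void main(String[] args) {
        Gate<List<String>> gate = new Gate<>(new ArrayList<>());
        gate.use(list -> list.add("first"));
        gate.use(list -> list.add("second"));
        gate.use(list -> System.out.println(list));   // prints [first, second]
    }
}
```

Client code never touches a lock. It only describes what it wants to do with the data, and the gate decides when that happens.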

Benefits of Callbacks

Implementing a callback-based system requires extra work, but it offers a number of benefits.

Normally, you must acquire certain locks before you can access certain data. Synchronized methods can make this mandatory, but they often aren’t enough because the accessing code still has to obey some kind of synchronization protocol. As a system gets larger, you are more and more liable to access the data at the wrong time. Also, just putting synchronization blocks around everything becomes more likely to result in deadlock.

A callback, on the other hand, cannot access the data until that data is passed to the method, and it cannot access the data after the method is finished. Thus, the period in which the data is available is precisely delineated.

Furthermore, this mechanism is controlled by the gatekeeper, which can implement any kind of ordering mechanism. In contrast, the traditional wait/notify method provides no way of knowing which threads will gain access at which times.
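One possible ordering policy, sketched here under assumptions, is roughly first-come, first-served. The sketch uses the standard library's ReentrantLock in fair mode, which favors the longest-waiting thread. The class name FairGateSketch and its methods are hypothetical, and a real gatekeeper could just as well use a priority queue or any other scheme.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

class FairGateSketch<T> {
    private final T data;
    // Assumption for this sketch: a fair lock stands in for
    // "any kind of ordering mechanism" chosen by the gatekeeper.
    private final ReentrantLock lock = new ReentrantLock(true);

    FairGateSketch(T data) {
        this.data = data;
    }

    void use(Consumer<T> callback) {
        lock.lock();                 // waiting threads are served longest-waiting first
        try {
            callback.accept(data);
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }

    public static void main(String[] args) {
        FairGateSketch<StringBuilder> gate = new FairGateSketch<>(new StringBuilder());
        gate.use(sb -> sb.append("a"));
        gate.use(sb -> sb.append("b"));
        gate.use(sb -> System.out.println(sb));   // prints ab
    }
}
```

Because the policy lives in one class, swapping the fair lock for a different ordering scheme would not touch any client code.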

(Of course, a callback method is free to squirrel away a reference to the sensitive data, or pass it to another object or thread. This would certainly violate the linearity guarantee, but it is not the kind of thing you are liable to do accidentally.)

The Sum Class

To initiate your design, create the sensitive data object first. Use a very simple data object called a Sum:

public class Sum
{
  public int a, b, c;

  // ...
}

A Sum stores three numbers: a, b, and c, where c = a + b. However, these variables are public, which means someone could modify them so that c is no longer equal to a + b. This is the problem you want to avoid.

Of course, you easily could protect c by hiding it behind a synchronized access method, but as mentioned previously, this article’s purpose is to explore an alternative method of data protection, one you would use when regular synchronization is either difficult or undesirable. Sum is a trivial object, but the complexity of the vulnerable data isn’t the focus here. Rather, the vulnerability itself is.
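For comparison, the conventional approach mentioned above might look like the following sketch. SynchronizedSum and its methods are hypothetical names, not part of this article's code. The fields become private, and every method that touches them is synchronized.

```java
class SynchronizedSum {
    private int a, b, c;

    // Each mutator updates c in the same synchronized step, so no caller
    // can observe a state in which c differs from a + b.
    public synchronized void addToA(int delta) {
        a += delta;
        c = a + b;
    }

    public synchronized void addToB(int delta) {
        b += delta;
        c = a + b;
    }

    public synchronized int getC() {
        return c;
    }

    public synchronized boolean isConsistent() {
        return c == a + b;
    }

    public static void main(String[] args) {
        SynchronizedSum sum = new SynchronizedSum();
        sum.addToA(5);
        sum.addToB(7);
        System.out.println(sum.getC() + " " + sum.isConsistent());   // prints 12 true
    }
}
```

This works well for an object as small as Sum. The callback approach is aimed at cases where the protected operation spans several steps or objects and cannot be folded into one synchronized method.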

The GateKeeper Class

All access to the Sum object will go through the GateKeeper class. A GateKeeper controls access to the object that is passed to its constructor:

  GateKeeper gk = new GateKeeper( new Sum() );

To use the object hidden inside the GateKeeper, you must pass a callback to the GateKeeper’s use() method:

    gk.use( user );

The User, Accessor, and Mutator Interfaces

Before Java 8, Java had no first-class functions or lambdas, so this design does not use bare function callbacks. However, you can get close to real callbacks by using interfaces.

The Accessor interface describes an object that wants to read a piece of data:

public interface Accessor extends User
{
  public void access( Object o );
}

Similarly, the Mutator interface describes an object that wants to write to a piece of data:

public interface Mutator extends User
{
  public void mutate( Object o );
}

You also need to define an empty interface called User. Objects that implement User are either Accessors or Mutators:

public interface User
{
}

For good measure, you also have the interface MutatingAccessor, which extends both Accessor and Mutator. This isn’t strictly necessary, but it makes your object declarations neater:

public interface MutatingAccessor extends Accessor, Mutator
{
}
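To see how a client might implement these interfaces, here is a small sketch. The interfaces and Sum are copied from the article and nested in a wrapper class so that the example compiles as a single file. The Incrementer class and the wrapper name InterfaceUsageSketch are hypothetical. The callbacks are invoked directly here, standing in for the gatekeeper.

```java
class InterfaceUsageSketch {
    // The interfaces and Sum as given in the article, nested here so the
    // example compiles as a single file.
    interface User {}
    interface Accessor extends User { void access(Object o); }
    interface Mutator extends User { void mutate(Object o); }
    interface MutatingAccessor extends Accessor, Mutator {}

    static class Sum { public int a, b, c; }

    // Hypothetical client: writes through mutate(), reads through access().
    static class Incrementer implements MutatingAccessor {
        int lastSeen;

        @Override
        public void mutate(Object o) {
            Sum sum = (Sum) o;
            sum.a += 1;
            sum.c = sum.a + sum.b;   // restore the invariant before returning
        }

        @Override
        public void access(Object o) {
            Sum sum = (Sum) o;
            lastSeen = sum.c;        // read only
        }
    }

    public static void main(String[] args) {
        Incrementer user = new Incrementer();
        Sum sum = new Sum();
        // Stand-in for the gatekeeper: call the callbacks directly.
        user.mutate(sum);
        user.access(sum);
        System.out.println(user.lastSeen);   // prints 1
    }
}
```

Because Incrementer implements MutatingAccessor, an instanceof check against either Accessor or Mutator succeeds, which is what lets a gatekeeper decide how to treat it.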

Implementing use()

Take a closer look at the use() method of the GateKeeper class, as this is where the action is. Note that the parameter to use() is of type User, which means that you can pass in an object that is an Accessor, a Mutator, or both:

  public void use( User user ) {

Since the user could be both a Mutator and an Accessor, you need to figure out which aspect of it you want to take care of first. For this tutorial, mutate first:

    if (user instanceof Mutator) {
      Mutator mutator = (Mutator)user;

You know now that the user is a Mutator. But before you let the Mutator do its mutating, you need to acquire a lock. In fact, you will wrap the mutation activity inside a lock/unlock pair:

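      // Acquire the lock before entering try, so that finally never
      // releases a lock that was not obtained.
      rwlock.getWriteLock();         // LOCK
      try {
        mutator.mutate( o );
      } finally {
        rwlock.releaseWriteLock();   // UNLOCK
      }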

You see here that, deep down, the callback method isn’t fundamentally different from the traditional lock/access/unlock pattern. However, the locking is done entirely by the gatekeeper. This has a number of significant benefits:

  1. The structure of the locking and unlocking is very clear.
  2. The GateKeeper can determine the locking policy.
  3. The client code is much simpler.
  4. The locking and unlocking happens in only one place.

For high-availability servers, the fourth benefit is perhaps the most important. Normally, every client of a multi-threaded data structure takes care of locking and unlocking, and so each one can potentially lock a resource and forget to unlock it. As a programmer, you try to avoid this, but it still happens sometimes. If “sometimes” is too often for your highly available server, then you must provide a stronger guarantee.

The code fragment above uses a finally block to ensure that the resource is unlocked no matter what happens.
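The effect of the finally block can be checked with a short sketch. It assumes the standard library's ReentrantReadWriteLock in place of this article's rwlock object, and the names FinallyUnlockSketch, withWriteLock, and lockReleasedAfterFailure are hypothetical. A callback that throws still leaves the lock free for the next caller.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FinallyUnlockSketch {
    // Runs body under the write lock and always releases the lock afterwards.
    static void withWriteLock(ReentrantReadWriteLock lock, Runnable body) {
        lock.writeLock().lock();
        try {
            body.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns true if the lock is free again after a callback that throws.
    static boolean lockReleasedAfterFailure() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        try {
            withWriteLock(lock, () -> {
                throw new IllegalStateException("callback failed");
            });
        } catch (IllegalStateException expected) {
            // The exception still propagates to the caller of withWriteLock.
        }
        return !lock.isWriteLocked();
    }

    public static void main(String[] args) {
        System.out.println("released: " + lockReleasedAfterFailure());   // prints released: true
    }
}
```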

Now that you have let the Mutator mutate, you should allow the Accessor to access. You allowed the Mutator to change the data, but allow the Accessor only to read it:

    if (user instanceof Accessor) {
      Accessor accessor = (Accessor)user;

This code is very similar to the mutation code above. The difference is that you need only a read lock, not a write lock:

      // As with the write lock, acquire before entering try.
      rwlock.getReadLock();
      try {
        accessor.access( o );
      } finally {
        rwlock.releaseReadLock();
      }

Since a read lock is not exclusive, this configuration allows many readers to read the data simultaneously. At the same time, a thread that wants to write to the data has absolutely exclusive access.
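Putting the fragments together, a complete use() method might look like the sketch below. One substitution is an assumption: the article's rwlock object is replaced with java.util.concurrent.locks.ReentrantReadWriteLock, so the lock and unlock calls have different names. The wrapper class GateKeeperSketch is hypothetical and exists only to keep the example in one file.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class GateKeeperSketch {
    interface User {}
    interface Accessor extends User { void access(Object o); }
    interface Mutator extends User { void mutate(Object o); }
    interface MutatingAccessor extends Accessor, Mutator {}

    static class Sum { public int a, b, c; }

    static class GateKeeper {
        private final Object o;
        // Assumption: the standard library lock replaces the article's rwlock.
        private final ReadWriteLock rwlock = new ReentrantReadWriteLock();

        GateKeeper(Object o) {
            this.o = o;
        }

        public void use(User user) {
            if (user instanceof Mutator) {
                rwlock.writeLock().lock();       // exclusive
                try {
                    ((Mutator) user).mutate(o);
                } finally {
                    rwlock.writeLock().unlock();
                }
            }
            if (user instanceof Accessor) {
                rwlock.readLock().lock();        // shared with other readers
                try {
                    ((Accessor) user).access(o);
                } finally {
                    rwlock.readLock().unlock();
                }
            }
        }
    }

    public static void main(String[] args) {
        GateKeeper gk = new GateKeeper(new Sum());
        StringBuilder log = new StringBuilder();
        gk.use(new MutatingAccessor() {
            public void mutate(Object o) {
                Sum s = (Sum) o;
                s.a = 2;
                s.b = 3;
                s.c = s.a + s.b;
                log.append("mutate;");
            }
            public void access(Object o) {
                log.append("access c=").append(((Sum) o).c);
            }
        });
        System.out.println(log);   // prints mutate;access c=5
    }
}
```

Note that the write lock is released before the read lock is taken, so another writer can run between the two phases of a MutatingAccessor. Each callback still sees consistent data, but the access() call is not guaranteed to see the state left by its own mutate().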

The Pound Class (Testing the GateKeeper)

The GateKeeper is really the heart of this synchronization method, and the use() method is the heart of the GateKeeper. The use() method is fairly short and simple; it looks like it does what it’s supposed to. But you still should test it thoroughly.

The Pound class does this for you. As its name suggests, Pound “pounds” on the GateKeeper, using it as fast as possible with lots of threads. Inside the GateKeeper, of course, is a Sum object.

You run Pound like this:

% java Pound 20

This creates 20 Pound objects, running in 20 threads. Each Pound object takes its turn modifying and verifying the Sum object.

Since a Pound wants access to the Sum object, it must go through the gatekeeper, which means it is an Accessor and Mutator. Each time through its main loop, a Pound object flips a coin and does either an access (read) or a mutation (write). Here is its mutate() method:

  public void mutate( Object o ) {
    Sum sum = (Sum)o;

First, you change either a or b:

    // Change a or b.
    int delta = rand.nextInt( 2000 ) - 1000;
    if (rand.nextInt( 2 )==0) {
      sum.a += delta;
    } else {
      sum.b += delta;
    }

At this point, the data is in an inconsistent state: c does not equal a + b. This is precisely the time when you do not want anyone looking at the data.

Just to prove a point, do a yield() and a sleep(), allowing other threads to run. You want to make sure that your code works because it is correct, not because other threads didn’t get a chance:

    // The better to stress the thread-safety of the system.
    Thread.yield();
    try { Thread.sleep( 20 ); } catch( InterruptedException ie ) {}

After your daring pause, correct your inconsistency and continue:

    // Make sum correct again.
    sum.c = sum.a + sum.b;

    // Report.
    checkAndReport( sum, "mutate" );
    pause();
  }

Once the call to mutate() ends, other threads will get a chance to run. Your access() method is much simpler:

  public void access( Object o ) {
    Sum sum = (Sum)o;

    // Just check the sum.
    checkAndReport( sum, "access" );
    pause();
  }

The only thing to do is output the values and verify that c does in fact equal a + b. If all is working correctly, it does.

You can easily test the code by running Pound with a good number of threads (say, 10 or 20) and letting it run for a while. It should report no errors.
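The full Pound source is not reproduced here, but a condensed stress test in the same spirit might look like the following sketch. Everything in it is an assumption made for illustration. PoundSketch, Gate, and pound() are hypothetical names, the gate uses the standard library's ReentrantReadWriteLock, each thread runs a fixed number of iterations instead of looping forever, and the 20 ms sleep is dropped in favor of a yield so that the test finishes quickly.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

class PoundSketch {
    static class Sum { int a, b, c; }

    // Hypothetical stand-in for the gatekeeper: write lock for mutation,
    // read lock for access, both released in finally.
    static class Gate {
        private final Sum sum = new Sum();
        private final ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock();

        void mutate(Consumer<Sum> mutator) {
            rwlock.writeLock().lock();
            try { mutator.accept(sum); } finally { rwlock.writeLock().unlock(); }
        }

        void access(Consumer<Sum> accessor) {
            rwlock.readLock().lock();
            try { accessor.accept(sum); } finally { rwlock.readLock().unlock(); }
        }
    }

    // Starts the given number of threads, each doing a fixed number of random
    // reads and writes, and returns how often a reader saw c != a + b.
    static int pound(int threads, int iterations) {
        Gate gate = new Gate();
        AtomicInteger violations = new AtomicInteger();
        Runnable worker = () -> {
            ThreadLocalRandom rand = ThreadLocalRandom.current();
            for (int i = 0; i < iterations; i++) {
                if (rand.nextBoolean()) {
                    gate.mutate(sum -> {
                        sum.a += rand.nextInt(2000) - 1000;
                        Thread.yield();            // pause while c is stale
                        sum.c = sum.a + sum.b;     // make the sum correct again
                    });
                } else {
                    gate.access(sum -> {
                        if (sum.c != sum.a + sum.b) violations.incrementAndGet();
                    });
                }
            }
        };
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(worker);
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return violations.get();
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 20;
        System.out.println("violations: " + pound(threads, 500));
    }
}
```

If the locking is correct, the reported violation count should be zero no matter how many threads you start. Temporarily swapping the read lock for no lock at all is a quick way to convince yourself that the test can fail.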

Tracking How Many Threads Are Using An Object

I developed this technique because I was struggling with Java’s garbage collector. I faced a situation where I was running out of memory, but the memory was allocated in native code, so the garbage collector was not running. I had huge objects that I wanted collected, but Java didn’t know they were huge, so it didn’t bother.

What I really needed was a way to find out when a particular object was no longer being used, but this is something that Java hides from you. The whole point of the garbage collector is to take care of such objects automatically.

Inspired by the idea of linear variables and monadic state, I decided that what I needed was a single-threaded or linear object: one that would be accessed by only one thread at a time (although I stretched this idea a bit to allow multiple readers). By using this structure, I was able to more precisely control and track the number of threads using an object, which was exactly what I needed to know when an object was no longer in use.
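The article does not show how the tracking was done, so the following is only a guess at one way it could work. Since every use of the object passes through the gatekeeper, a counter that is incremented on entry and decremented in finally tells you how many threads are inside at any moment. The names CountingGateSketch, read(), write(), and activeUsers() are hypothetical, and the lock is the standard library's ReentrantReadWriteLock.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

class CountingGateSketch<T> {
    private final T data;
    private final ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock();
    // Number of callbacks currently running against the data.
    private final AtomicInteger activeUsers = new AtomicInteger();

    CountingGateSketch(T data) {
        this.data = data;
    }

    void read(Consumer<T> reader) {
        rwlock.readLock().lock();
        activeUsers.incrementAndGet();
        try {
            reader.accept(data);
        } finally {
            activeUsers.decrementAndGet();
            rwlock.readLock().unlock();
        }
    }

    void write(Consumer<T> writer) {
        rwlock.writeLock().lock();
        activeUsers.incrementAndGet();
        try {
            writer.accept(data);
        } finally {
            activeUsers.decrementAndGet();
            rwlock.writeLock().unlock();
        }
    }

    int activeUsers() {
        return activeUsers.get();
    }

    public static void main(String[] args) {
        CountingGateSketch<String> gate = new CountingGateSketch<>("payload");
        gate.read(d -> System.out.println("during: " + gate.activeUsers()));   // prints during: 1
        System.out.println("after: " + gate.activeUsers());                     // prints after: 0
    }
}
```

A count of zero means no callback is currently running. It says nothing about references that a callback may have stashed away, so it is only as trustworthy as the callbacks themselves.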
