Remember the red telephone box, once a familiar sight on the streets of London? That's a good example of mutually exclusive access to a shared resource, although you probably didn't find any locks on them. Why? Because only one person at a time could use one to make a call, and civil persons would not listen to a stranger's conversation while waiting outside.
Unfortunately, there are no guarantees that programs will be equally civil, so wise programmers use semaphores to keep processes from running amok and leaving shared resources, such as files and I/O devices, in inconsistent states.
Mutual exclusion locks (also called "mutex locks," or simply locks or mutexes) are a special kind of semaphore. Each protects a single shared resource, qualifying it as a binary semaphore. Concurrent programs use locks to guarantee consistent communication among threads through shared variables or data structures. A piece of program code protected by a mutex lock is called a "critical section."
On today's prevalent multi-core systems, mutex locks are often implemented using an indivisible "test-and-set" instruction. Although generally deemed efficient, relying on an indivisible test-and-set instruction incurs a few hidden performance penalties. First, execution of such an instruction requires memory access, so it interferes with other cores' progress—especially when the instruction is in a tight loop. The effect may be felt even more acutely on systems with a shared memory bus. Another penalty stems from cache coherency. Because the cache line containing a lock object is shared among cores, one thread's update to the lock invalidates the copies on the other cores. Each subsequent test of the lock on other cores triggers fetching the cache line. A related penalty is false sharing, where an unrelated write to another part of the cache line invalidates the whole cache line. Even if the lock remains unchanged, the cache line must be fetched to test the lock on a different core.
Given all these problems, one might wonder, "Why use locks at all? What are the alternatives?" One extreme alternative is to give up on communicating through shared variables and adopt the mantra of "no sharing." That involves replicating data and communicating via message passing. Unfortunately, the cost of replication and message passing is even greater than the overhead associated with locks on today's multi-core shared-memory architectures.
Another approach that has been actively pursued recently as an alternative to mutex locks is lock-free/non-blocking algorithms. Researchers have reported some isolated successes in designing practical non-blocking implementations. Nonetheless, non-blocking algorithms are hardly a "holy grail." Designing efficient non-blocking data structures remains difficult, and the promised performance gain has been elusive at best. You'll see more about non-blocking algorithms at the end of this article.
With no proven better alternatives at present, it makes sense to make the most of mutex locks until they are no longer necessary. This article discusses some experiences with mutex locks in developing multi-threaded concurrent applications, using the mutex locks provided in Intel® Threading Building Blocks as examples.