Avoiding the Perils of C++0x Data Races

Race conditions are an inherent part of parallel programming. A race condition exists any time a program’s behavior may depend on the relative ordering of events on separate threads. In the vast majority of cases, race conditions are harmless: the program works regardless of which thread gets a lock first, or which thread processes a chunk of data. In some cases, however, race conditions can cause problems.

The main danger with race conditions is that by their very nature they are timing dependent. A problem shows up only when one thread executes a particular piece of code at the same moment that another thread is executing a different, conflicting piece of code. If the pieces of code in question are very small (only one or two CPU instructions, for example) and run very rarely, then the race condition might not show up very often, and you may miss it entirely during testing. In fact, the conditions necessary for the problem to occur may not manifest at all during testing.

For example, if your test system has only one CPU, then threads cannot really execute in parallel; the operating system must interleave them. This lack of true concurrency means that some potential race condition problems just cannot occur. The same problem exists to a lesser extent if you test code on a system with a small number of CPUs (for example, on a dual-core desktop machine) when the problematic conditions can happen only with a higher level of parallelism (for example, on a 64-CPU server machine).

This article demonstrates how race conditions in general and C++0x data races in particular can cause real problems in parallel code and offers some tips for preventing them.

Locks Cannot Prevent Race Conditions

Protecting your data with a mutex lock does not guarantee that your code will be free from problematic race conditions, even if you obsessively ensure that the data is accessed only while the lock is held. If the lock is at the wrong level of granularity, or the scope of the lock is wrong, then problematic race conditions can still occur.

For example, consider a simple data structure that contains a list of items and a count of the items in the list. If you protect each part of the data structure with its own mutex, you can still get race conditions even though everything is nominally synchronized. Because the parts are protected with individual mutexes, you must update them separately. This means that at certain points you will have updated one and not the other (for example, you have added a new item to the list but have not yet updated the count). Thus, when another thread accesses the data structure, it will see the two parts of the data structure as out of sync with each other.

The solution in this case is obvious: use a single mutex to protect the entire data structure. In more complex cases, it can be much harder to identify scenarios where there may be a race condition, and eliminating the race condition may require more extensive changes, such as changes to the interface.
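
As a minimal sketch of the single-mutex approach, the hypothetical class below keeps a list and its count behind one lock. The names `counted_list`, `snapshot`, and `counts_stay_in_sync` are invented for illustration and do not come from any library.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical container: a list of items plus a separately stored count.
// One mutex guards both members, so no thread can observe them out of sync.
class counted_list
{
public:
    void add(int item)
    {
        std::lock_guard<std::mutex> guard(m_);
        items_.push_back(item);
        ++count_;   // updated under the same lock as the list
    }

    // Returns (stored count, actual list size), both read under one lock.
    std::pair<std::size_t, std::size_t> snapshot() const
    {
        std::lock_guard<std::mutex> guard(m_);
        return std::make_pair(count_, items_.size());
    }

private:
    mutable std::mutex m_;
    std::list<int> items_;
    std::size_t count_ = 0;
};

// Hypothetical driver: several threads add items concurrently, then the
// function reports whether the count and the list agree on the expected total.
bool counts_stay_in_sync(unsigned thread_count, unsigned items_per_thread)
{
    counted_list data;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t)
    {
        threads.push_back(std::thread([&data, items_per_thread] {
            for (unsigned c = 0; c < items_per_thread; ++c)
                data.add(static_cast<int>(c));
        }));
    }
    for (auto& th : threads)
        th.join();

    std::pair<std::size_t, std::size_t> s = data.snapshot();
    return s.first == s.second &&
           s.first == static_cast<std::size_t>(thread_count) * items_per_thread;
}
```

Calling `counts_stay_in_sync(4, 1000)` should return true. With two separate mutexes, a reader could take its snapshot between the `push_back` and the `++count_` and see the two values disagree.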

Data Races Are Always Dangerous

In C++0x, a “data race” is a particular kind of race condition where two threads both access the same non-atomic variable without synchronization, and at least one of those accesses is a write. A data race results in undefined behavior, so if you have a data race in C++0x then your program really could do anything at all.

It is a disturbingly common misconception that such data races are not problematic in practice. They are. In the absence of synchronization such as a mutex lock or atomic operations, compilers are free to optimize code such that variable accesses occur in a different order than the order written in your code. Not only that, but even if the instructions are generated in the sequence you expect, the actual memory accesses performed by the CPU may occur out of order. This is a particular issue with modern CPUs that have long instruction pipelines, branch prediction, and prefetching. For instance, the actual memory access for a load may occur several instructions prior to the load.
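
To make the ordering problem concrete, here is a small sketch of one thread handing a value to another. The names `payload`, `ready`, `producer`, `consumer`, and `run_handoff` are hypothetical. The flag is a `std::atomic<bool>` written with release ordering and read with acquire ordering, which is one way to get the synchronization described above. If `ready` were a plain `bool`, the program would contain a data race, and the compiler or CPU would be free to reorder the two writes or to read the flag only once.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread publishes a value, another waits for it.
int payload = 0;                    // plain, non-atomic data
std::atomic<bool> ready(false);     // atomic flag that orders access to payload

void producer()
{
    payload = 42;                                   // 1: write the data
    ready.store(true, std::memory_order_release);   // 2: publish it
}

int consumer()
{
    // Spin until the flag is set; the acquire load pairs with the release store.
    while (!ready.load(std::memory_order_acquire))
        std::this_thread::yield();
    return payload;   // the write in step 1 is visible here
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_handoff()
{
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Because the release store cannot be reordered before the write to `payload`, and the acquire load cannot be reordered after the read, `run_handoff()` returns 42.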

In order to demonstrate some of the problems with data races in C++0x, I wrote the following simple program:

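    #include <thread>
    #include <iostream>
    #include <vector>

    unsigned const increment_count=2000000;
    unsigned const thread_count=2;

    unsigned i=0;   // shared counter, deliberately unsynchronized

    void func()
    {
        for(unsigned c=0;c<increment_count;++c)
        {
            ++i;    // data race: unsynchronized read-modify-write
        }
    }

    int main()
    {
        std::vector<std::thread> threads;
        for(unsigned c=0;c<thread_count;++c)
        {
            threads.push_back(std::thread(func));
        }
        for(unsigned c=0;c<threads.size();++c)
        {
            threads[c].join();
        }
        std::cout<<thread_count<<" threads, Final i="<<i
                 <<", increments="<<(thread_count*increment_count)<<std::endl;
    }

The increment in func is essentially a single INC instruction on an x86 CPU. You might therefore expect the final value of the global variable i to equal the total number of increments performed by all threads (thread_count * increment_count). It does not. Without a LOCK prefix, an INC on a memory operand is not atomic with respect to other cores, so if you run this code on a multicore or multiprocessor system, the final value of i will often be much less than the number of increments.
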
To demonstrate this point, here is the output of five consecutive runs of this code on my dual-core x86 laptop:

    2 threads, Final i=2976075, increments=4000000
    2 threads, Final i=3097899, increments=4000000
    2 threads, Final i=4000000, increments=4000000
    2 threads, Final i=3441342, increments=4000000
    2 threads, Final i=2942251, increments=4000000

Because the code increments i 4,000,000 times (2,000,000 times on each thread), starting from zero, you might naively expect a final value of 4000000 every time. Only one of the five runs produces it; the others fall far short, because the non-atomic increments on the different threads interfere with each other.

On x86 architectures, non-atomic increment operations are just a simple memory read, followed by a simple memory write of the new value. If another thread updates the value between the read and the write, then that value will be overwritten. The consequences might be different on other architectures. For instance, you might get values that are some combination of the values written, or you might get a processor exception.
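
The lost update can be traced by hand. The sketch below is single-threaded and deterministic: it replays one unlucky interleaving of two increments step by step, with each simulated thread's register held in a local variable. The function name `simulate_lost_update` is invented for illustration.

```cpp
// Deterministic, single-threaded walk-through of a lost update.
// The two "threads" are simulated by hand so the interleaving is fixed.
unsigned simulate_lost_update()
{
    unsigned shared = 0;    // stands in for the global counter

    unsigned a = shared;    // thread A reads 0
    unsigned b = shared;    // thread B reads 0 before A writes back
    shared = a + 1;         // thread A writes 1
    shared = b + 1;         // thread B also writes 1, overwriting A's update

    return shared;          // 1, although two increments were performed
}
```

Two increments ran, yet the counter ends at 1. Each such collision in the real program costs one count, which is why the totals above come up short.
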

Now imagine that this global counter is a reference count for some resource, and that each thread decrements the counter the same number of times as it increments it. The intention is that the object is freed when the counter reaches zero. If some of the increments or decrements do not behave as expected, then this could easily lead to a memory leak or to the resource being freed too early. To demonstrate this behavior, I modified the program code:

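    #include <thread>
    #include <iostream>
    #include <vector>

    unsigned const increment_count=2000000;
    unsigned const thread_count=2;

    unsigned i=0;

    void func()
    {
        for(unsigned c=0;c<increment_count;++c)
        {
            // ... (loop body missing from the source listing)
        }
    }

    int main()
    {
        std::vector<std::thread> threads;
        for(unsigned c=0;c<thread_count;++c)
        // ... (the listing is truncated here)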
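
One way to keep such a reference count consistent is to make the counter atomic. The following is a minimal sketch of that idea and is not the article's modified listing: the names `ref_count` and `run_balanced_threads` are hypothetical, and starting the count at 1 (one owner holding the resource for the whole run) is an assumption made so that the count never touches zero while the threads are working.

```cpp
#include <atomic>
#include <thread>
#include <vector>

unsigned const increment_count = 2000000;
unsigned const thread_count = 2;

// Assumption: the counter is a std::atomic, so each update is an
// indivisible read-modify-write. It starts at 1 for the initial owner.
std::atomic<unsigned> ref_count(1);

void func()
{
    for (unsigned c = 0; c < increment_count; ++c)
    {
        ref_count.fetch_add(1);   // take a reference
        ref_count.fetch_sub(1);   // release it again
    }
}

// Runs the worker threads and returns the count left afterwards.
unsigned run_balanced_threads()
{
    std::vector<std::thread> threads;
    for (unsigned c = 0; c < thread_count; ++c)
        threads.push_back(std::thread(func));
    for (unsigned c = 0; c < threads.size(); ++c)
        threads[c].join();
    return ref_count.load();
}
```

Every increment is matched by a decrement and none can be lost, so `run_balanced_threads()` returns 1, the initial owner's reference. With a plain `unsigned` counter the result would be unpredictable, and the count could hit zero early or never return to its starting value.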

`About Our Journalist`

  Charlie Frank
  Charlie has over a decade of experience in website administration and technology management. As the site admin, he oversees all technical aspects of running a high-traffic online platform, ensuring optimal performance, security, and user experience.