Intel Go Parallel

Sutter Speaks: The Future of Concurrency
What does the future hold for concurrency? What will happen to the tools and techniques around concurrent programming? In part two of our series, concurrency guru Herb Sutter talks about these issues and what developers need to be reading to understand concurrency. 

In part one of our interview with the concurrency whisperer, Herb Sutter discussed concurrent vs. parallel, migration and scalability of applications. In this final installment he looks into his crystal ball with an eye towards the future and gives developers hints for the resources they need to be better concurrent programmers.

Threading is the ground floor. But if a "Moore's Law for Cores" takes off as predicted and we get a doubling of cores every few years, threading won't be sufficient to improve performance, right?

This reminds me of Charles Petzold's books from the late '80s and early '90s. People were writing GUI apps successfully by hand, so to speak, following his guidelines. As libraries and tools evolved and shipped, it all became more and more accessible to the mainstream. People today are writing Petzold-like threading.

The tools today won't help. Version one of Microsoft Foundation Classes was to GUIs what TBB is for concurrency. The tools are going to evolve too, and the market will decide. In GUIs there wasn't any one winner. Likewise, in five years there will be a shakeout in concurrency tools but not just one winner.

Can you speak about changes internal to Microsoft tools that take advantage of multi-core chips, such as Visual C++ turning on a switch to do parallel builds?

Products to do parallel builds have existed for a while and those features are increasingly in the box. That's just part of evolving tools, and an example of apps becoming multi-core-enabled. The more we can give developers tools and libraries that are internally concurrent and that they can use without having to deal with concurrency directly, the more benefit they'll have.

How is concurrency being addressed in the next version of C++?

C++ is doing something very similar to what Java is doing: A state-of-the-art memory model, an atomics library for lock-free code, and threads and locks.
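As a rough illustration of the three pieces named above, here is a minimal sketch using the facilities as they later appeared in the C++11 standard library (`std::thread`, `std::atomic`, `std::mutex`). The function names `count_with_atomic` and `count_with_mutex` are hypothetical and exist only for this example; they do not come from the interview.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each add 1 to a shared counter
// `per_thread` times using a lock-free atomic increment.
long count_with_atomic(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Relaxed ordering suffices here: only the final total is
                // read, and join() below synchronizes with each worker.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Hypothetical helper: the same work, with a plain counter guarded by a lock.
long count_with_mutex(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, &m, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> guard(m);  // released at scope exit
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Both helpers return `threads * per_thread`. The atomic version never blocks, while the mutex version serializes the increments; which one is preferable depends on how much work sits inside the critical section.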

Do you see a lot of interest in and usage of transactional memory, or is the concept too difficult for most developers to grasp?

It's not yet possible to answer who's using it because it hasn't been brought to market yet. Intel has a software transactional memory compiler prototype. But if the question is "Is it too hard for developers to use?" the answer is that I certainly hope not. The whole point is it's way easier than locks. It is the only major thing on the research horizon that holds out hope of greatly reducing our use of locks. It will never replace locks completely, but it's our only big hope of replacing them partially.

There are some limitations. In particular, some I/O is inherently not transactional—you can't take an atomic block that prompts the user for his name and read the name from the console, and just automatically abort and retry the block if it conflicts with another transaction; the user can tell the difference if you prompt him twice. Transactional memory is great for stuff that is only touching memory, though.

Every major hardware and software vendor I know of has multiple transactional memory tools in R&D. There are conferences and academic papers on theoretical answers to basic questions. We're not at the Model T stage yet where we can ship it out. You'll probably see early, limited prototypes where you can't do unbounded transactional memory—where you can only read and write, say, 100 memory locations. That's still very useful for enabling more lock-free algorithms, though.
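The abort-and-retry pattern described above can be seen in miniature in a compare-and-swap loop, which behaves roughly like a transaction over a single memory location: read, compute, and commit only if nothing changed in the meantime. This is not transactional memory, only a sketch of the retry idea using `std::atomic` from C++11; the names `atomic_store_max` and `max_across_threads` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomically replace `value` with max(value, candidate).
void atomic_store_max(std::atomic<int>& value, int candidate) {
    int observed = value.load();
    // "Commit" only if `value` still equals what we observed. On failure,
    // compare_exchange_weak reloads `observed` and the loop retries, unless
    // another thread has already stored something at least as large.
    while (observed < candidate &&
           !value.compare_exchange_weak(observed, candidate)) {
        // Retry with the refreshed `observed`.
    }
}

// Hypothetical driver: thread t proposes the value t + 1; the largest wins.
int max_across_threads(int threads) {
    assert(threads >= 0);
    std::atomic<int> best{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&best, t] { atomic_store_max(best, t + 1); });
    }
    for (auto& w : workers) w.join();
    return best.load();
}
```

The retry is safe here because the loop body only touches memory. If it also prompted a user, each retry would repeat the prompt, which is the I/O limitation mentioned earlier.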

What is the future of multi-core architecture? Will chips comprise many homogeneous cores, or will different instruction sets be mixed together?

On the one extreme, you have the simplicity of many identical cores with a single monolithic memory. Then on the other, there's the Cell Processor, which has one general-purpose core and eight special cores with their own instruction sets. They don't even all touch the same memory, and the programmer has to explicitly move memory around from core to core. In between those two extremes, there's a big range.

You could imagine special-purpose cores where memory management is automated. You might look at having heterogeneous cores where the only difference is complexity and speed: 32 cores with the x86 instruction set, but some are faster than others. Some big out-of-order cores that are faster, or simpler in-order cores, all on the same chip.

This option also lets you have a nice balance between running existing applications just as fast on the big cores, but also running many-core apps that have more and more concurrency on lots of little cores.

So, there are at least those four major places: one, all homogeneous, which is where multi-core is today on the desktop; two, some faster, some slower; three, different cores and instruction sets but with shared memory; and four, the Cell example, where you not only have to know where your memory is but where the cores are.

Is this the most attention you've ever paid to chips and instruction sets in your career?

Yes.

Is the move to concurrency easier for C++ developers?

Maybe in some ways, but we all will need to learn concurrency to some degree. Most C++ programmers already care about performance and are used to getting close to the hardware, and many have been doing multithreading already.

What has this process shown you about the differences between hardware and software people?

It's always been true that software people have needed to know something about the hardware, but they generally think of "the hardware" as this monolithic thing below them. They don't know about things like instruction reordering or whether caches are associative and in what way. Most software developers don't know those details of what their billion transistors are doing and don't need to be exposed to that deep complexity.

Hardware people tend to talk about "the software" as this monolithic thing above them. They're not always aware that an application is an extremely complex system made up of interconnected parts written by different people at different times, and that these modules link in dynamically, so that the combination might not be testable. Say you download Firefox and two plugins: your grandma might be the very first person ever to have downloaded that exact combination of software on her machine. Hardware-oriented folks are sometimes surprised at the consequences of that for concurrency, especially because many of today's concurrency primitives, such as locks, aren't composable. That means it is hard to take two independently authored libraries, modules, or plugins that are individually correct and know that the combination is still correct and won't deadlock, for example.
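To make the composability problem concrete, here is a minimal sketch, assuming a hypothetical `Account` type that guards its balance with its own lock. Each account is correct in isolation, but a transfer that naively locks one account and then the other can deadlock when two threads transfer in opposite directions. The version below avoids that with `std::lock` from C++11, which acquires both mutexes using a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical type for illustration: an account protected by its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// A naive transfer would lock from.m and then to.m. If one thread runs
// transfer(a, b) while another runs transfer(b, a), each can end up holding
// one lock while waiting forever for the other.
//
// This version acquires both locks together, so lock order no longer matters.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);  // locking the same mutex twice is undefined
    std::lock(from.m, to.m);
    std::lock_guard<std::mutex> hold_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Hypothetical driver: two threads transfer in opposite directions and the
// combined balance is returned once both have finished.
long run_opposing_transfers(int rounds) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

Note that the fix requires `transfer` to know about both locks at once. Code that only sees each account through an opaque, individually locked interface cannot apply it, which is the sense in which locks do not compose.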

What should developers be reading to understand concurrency?

Well, they can go to my blog. Tim Mattson wrote a book on parallel patterns [Patterns for Parallel Programming by Mattson, Beverly Sanders and Berna Massingill, Addison-Wesley, 2004]. Doug Lea has a wonderful book on Java threading [Java Concurrency in Practice by Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes and Doug Lea, Addison-Wesley, 2006]. David Butenhof wrote Programming with POSIX Threads [Addison-Wesley, 1997]. Joe Duffy is writing something similar for Windows threads [in press].

We're just in the first year or two of multi-core. It's time to bring it into the mainstream with new products, libraries, conferences, books, and magazine articles. Over the next few years, those will help to settle the question of whether people are understanding concurrency by turning the answer into "Yes, they are, and they are succeeding with ever-increasing scalability and reliability."
