Parallel programming has been around for decades, but unless you had access to special-purpose hardware, you've probably written mostly single-CPU applications. You could distribute truly intensive applications across a network, but doing so was a lot of work and involved a lot of overhead.
For several years, programming tools have allowed you to use multiple threads in the same program. Multiple threads may be able to run on multiple CPUs within your computer, so multi-threading sometimes provides nice speed improvements. Unfortunately, while multi-threading requires less overhead than distributing applications across a network, it's still fairly tricky.
Microsoft's new Task Parallel Library (TPL) provides a new approach for using multiple threads. It provides a set of relatively simple method calls that let you launch a group of routines all at once.
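To make "launch a group of routines all at once" concrete, here is a minimal sketch in C#. It assumes the API as it shipped in .NET 4, where the call is Parallel.Invoke in the System.Threading.Tasks namespace; the preview release may use different names. SumTo and RunAll are hypothetical stand-ins invented for this illustration, not part of TPL.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelInvokeSketch
{
    // Hypothetical stand-in for a CPU-bound routine: sums the integers 1..n.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    // Launches three independent routines and returns once all have finished.
    // Each routine writes to its own array slot, so no locking is needed.
    public static long[] RunAll()
    {
        long[] results = new long[3];
        Parallel.Invoke(
            () => results[0] = SumTo(1000),
            () => results[1] = SumTo(2000),
            () => results[2] = SumTo(3000));
        return results;
    }

    public static void Main()
    {
        long[] results = RunAll();
        Console.WriteLine(string.Join(", ", results));
    }
}
```

Parallel.Invoke blocks until every routine has completed, and the library decides how many of them actually run at the same time, so the same code works on a single-core or a multi-core machine.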
This article provides an introduction to TPL. It explains the main pieces of TPL and provides simple examples. A follow-up article will describe performance effects resulting from converting some of the programs I've written for other articles to use TPL.
Before you learn more about TPL, however, it's worth a few minutes to look at the future of computer hardware, so you'll know why you should care.
Multi-Core Background
For more than 40 years, the number of transistors that could fit on a chip has roughly doubled every two years, a trend dubbed "Moore's Law" after Intel cofounder Gordon Moore, who described it in his famous 1965 paper. Along with that doubling has come increased CPU speed.
Unfortunately, the techniques used to maintain this frenetic pace are starting to wear thin—literally too thin. Many past density increases depended on lithographic techniques that produce smaller and smaller chip components. The lithographic technique is much like using a slide projector to shine the image of the chip you want on a silicon wafer. The ultimate limit to how small you can make a feature using those techniques depends on the wavelength of the light you are using. Chip manufacturers are currently working with deep ultra-violet light, electron beams, and x-rays, which squeeze lithographic techniques to their limits, but the era of easy speed gains is rapidly approaching its end.
As the big performance gains from lithographic techniques come to an end in the next decade or so, researchers are turning to other methods for squeezing extra performance out of a computer. Some of these, such as optical computing and quantum computing, use radically different approaches to building computers. These approaches show great promise but are unlikely to produce practical commercial results for many years.
A more immediately practical approach is to use more CPUs at the same time. If you can't make a CPU twice as fast, perhaps you can divide your work across two CPUs. In fact, many new computers these days have multiple cores (multiple CPUs on a single chip). Dual-core and even quad-core systems are common, and chips with even more cores are in the works. Vendors have been working on 8-, 16-, and even 32-core systems.
IBM's Cell architecture allows for a whole host of relatively small CPUs scattered throughout your environment in electronic products such as computers, cell phones, televisions, game consoles, and just about anything else you can think of. In a few years, you may have dozens or even hundreds of small CPUs talking to each other through an ad hoc network in your living room.
Just having multiple CPUs sitting around will probably give you some performance improvements. Future operating systems will probably gain some benefit from multiple CPUs, and improved compilers may be able to find opportunities for parallelism for you automatically.
However, there's a limit to how much these tools can do for you transparently. An eight-core system may potentially have eight times as much CPU power, but if you don't pitch in and help, you're likely to see only a small improvement in your application's performance.
To really get the most out of your hardware, you're going to have to start incorporating parallelism into your applications. Toward that end, TPL gives you a new set of tools that will let you get started quickly and relatively easily.