December 10, 2007
Recently, prominent members of the development community have published articles discussing the need for threading in applications. The problem is that we are reaching the limits of what hardware engineers can do to increase processor speed. Years ago, when processors were relatively slow and disks were small, developers spent time optimizing their applications for both speed and space. Now that processors are fast and disks are large, most developers rely on the hardware to improve performance and worry little about disk space. Unfortunately, the way hardware advances is changing. Processors used to simply get faster with each generation; that is no longer happening. Because hardware vendors can no longer rely on their standard techniques for improving base processor performance, clock speeds are essentially as fast as they are going to get until there is a major breakthrough in technology. Instead, vendors are adding more cores to their CPUs or adding multiple CPUs to a single machine. Developers need to go back to optimizing their software to take advantage of this new hardware. That means introducing threading into their software so that more instructions can be executed at the same time.
Do All Developers Need to Understand Threading?
My experience is that most developers do not understand threading, and that most don't believe they need to. There are good reasons for this. These days, most developers are working on web applications, and when you are writing a web application in .NET or Java, the platform takes care of the threading for you. The application server creates threads and hands each request off to a single thread. The developer is responsible for writing the code that a single thread will execute. Therefore, instead of writing code that uses multiple threads, the developer writes thread-safe code. Writing thread-safe code is far easier than understanding what an application is doing and writing efficient code that takes advantage of the full power of the computer. Developers who are accustomed to writing thread-safe code produce code that is ready to run in a multithreaded environment. Many developers get by writing code that is executed by one thread at a time, but in my opinion that is a naive view of what software developers need to do to be successful. Even if not every developer needs to fully understand multithreading, it will be highly beneficial going forward if all developers adopt the practice of writing solidly thread-safe code in all situations (server side, client side, business logic, algorithmic computations, etc.).
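To make "thread-safe code" concrete, here is a minimal sketch in Python. It is an illustration rather than anything prescribed by the platforms mentioned above: `SentCounter`, `handle_request`, and `run_demo` are hypothetical names, and the threads started in `run_demo` merely stand in for the request threads an application server would create.

```python
import threading


class SentCounter:
    """Counts sent emails; safe to call from many threads at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # The read-modify-write on _count is guarded by the lock so that
        # two threads cannot interleave and lose an update.
        with self._lock:
            self._count += 1

    @property
    def value(self):
        with self._lock:
            return self._count


def handle_request(counter, emails_in_request):
    # Hypothetical stand-in for the code a server runs on one request thread.
    for _ in range(emails_in_request):
        counter.increment()


def run_demo(num_threads=8, emails_per_thread=1000):
    # Assumption: these threads play the role of server-created request threads.
    counter = SentCounter()
    threads = [
        threading.Thread(target=handle_request, args=(counter, emails_per_thread))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

The developer writing `handle_request` never creates a thread; the only obligation is that any state shared between requests is protected, which is what the lock inside `SentCounter` does.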
Pipeline Processing Example
Experimenting with Threading
In an ideal world, you will achieve the best performance if the number of threads in an application is equal to the number of processors or cores that you have available. However, we all know that programs don't run in an ideal world. In the real world, performance of applications often has less to do with the performance of the CPU and more to do with how often the application needs to wait for I/O. With this in mind, let's look at how we can organize the email sending application with threads.
First, some assumptions and requirements for our sample application. The application has a number of email templates and each user may get one email for each type of template. For simplicity, we will assume that there is one database that contains a series of email templates and user accounts. We will also assume that there is one query to retrieve a list of email templates and another to retrieve a list of users that should receive an email given a specific template type.
How many different ways could we organize this work? Before starting, we need to determine if the tasks can be executed in parallel. If each task must be done in order, and the data items themselves must be processed in a specific order, then threading is not an option. In this case, the order in which the emails for individual users are processed doesn't really matter. Given the nature of the task, we can definitely parallelize creating the emails. We will discuss whether we can parallelize sending the emails later.
Now we need to determine how best to organize our blocks of work. Based on the requirements, because each user receives one email per template type, it makes the most sense to group the work by template type instead of trying to separate the users into arbitrary groups. Now that we know we will break the work into components around email templates, how should we organize the work? There are at least two reasonable solutions that we need to investigate.
The first option (see Figure 1) is to get a list of email templates and then create some threads. Each thread is given a single template type to operate on. Once created, each thread gets a list of users who should receive an email for that template type. Then, for each user, an email is created and sent until all users have been processed. We could either create one thread per template type or create fewer threads and have each thread handle multiple template types. This is a relatively simple solution, and it should be faster than doing the same work without threads.
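The first option can be sketched roughly as follows, using Python's standard `concurrent.futures.ThreadPoolExecutor`. Everything specific here is an assumption made for illustration: the in-memory `TEMPLATES` and `USERS_BY_TEMPLATE` dictionaries stand in for the two database queries, `send_email` is a placeholder that contacts no mail server, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the two database queries.
TEMPLATES = {
    "welcome": "Welcome, {name}!",
    "reminder": "{name}, your subscription renews soon.",
}
USERS_BY_TEMPLATE = {
    "welcome": ["Ana", "Bo"],
    "reminder": ["Bo", "Cy", "Di"],
}


def fetch_template_types():
    return list(TEMPLATES)


def fetch_users_for(template_type):
    return USERS_BY_TEMPLATE[template_type]


def send_email(body):
    # Placeholder: a real implementation would talk to a mail server here.
    return body


def process_template(template_type):
    """Work for one thread: create and send every email for one template."""
    template = TEMPLATES[template_type]
    sent = []
    for name in fetch_users_for(template_type):
        body = template.format(name=name)
        sent.append(send_email(body))
    return template_type, sent


def send_all(max_workers=2):
    template_types = fetch_template_types()
    # With fewer workers than template types, each worker thread handles
    # several template types in turn.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process_template, template_types))
```

Calling `send_all(max_workers=len(TEMPLATES))` gives one thread per template type, while a smaller `max_workers` gives the variant in which each thread handles multiple template types. The result is the same either way; only the degree of overlap changes.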
The I/O Problem
Given these two possible designs, which is better? To be honest, I don't know. The answer is really dictated by your specific situation. How many emails are you sending each day? Is most of your time spent creating the emails, or is it spent retrieving the data and sending each email once it has been created? If creating the email is where you spend your time, because the templates are large or the replacements are complex, then the second design is probably a better match. If the majority of your time is spent retrieving data from the database or communicating with the email server, then improving the performance of creating the emails will not help your application overall. To determine the best design for your situation, you need to understand the environment in which your application will run and then do your own performance testing. You also need to understand the maintenance needs of each design. The first design in this article is easy to understand and maintain, while the second may provide significant performance benefits but is much more difficult to develop and maintain.
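One simple way to start that performance testing is to time each stage of a single-threaded run and compare the shares. The sketch below is only an assumption about how such a measurement might look: `StageTimer` is a hypothetical helper, and the `time.sleep` calls simulate database and mail-server waits rather than reflecting any real measurement.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named stage (single-threaded use)."""

    def __init__(self):
        self.totals = defaultdict(float)

    def timed(self, stage, func, *args):
        start = time.perf_counter()
        try:
            return func(*args)
        finally:
            self.totals[stage] += time.perf_counter() - start

    def shares(self):
        """Return each stage's fraction of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {stage: 0.0 for stage in self.totals}
        return {stage: t / total for stage, t in self.totals.items()}


# Hypothetical stages; the sleeps simulate I/O waits and are not measurements.
def fetch_users():
    time.sleep(0.02)
    return ["Ana", "Bo", "Cy"]


def create_email(name):
    return "Hello, {}!".format(name)


def send_email(body):
    time.sleep(0.01)
    return True


def profile_run():
    timer = StageTimer()
    users = timer.timed("fetch", fetch_users)
    for name in users:
        body = timer.timed("create", create_email, name)
        timer.timed("send", send_email, body)
    return timer.shares()
```

If the "create" share dominates in your own environment, a design that speeds up email creation is worth its extra complexity; if "fetch" and "send" dominate, the effort is better spent on the I/O side.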
Conclusion
A Look Ahead
One new advantage that Threading Building Blocks (TBB) brings to the table is application scalability, which has always been difficult to achieve. Historically, multithreaded applications have been designed and developed for a specific set of hardware. The number of threads allocated to each task was typically selected empirically, by testing the application in a specific hardware environment and finding out which set of thread assignments produced the best performance. TBB's Cilk-like task-stealing mechanism means that, with a properly designed application, you can rely on the application itself to detect idle processors or cores and assign them tasks from the queue, automatically maximizing the use of the available processors (even if the application is installed on new hardware with a different number of processing cores). This type of portable scalability has always been difficult to achieve in applications constructed using native threads.
Ryan Bloom is the director of Native Development for Peopleclick Inc. in Raleigh, NC. He has been in software development and management for nearly 10 years and is a member of the Apache Software Foundation.