

Plan for the Future: Express Parallelism, Don't Manage It: Page 3

When adding parallelism, the key design choice is to express concurrency in an application without explicitly managing the scheduling of that concurrency onto the hardware.





Parallelizing assign_fitness

After creating each generation, the sample application must assign a fitness value to each individual in that new generation. The serial implementation of assign_fitness in serial_ga.cpp uses std::for_each to iterate through the new children in the my_individuals vector to assign fitness values to them:

inline void population::assign_fitness() {
    std::for_each( my_individuals.begin() + population_size,
                   my_individuals.end(),
                   set_fitness() );
}

The process of implementing a natively threaded version of assign_fitness is similar to the one you followed in the preceding section for the threaded version of generate_children, so you'll see a condensed version here. First, add code to create and join a set of native threads:

inline void population::assign_fitness() {
    for ( int t = 0; t < num_threads; ++t ) {
        handles[t] = _beginthread( &start_set_fitness, 0, (void *)t );
    }
    WaitForMultipleObjects( num_threads, (HANDLE *)handles, true, INFINITE );
}

Next, package the loop so you can pass it through the Windows threading API:

inline void start_set_fitness( void *x ) {
    population_helper::set_fitness( int(x) );
}

Finally, modify the loop code to use the same adjust_begin_end scheduling routine as generate_children:

void population_helper::set_fitness( const int thread_id ) {
    size_t begin = population_size;
    size_t end   = my_individuals->size();
    adjust_begin_end( thread_id, begin, end );
    for ( size_t i = begin; i < end; ++i ) {
        (*my_individuals)[i].set_fitness();
    }
}
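The adjust_begin_end routine itself was introduced alongside generate_children earlier in the article and isn't repeated here. As a reminder of what a static block-partitioning scheduler of this kind does, here is a minimal sketch consistent with how it's called above; the num_threads value and the exact remainder-distribution policy are assumptions for illustration, not the article's actual code:

```cpp
#include <cstddef>

static const int num_threads = 4;  // assumed thread count for this sketch

// Narrow the shared [begin, end) range to the contiguous block owned by
// thread_id, dividing the work into num_threads near-equal chunks. The
// first (total % num_threads) threads each take one extra element.
void adjust_begin_end( int thread_id, std::size_t &begin, std::size_t &end ) {
    std::size_t total = end - begin;
    std::size_t chunk = total / num_threads;
    std::size_t extra = total % num_threads;
    std::size_t start = begin + thread_id * chunk
                      + ( thread_id < (int)extra ? thread_id : extra );
    std::size_t len   = chunk + ( thread_id < (int)extra ? 1 : 0 );
    begin = start;
    end   = start + len;
}
```

Note that this is exactly the naive, static schedule the article criticizes: each thread gets a fixed block up front, so if some individuals are cheaper to evaluate than others, threads finish at different times and nothing rebalances the work.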

The TBB implementation simply replaces std::for_each with tbb::parallel_for by using a blocked_range of iterators:

inline void assign_fitness() {
    tbb::parallel_for(
        tbb::blocked_range<vector_type::iterator>( my_individuals.begin() + population_size,
                                                   my_individuals.end() ),
        set_fitness_body(),
        tbb::auto_partitioner() );
}

Implementing set_fitness_body is straightforward:

struct set_fitness_body {
    void operator() ( const tbb::blocked_range<vector_type::iterator> &range ) const {
        for ( vector_type::iterator i = range.begin(); i != range.end(); ++i ) {
            i->set_fitness();
        }
    }
};

Again, the natively threaded code is more difficult to write, uses a naive scheduling policy, and is tied to the value of num_threads. Of course, it's possible to write a set of advanced support routines on top of native threads that do a better job of managing the concurrency, but if you did that, you'd be writing a concurrency platform instead of focusing on the application's features.
