Concurrency Runtime Study Notes, Part 2: Parallelism

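Parallelism rests on the underlying multithreading machinery, and creating and destroying threads, along with synchronizing between them, is often daunting. The Parallel Patterns Library (PPL) that ships with the Concurrency Runtime raises the level of abstraction above raw thread handling and makes multithreaded parallelism in C++ remarkably simple.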



[Figure: Concurrency Runtime Architecture]




#include <windows.h>
#include <tchar.h>
#include <ppl.h>
#include <algorithm>
#include <array>
#include <iostream>

using namespace std;
using namespace Concurrency;

// Computes the nth Fibonacci number.
int fibonacci(int n)
{
   if(n < 2)
      return n;
   return fibonacci(n-1) + fibonacci(n-2);
}

// Calls the provided work function and returns the number of milliseconds 
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

int _tmain(int argc, _TCHAR* argv[])
{
   __int64 elapsed;

   // An array of Fibonacci numbers to compute.
   array<int, 4> a = { 24, 26, 41, 42 };

   // Use the for_each algorithm to compute the results serially.
   elapsed = time_call([&] 
   {
      size_t sum = 0;
      for_each (a.begin(), a.end(), [&](int n) {
         sum += fibonacci(n);
      });
      wcout << sum << endl;
   });
   wcout << L"serial time: " << elapsed << L" ms" << endl;
   return 0;
}

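To run the same computation in parallel, replace for_each with parallel_for_each. Note that inside a parallel loop body, shared state may be read freely but must not be written without synchronization, so accumulating into a shared sum requires a mutual-exclusion object such as Concurrency::critical_section: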

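   elapsed = time_call([&] 
   {
      critical_section cs;
      size_t sum = 0;
      parallel_for_each (a.begin(), a.end(), [&](int n) {
         // Compute outside the lock so the expensive work still runs in parallel.
         size_t value = fibonacci(n);
         critical_section::scoped_lock lock(cs);
         sum += value;
      });
      wcout << sum << endl;
   });
   wcout << L"parallel time: " << elapsed << L" ms" << endl;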

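A better approach is to use Concurrency::combinable, which gives each thread its own local copy of the value and merges the copies once the loop has finished, so the threads never contend for a lock: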

   elapsed = time_call([&] 
   {
      combinable<size_t> sum;
      parallel_for_each (a.begin(), a.end(), [&](int n) {
         // Each thread accumulates into its own thread-local copy; no lock needed.
         sum.local() += fibonacci(n);
      });
      // std::plus requires <functional>.
      size_t fibsum = sum.combine(std::plus<size_t>());

      wcout << fibsum << endl;
   });
   wcout << L"parallel time: " << elapsed << L" ms" << endl;
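
The idea behind the combinable version above (each worker writes only to its own private slot, and the slots are merged once at the end) can also be sketched in portable standard C++. This is only an illustration of the pattern, not how the PPL implements combinable: the helper name parallel_fib_sum and the one-thread-per-input layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Computes the nth Fibonacci number (same naive recursion as in the notes).
int fibonacci(int n)
{
   assert(n >= 0);
   if (n < 2)
      return n;
   return fibonacci(n - 1) + fibonacci(n - 2);
}

// Hypothetical helper: starts one thread per input. Each thread writes only to
// its own slot, and a single-threaded combine step adds the slots afterwards.
std::size_t parallel_fib_sum(const std::vector<int>& inputs)
{
   std::vector<std::size_t> partial(inputs.size(), 0); // one private slot per worker
   std::vector<std::thread> workers;
   workers.reserve(inputs.size());

   for (std::size_t i = 0; i < inputs.size(); ++i) {
      workers.emplace_back([&partial, &inputs, i] {
         partial[i] += fibonacci(inputs[i]); // no lock: slot i belongs to this thread only
      });
   }
   for (auto& w : workers)
      w.join();

   // Combine step, analogous in spirit to sum.combine(std::plus<size_t>()).
   return std::accumulate(partial.begin(), partial.end(), std::size_t{0},
                          std::plus<std::size_t>());
}
```

Calling parallel_fib_sum(std::vector<int>{24, 26, 41, 42}) would compute the same sum as the listings above. Because no slot is shared between threads, no critical section is needed, which is the point of the combinable approach.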

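To cancel a parallel loop, run the parallel algorithm inside a task group and cancel that group with Concurrency::task_group::cancel or Concurrency::structured_task_group::cancel. The parallel algorithms are themselves built on task groups, and cancellation propagates top-down through the tree of tasks, so what gets canceled is the entire task group, including the loop running inside it. Here is the code: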

   structured_task_group tasks;
   tasks.run_and_wait([&] {
      parallel_for_each(a.begin(), a.end(), [&](int n) {
         if (n == 41) { tasks.cancel(); } // cancels the whole task group
      });
   });
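
The "cancel the whole group" behavior can be mimicked in portable standard C++ with a cancel flag shared by all workers. This is a rough sketch of the cooperative idea only, not the PPL mechanism; LoopResult and run_until are hypothetical names introduced for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch.
struct LoopResult {
   bool canceled;          // true if some worker requested cancellation
   std::size_t processed;  // number of items whose body actually ran
};

// Runs one worker per item; all workers share a single "group" cancel flag.
LoopResult run_until(const std::vector<int>& items, int stop_value)
{
   std::atomic<bool> cancel_requested{false};
   std::atomic<std::size_t> processed{0};
   std::vector<std::thread> workers;
   workers.reserve(items.size());

   for (int n : items) {
      workers.emplace_back([&cancel_requested, &processed, n, stop_value] {
         if (cancel_requested.load())
            return;                        // group already canceled: skip the body
         ++processed;
         if (n == stop_value)
            cancel_requested.store(true);  // affects every worker, not just this one
      });
   }
   for (auto& w : workers)
      w.join();
   return { cancel_requested.load(), processed.load() };
}
```

With the inputs { 24, 26, 41, 42 } and a stop value of 41, the flag is always set, but how many of the other items run before it is observed depends on thread timing. Work that has already started is not interrupted; it simply runs to completion.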

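That covers the basic usage of parallel_for_each. parallel_for is used in much the same way, and parallel_invoke is needed less often. Although the Concurrency Runtime wraps and optimizes multithreaded execution, the overhead of dispatching work to threads is still there, so consider whether parallelizing is worth the cost. In general, compute-intensive workloads such as image-processing algorithms are a good fit, whereas for small, lightweight loop bodies the overhead outweighs the gain.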






1----> 1
Total ways to climb 1 stair: 1
2----> 1 1
2----> 2
Total ways to climb 2 stairs: 2
3----> 1 1 1
3----> 1 2
3----> 2 1
3----> 3
Total ways to climb 3 stairs: 4
4----> 1 1 1 1
4----> 1 1 2
4----> 1 2 1
4----> 1 3
4----> 2 1 1
4----> 2 2
4----> 3 1
Total ways to climb 4 stairs: 7

5----> 1 1 1 1 1
5----> 1 1 1 2
5----> 1 1 2 1
5----> 1 1 3
5----> 1 2 1 1
5----> 1 2 2
5----> 1 3 1
5----> 2 1 1 1
5----> 2 1 2
5----> 2 2 1
5----> 2 3
5----> 3 1 1
5----> 3 2
Total ways to climb 5 stairs: 13

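Looking at the pattern, this is a Fibonacci-like sequence (strictly speaking a three-term recurrence, since the last move can cover 1, 2, or 3 stairs): f(n) = f(n-1) + f(n-2) + f(n-3), with f(1) = 1, f(2) = 2, f(3) = 4.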
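
The recurrence can be checked against the enumerated totals with a few lines of C++. This is a minimal sketch; the function name count_ways is an assumption made for the example, and it takes the base cases f(1), f(2), f(3) straight from the listing above.

```cpp
#include <cassert>
#include <vector>

// Counts the ways to climb n stairs when each move covers 1, 2, or 3 stairs,
// using f(n) = f(n-1) + f(n-2) + f(n-3) with f(1)=1, f(2)=2, f(3)=4.
long long count_ways(int n)
{
   assert(n >= 1);
   std::vector<long long> f(n + 4, 0); // padded so indices 1..3 always exist
   f[1] = 1;
   f[2] = 2;
   f[3] = 4;
   for (int i = 4; i <= n; ++i)
      f[i] = f[i - 1] + f[i - 2] + f[i - 3];
   return f[n];
}
```

For n = 4 and n = 5 this gives 7 and 13, matching the totals enumerated above.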


