Lambda expression

Lambda structure
[capture list] (parameter list) -> return type { function body }

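Both the (parameter list) and the -> return type are optional: a lambda without a parameter list takes no arguments, and a lambda without a return type has it deduced from its return statements.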
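As a quick illustration of the optional parts, here is a minimal sketch in plain C++. The function names (add_full, add_deduced, constant_answer) are made up for this example and do not come from any library.

```cpp
// Full form: capture list, parameter list, explicit return type, body.
inline int add_full(int a, int b) {
    auto add = [](int x, int y) -> int { return x + y; };
    return add(a, b);
}

// Return type omitted: the compiler deduces int from the return statement.
inline int add_deduced(int a, int b) {
    auto add = [](int x, int y) { return x + y; };
    return add(a, b);
}

// Parameter list omitted as well: the lambda takes no arguments.
inline int constant_answer() {
    auto answer = [] { return 42; };
    return answer();
}
```

All three lambdas use an empty capture list because they touch nothing outside their own parameters.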

Value and Reference capture

One point deserves attention: when a lambda is passed from the CPU to the GPU, we have to keep track of where each variable is stored.
For example, suppose we allocate a block of device memory, but both the allocation and the lambda are written in host code. Now we want the lambda to capture a variable, as shown in the following code.

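#include <vector>
#include <cuda_runtime.h>

// Build with nvcc --extended-lambda so that a __device__ lambda can be passed to a kernel.
template <typename Func>
__global__ void myFunc(int n, Func func) {
    // One thread per index; threads past the end of the array do nothing.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) func(i);
}

int main() {
    // n, blockSize and gridSize were not defined in the original; these are example values.
    int n = 1024;
    int blockSize = 256;
    int gridSize = (n + blockSize - 1) / blockSize;
    std::vector<int, CudaAllocator<int>> arr(n);

    // Deliberately incomplete: the capture list is empty, so arr cannot be used in the body.
    myFunc<<<gridSize, blockSize>>>(n, [] __device__ (int i) -> void {
        arr[i] = 0;
    });
    cudaDeviceSynchronize();  // wait for the kernel to finish

    return 0;
}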

CudaAllocator is a custom allocator class that implements the allocate and deallocate member functions, so that the vector's storage is placed in memory the GPU can access.
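The post does not show CudaAllocator itself, so the following is only a sketch of the shape such a class needs in order to plug into std::vector. HostStandInAllocator is a hypothetical name, and it uses ordinary host memory so that it compiles without the CUDA toolkit; a real CUDA version would presumably replace the two marked calls with a CUDA allocation call (for example cudaMallocManaged) and cudaFree.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical host-only stand-in for the post's CudaAllocator.
template <typename T>
struct HostStandInAllocator {
    using value_type = T;

    HostStandInAllocator() = default;

    // Converting constructor required so containers can rebind the allocator.
    template <typename U>
    HostStandInAllocator(const HostStandInAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Stand-in: a CUDA build would request GPU-accessible memory here.
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) noexcept {
        // Stand-in: a CUDA build would release the memory with cudaFree here.
        ::operator delete(p);
    }
};

// All instances are interchangeable because the allocator holds no state.
template <typename T, typename U>
bool operator==(const HostStandInAllocator<T>&, const HostStandInAllocator<U>&) {
    return true;
}

template <typename T, typename U>
bool operator!=(const HostStandInAllocator<T>&, const HostStandInAllocator<U>&) {
    return false;
}
```

With this in place, std::vector<int, HostStandInAllocator<int>> arr(n) behaves like a normal vector, and only the two marked calls decide where the elements actually live.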

Please note that the code above is incomplete: with an empty capture list [], arr cannot be used in the lambda body. So what should go inside []?
In fact, using [&] or [=] directly is wrong in both cases. With [&], we would be asking device code to access a variable that lives in host memory, because the vector object itself is a host variable. With [=], the whole vector is copied, including all of its data, which wastes both time and memory.
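The difference between the two capture modes can be observed on the host alone. The sketch below uses an ordinary std::vector and made-up names (CaptureReport, inspect_captures); it only compares buffer addresses and involves no GPU code.

```cpp
#include <cstddef>
#include <vector>

// Result of comparing the buffers seen by each kind of capture.
struct CaptureReport {
    bool value_capture_has_own_buffer;     // [=] made a full copy of the elements
    bool reference_capture_shares_buffer;  // [&] refers to the original host object
};

// Expects n > 0 so that the vector actually owns a buffer.
inline CaptureReport inspect_captures(std::size_t n) {
    std::vector<int> arr(n, 0);

    // [=] copies the whole vector into the closure, element by element.
    auto by_value = [=] { return arr.data(); };

    // [&] stores only a reference to the host-side vector object.
    auto by_ref = [&] { return arr.data(); };

    return {by_value() != arr.data(), by_ref() == arr.data()};
}
```

For a non-empty vector both fields come out true: the by-value closure owns a second buffer, while the by-reference closure depends on a host object that device code has no way to reach.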

So what is the right way to implement it? First, call arr.data() to get the raw pointer to the device memory. Note that this pointer variable itself still lives in host memory, so next we capture it with [=], which copies the pointer value into the lambda so that device code can use it. In summary, the first step avoids copying the data, and the second step solves the problem of access between host and device memory.
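The two steps can be sketched as a host-only analog in plain C++. Here for_each_index is a hypothetical stand-in for the kernel launch (a loop instead of a grid of threads) and zero_through_pointer is a made-up name; in the CUDA version the lambda would additionally carry the __device__ annotation and be passed to myFunc.

```cpp
#include <cstddef>
#include <vector>

// Host stand-in for the kernel: calls func(i) once for every index.
template <typename Func>
void for_each_index(std::size_t n, Func func) {
    for (std::size_t i = 0; i < n; ++i) {
        func(i);
    }
}

// Zeroes the vector through a raw pointer captured by value.
inline void zero_through_pointer(std::vector<int>& arr) {
    // Step 1: take the raw pointer to the element storage.
    int* p = arr.data();

    // Step 2: capture by value. Only the pointer is copied, not the elements,
    // so writes through p land in the original storage.
    for_each_index(arr.size(), [=](std::size_t i) {
        p[i] = 0;
    });
}
```

The closure here is the size of a single pointer no matter how many elements the vector holds, which is why this pattern stays cheap when the closure is handed to a kernel.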
