Lambda expression
lambda structure
[capture list] (parameter list) -> return type { function body }
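The (parameter list) and the return type are optional: the return type can be deduced from the body, and an empty parameter list can be left out.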
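To make the optional parts concrete, here is a minimal sketch in plain host C++. The function names (add_offset, twice, answer, check_lambda_forms) are made up for this illustration and do not come from the text above.

```cpp
#include <cassert>

// Full form: capture list, parameter list, explicit return type, body.
inline int add_offset(int x) {
    int offset = 10;
    auto full = [offset](int v) -> int { return v + offset; };
    return full(x);
}

// Return type omitted: the compiler deduces int from the return statement.
inline int twice(int x) {
    auto deduced = [](int v) { return 2 * v; };
    return deduced(x);
}

// Parameter list omitted too: allowed because the lambda takes no arguments.
inline int answer() {
    auto no_params = [] { return 42; };
    return no_params();
}

// Usage: all three forms behave like ordinary callables.
inline void check_lambda_forms() {
    assert(add_offset(1) == 11);
    assert(twice(4) == 8);
    assert(answer() == 42);
}
```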
Value and Reference capture
One point deserves attention: when a lambda is passed from CPU (host) code to GPU (device) code, we need to pay attention to where each captured variable is stored.
For example, suppose we allocate a block of device memory from host code, and the lambda is also written in host code. Now we want the lambda to capture a variable, as shown in the following code.
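#include <vector>
#include <cuda_runtime.h>
// Build with nvcc --extended-lambda (needed for __device__ lambdas).
// Each thread passes its global index to the callable.
template <typename Func>
__global__ void myFunc(Func func) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    func(i);
}
int main() {
    constexpr int blockSize = 256, gridSize = 4;
    constexpr int n = gridSize * blockSize;  // one thread per element
    std::vector<int, CudaAllocator<int>> arr(n);
    // Incomplete on purpose: with an empty capture list, arr cannot be used in the body.
    myFunc<<<gridSize, blockSize>>>([] __device__ (int i) -> void {
        arr[i] = 0;
    });
    cudaDeviceSynchronize();
    return 0;
}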
CudaAllocator is a custom class that implements the allocate and deallocate member functions.
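The author's CudaAllocator is not shown, so the following is only a sketch of the interface such a class needs in order to work with std::vector. HostStandInAllocator is a hypothetical name, and it uses std::malloc and std::free so that it compiles without CUDA. A CUDA version would presumably call the CUDA runtime in the same two places (for instance cudaMallocManaged and cudaFree, which is an assumption, not something the text confirms).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for CudaAllocator: same interface, host memory only.
template <typename T>
struct HostStandInAllocator {
    using value_type = T;

    HostStandInAllocator() = default;
    template <typename U>
    HostStandInAllocator(const HostStandInAllocator<U>&) noexcept {}

    // A CUDA version would request memory from the CUDA runtime here instead.
    T* allocate(std::size_t count) {
        void* p = std::malloc(count * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    // A CUDA version would release the memory through the CUDA runtime here.
    void deallocate(T* p, std::size_t /*count*/) noexcept { std::free(p); }
};

// Stateless allocators always compare equal.
template <typename T, typename U>
bool operator==(const HostStandInAllocator<T>&, const HostStandInAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const HostStandInAllocator<T>&, const HostStandInAllocator<U>&) { return false; }

// Usage: the vector obtains its element storage through allocate().
inline int sum_of_indices(std::size_t n) {
    std::vector<int, HostStandInAllocator<int>> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i);
    int sum = 0;
    for (int x : v) sum += x;
    return sum;
}

inline void check_allocator() {
    assert(sum_of_indices(5) == 10);  // 0 + 1 + 2 + 3 + 4
}
```

Only value_type, allocate, deallocate, the converting constructor and the equality operators are required; std::allocator_traits fills in the rest.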
Please note that the above code is incomplete: with an empty capture list [], we can't use arr in the lambda body. So what should go inside []?
In fact, using [&] or [=] directly is wrong in both cases. With [&], the device code would try to access a variable (the vector object itself) that lives in host memory. With [=], the entire contents of the vector would be copied, which costs both time and memory.
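The difference between the two capture modes can be observed on the host alone. The sketch below uses an ordinary std::vector and no CUDA at all; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <vector>

// [=] stores a full copy of the vector inside the closure object.
inline bool value_capture_has_own_buffer() {
    std::vector<int> arr(4, 7);
    auto by_value = [=] { return arr.data(); };  // pointer into the closure's copy
    return by_value() != arr.data();             // the copy owns a separate buffer
}

// [&] stores a reference to the vector object that lives on the host stack.
inline bool reference_capture_sees_original() {
    std::vector<int> arr(4, 7);
    auto by_ref = [&] { return &arr; };  // address of the original host object
    return by_ref() == &arr;
}

inline void check_capture_modes() {
    assert(value_capture_has_own_buffer());
    assert(reference_capture_sees_original());
}
```

The first function shows why [=] is expensive for a container, and the second shows that [&] hands the lambda an address on the host side.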
So what is the right way to implement it? We can call arr.data() to get the raw pointer to the device memory. Note that this pointer variable itself is still stored in host memory, so we then use [=] to capture it by value, which gives the device code its own copy of the pointer. In summary, the first step avoids copying the whole container, and the second step solves the problem of accessing data across host and device memory.
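The two steps can be sketched on the host as follows. This is not the CUDA program itself: for_each_index is a hypothetical host stand-in for the kernel launch, and an ordinary std::vector replaces the one backed by CudaAllocator, so the sketch only demonstrates the capture pattern.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host stand-in for the kernel launch: calls func(i) for each index.
template <typename Func>
void for_each_index(int n, Func func) {
    for (int i = 0; i < n; ++i) func(i);
}

inline int zero_fill_and_sum() {
    std::vector<int> arr(8, 5);
    int* data = arr.data();  // step 1: raw pointer to the element buffer
    // step 2: [=] copies only the pointer, not the elements it points to
    for_each_index(static_cast<int>(arr.size()), [=](int i) { data[i] = 0; });
    int sum = 0;
    for (int x : arr) sum += x;
    return sum;  // 0: the writes went through to arr's own buffer
}

inline void check_pointer_capture() {
    assert(zero_fill_and_sum() == 0);
}
```

In the CUDA version, the lambda would additionally be marked __device__ and passed to the myFunc kernel instead of for_each_index; the capture list and the body stay the same.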