[Repost] Building Cross-Platform CUDA Applications with CMake
Original article:
https://developer.nvidia.com/zh-cn/blog/building-cuda-applications-cmake/
https://developer.nvidia.com/blog/building-cuda-applications-cmake/
An interesting beginner CUDA project: https://github.com/LitLeo/OpenCUDA
For me personally this is mainly a CMake primer. The LANGUAGES clause of project() declares which languages the project compiles, which determines the compilers CMake sets up (here both the C++ compiler and nvcc):
project(cmake_and_cuda LANGUAGES CXX CUDA)
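A minimal CMakeLists.txt built around this line might look as follows (the file and target names are hypothetical, loosely following the article's particles example):

```cmake
cmake_minimum_required(VERSION 3.8)  # 3.8 introduced first-class CUDA language support
project(cmake_and_cuda LANGUAGES CXX CUDA)

# CMake routes each source file to the right compiler by extension:
# .cpp goes to the C++ compiler, .cu goes to nvcc.
add_library(particles STATIC particle.cu v3.cu)
add_executable(test_app main.cpp)
target_link_libraries(test_app PRIVATE particles)
```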
In addition, the CUDACXX and CXX environment variables can be set to the paths of nvcc and the C++ compiler, respectively.
I don't fully understand the next part; it looks like it selects the compiler for the host (CPU) side of the build:
You can explicitly specify a host compiler to use with NVCC using the CUDAHOSTCXX environment variable. (This controls the -ccbin option for NVCC.)
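As a sketch, assuming a hypothetical g++ path; the variable must be set before the first configure, since CMake caches the chosen compilers:

```shell
# Choose the host compiler NVCC delegates to; CMake forwards this via -ccbin.
export CUDAHOSTCXX=/usr/bin/g++
mkdir build && cd build
cmake ..
```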
The following line makes all compilation associated with the build target particles use the C++11 standard:
target_compile_features(particles PUBLIC cxx_std_11)
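Because the feature is declared PUBLIC, it also propagates to anything that links against particles; a sketch with a hypothetical consumer target:

```cmake
target_compile_features(particles PUBLIC cxx_std_11)

# The consumer inherits cxx_std_11 through the link dependency,
# so its sources are compiled as C++11 as well.
add_executable(test_app main.cpp)
target_link_libraries(test_app PRIVATE particles)
```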
Position-independent code:
set_target_properties(particles PROPERTIES POSITION_INDEPENDENT_CODE ON)
This appears to be aimed at the static library (particles). CMake enables position-independent code automatically for shared libraries; a static library that gets linked into a shared library must enable it explicitly with the line above. Support for this with the CUDA language requires CMake 3.8 or newer.
CMake 3.8 supports the POSITION_INDEPENDENT_CODE property for CUDA compilation, and builds all host-side code as relocatable when requested. This is great news for projects that wish to use CUDA in cross-platform projects or inside shared libraries, or desire to support esoteric C++ compilers.
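The static-into-shared case described above can be sketched like this (the shared library name is hypothetical):

```cmake
add_library(particles STATIC particle.cu v3.cu)
# Shared libraries are position-independent automatically; a static library
# destined to be linked into one must opt in explicitly.
set_target_properties(particles PROPERTIES POSITION_INDEPENDENT_CODE ON)

add_library(engine SHARED engine.cpp)
target_link_libraries(engine PRIVATE particles)
```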
Separate compilation (https://developer.nvidia.com/blog/separate-compilation-linking-cuda-device-code/): the main point seems to be that CUDA code can live in multiple static libraries (chiefly by deferring device linking until the static libraries are linked into a shared library or executable), and that this plays well with CMake's incremental-build support.
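In CMake this is switched on per target with the CUDA_SEPARABLE_COMPILATION property, roughly:

```cmake
# Compile device code as relocatable (nvcc -dc), deferring the device link
# until the library is linked into an executable or shared library.
set_target_properties(particles PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
```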
The article linked above explains the underlying problem:
One of the key limitations that device code linking lifts is the need to have all the code for a GPU kernel present when compiling the kernel, including all the device functions that the kernel calls. As C++ programmers, we are used to calling externally defined functions simply by declaring the functions’ prototypes (or including a header that declares them).
GPU compilation used to require the complete code of every function a kernel calls to be visible when the kernel is compiled. In C++, by contrast, when compiling a particular file you can get by with a header or a bare prototype declaration; the actual function code is only resolved at link time.
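With device code linking, CUDA allows the same pattern; a sketch across hypothetical files:

```cuda
// scale.cuh — only the prototype, as in ordinary C++
__device__ float scale(float x);

// scale.cu — the definition lives in a separate translation unit
__device__ float scale(float x) { return 2.0f * x; }

// kernel.cu — compiles against the prototype alone; the call is resolved
// at device link time (nvcc -dc ... then nvcc -dlink ...)
#include "scale.cuh"
__global__ void doubleAll(float* data) {
    data[threadIdx.x] = scale(data[threadIdx.x]);
}
```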