How to do error checking in CUDA
https://codeyarns.com/2011/03/02/how-to-do-error-checking-in-cuda/
Error checks in CUDA code help catch CUDA errors at their source. There are two sources of errors in CUDA source code:
- Errors from CUDA API calls. For example, a call to cudaMalloc() might fail.
- Errors from CUDA kernel calls. For example, there might be an invalid memory access inside a kernel.
CUDA runtime API calls return a cudaError_t value (a typedef of the cudaError enum), so these calls are easy to check:
if ( cudaSuccess != cudaMalloc( &fooPtr, fooSize ) )
    printf( "Error!\n" );
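The check above only says that something failed. A minimal sketch of the same idea that also prints what failed is given here; the helper name checkCuda and the buffer size are assumptions made for illustration, not part of the original post, while cudaGetErrorString() is the runtime call that turns an error code into readable text.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): returns true on success,
// otherwise prints a readable description of the failure to stderr.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err == cudaSuccess)
        return true;
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return false;
}

int main()
{
    float *fooPtr = nullptr;
    const size_t fooSize = 1024 * sizeof(float);  // arbitrary size for the sketch

    // Each API call hands its cudaError_t straight to the helper.
    if (!checkCuda(cudaMalloc(&fooPtr, fooSize), "cudaMalloc"))
        return 1;

    if (!checkCuda(cudaFree(fooPtr), "cudaFree"))
        return 1;

    return 0;
}
```

Returning a bool rather than exiting leaves the decision about how to recover with the caller.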
CUDA kernel invocations do not return any value. Kernel launches are asynchronous, so an error raised while the kernel is running may only surface once the device has been synchronized. Errors from the launch itself can be checked right after the call with cudaGetLastError():
fooKernel<<< x, y >>>(); // Kernel call
if ( cudaSuccess != cudaGetLastError() )
    printf( "Error!\n" );
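To see what cudaGetLastError() does and does not catch, the following sketch issues one deliberately invalid launch and one valid launch. The kernel body, the report helper, and the launch sizes are assumptions chosen for illustration. The first launch asks for more threads per block than current GPUs allow, so it should be rejected at launch time; the second is followed by cudaDeviceSynchronize(), which is where errors raised during execution would be reported.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: each thread writes its own index.
__global__ void fooKernel(int *out)
{
    out[threadIdx.x] = threadIdx.x;
}

// Hypothetical helper: prints the outcome of one check, returns true on success.
static bool report(const char *stage, cudaError_t err)
{
    printf("%-20s %s\n", stage, cudaGetErrorString(err));
    return err == cudaSuccess;
}

int main()
{
    int *d_out = nullptr;
    if (!report("cudaMalloc:", cudaMalloc(&d_out, 32 * sizeof(int))))
        return 1;

    // Deliberately invalid launch: 4096 threads per block is above the
    // per-block limit of current GPUs, so the launch itself should fail.
    // Reading the error with cudaGetLastError() also clears it.
    fooKernel<<<1, 4096>>>(d_out);
    report("bad launch:", cudaGetLastError());

    // Valid launch: check the launch first, then synchronize so that any
    // error raised while the kernel runs is reported as well.
    fooKernel<<<1, 32>>>(d_out);
    bool ok = report("good launch:", cudaGetLastError());
    ok = report("after synchronize:", cudaDeviceSynchronize()) && ok;

    cudaFree(d_out);
    return ok ? 0 : 1;
}
```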
These two types of checks can be wrapped in two simple error-checking functions like this:
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Define this to turn on error checking
#define CUDA_ERROR_CHECK

#define CudaSafeCall( err ) __cudaSafeCall( err, __FILE__, __LINE__ )
#define CudaCheckError()    __cudaCheckError( __FILE__, __LINE__ )

// Checks the return value of a CUDA API call and exits on failure.
inline void __cudaSafeCall( cudaError err, const char *file, const int line )
{
#ifdef CUDA_ERROR_CHECK
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaSafeCall() failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }
#endif

    return;
}

// Checks for errors from the preceding kernel launch and exits on failure.
inline void __cudaCheckError( const char *file, const int line )
{
#ifdef CUDA_ERROR_CHECK
    cudaError err = cudaGetLastError();
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaCheckError() failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }

    // More careful checking. However, this will affect performance.
    // Comment away if needed.
    err = cudaDeviceSynchronize();
    if ( cudaSuccess != err )
    {
        fprintf( stderr, "cudaCheckError() with sync failed at %s:%i : %s\n",
                 file, line, cudaGetErrorString( err ) );
        exit( -1 );
    }
#endif

    return;
}
Using these error-checking functions is easy:
CudaSafeCall( cudaMalloc( &fooPtr, fooSize ) );

fooKernel<<< x, y >>>(); // Kernel call
CudaCheckError();
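For an end-to-end picture, here is a small self-contained program that applies the same pattern from allocation to cleanup. It is only a sketch: the CUDA_CHECK macro is a hypothetical single-macro variant of the two wrappers above, and scaleKernel and the array size are likewise assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical single-macro variant of the wrappers above (the name is an
// assumption). Prints the failing expression, file, and line, then exits.
#define CUDA_CHECK( call )                                            \
    do {                                                              \
        cudaError_t err_ = ( call );                                  \
        if ( err_ != cudaSuccess ) {                                  \
            fprintf( stderr, "%s failed at %s:%d : %s\n", #call,      \
                     __FILE__, __LINE__, cudaGetErrorString( err_ ) ); \
            exit( EXIT_FAILURE );                                     \
        }                                                             \
    } while ( 0 )

// Hypothetical kernel: multiplies every element by a constant factor.
__global__ void scaleKernel( float *data, int n, float factor )
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if ( i < n )
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 10;
    float host[n];
    for ( int i = 0; i < n; ++i )
        host[i] = 1.0f;

    float *dev = nullptr;
    CUDA_CHECK( cudaMalloc( &dev, n * sizeof( float ) ) );
    CUDA_CHECK( cudaMemcpy( dev, host, n * sizeof( float ), cudaMemcpyHostToDevice ) );

    scaleKernel<<< ( n + 255 ) / 256, 256 >>>( dev, n, 2.0f );
    CUDA_CHECK( cudaGetLastError() );       // errors from the launch itself
    CUDA_CHECK( cudaDeviceSynchronize() );  // errors raised during execution

    CUDA_CHECK( cudaMemcpy( host, dev, n * sizeof( float ), cudaMemcpyDeviceToHost ) );
    CUDA_CHECK( cudaFree( dev ) );

    printf( "host[0] = %f\n", host[0] );
    return 0;
}
```

The kernel launch cannot be passed to the macro because it returns nothing, so the launch is followed by two checked calls instead, which mirrors what CudaCheckError() does internally.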
These functions are derived from similar functions that used to be available in the cutil.h header in old CUDA SDKs.