libcudnn (R5) not found in library path
Environment: Ubuntu 18.04 + Torch7 + CUDA 10
Running a Lua program that uses cudnn produced the following error:
    /home/majiabiao/torch/install/bin/luajit: /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: /home/majiabiao/torch/install/share/lua/5.1/cudnn/ffi.lua:1603: 'libcudnn (R5) not found in library path.
    Please install CuDNN from https://developer.nvidia.com/cuDNN
    Then make sure files named as libcudnn.so.5 or libcudnn.5.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)
    Alternatively, set the path to libcudnn.so.5 or libcudnn.5.dylib to the environment variable CUDNN_PATH and rerun torch.
    For example: export CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.5"
    stack traceback:
        [C]: in function 'error'
        /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
        test.lua:41: in main chunk
        [C]: in function 'dofile'
        ...biao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x555bad5c4570
I checked the lib64 folder of my CUDA install, /usr/local/cuda-10.1/lib64, and found neither libcudnn.so.5 nor libcudnn.5.dylib. What it contains instead is libcudnn.so.7 and libcudnn.so.7.5.0.
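The check above can be repeated with a small helper. This is only a sketch: the function name list_cudnn_libs is my own, and it assumes the cuDNN libraries follow the libcudnn.so* naming seen in this post.

```shell
#!/bin/sh
# list_cudnn_libs DIR (hypothetical helper): print the names of the libcudnn
# shared-library files found in DIR, one per line, sorted.
# Prints nothing when DIR contains no such files.
list_cudnn_libs() {
    dir="$1"
    for f in "$dir"/libcudnn.so*; do
        # If nothing matches, the glob stays unexpanded, so test each entry.
        # -L also catches dangling symlinks, which -e would miss.
        if [ -e "$f" ] || [ -L "$f" ]; then
            basename "$f"
        fi
    done | sort
}

# Example call (path taken from this post):
#   list_cudnn_libs /usr/local/cuda-10.1/lib64
```

On my machine this would list libcudnn.so.7 and libcudnn.so.7.5.0, possibly along with other libcudnn.so* entries, which already hints that the installed cuDNN is 7.x rather than the 5.x that the bindings look for.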
Following the error message, I opened /home/majiabiao/torch/install/share/lua/5.1/cudnn/ffi.lua, which contains:
    local libnames = {'libcudnn.so.5', 'libcudnn.5.dylib', 'cudnn64_5.dll'}
    local ok = false
    for i=1,#libnames do
       ok = pcall(function ()
          cudnn.C = ffi.load(libnames[i])
       end)
       if ok then break; end
    end
    if not ok then
I changed 'libcudnn.so.5', 'libcudnn.5.dylib' to 'libcudnn.so.7', 'libcudnn.7.dylib', and the error changed to the following:
    /home/majiabiao/torch/install/bin/luajit: /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: /home/majiabiao/torch/install/share/lua/5.1/cudnn/ffi.lua:1619: These bindings are for CUDNN 5.x (5005 <= cudnn.version > 6000) , while the loaded CuDNN is version: 7500
    Are you using an older or newer version of CuDNN?
    stack traceback:
        [C]: in function 'error'
        /home/majiabiao/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
        test.lua:41: in main chunk
        [C]: in function 'dofile'
        ...biao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x561e50b7e570
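So the real cause of the earlier "libcudnn.so.5 or libcudnn.5.dylib not found" error is version compatibility: these bindings do not work with every cuDNN version, only with versions between 5005 and 6000 (that is, cuDNN 5.x).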
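To make the range concrete, here is a rough sketch of the same check in shell. The function name is_cudnn5_compatible is hypothetical, and the exclusive upper bound is my assumption: the comparison printed in the error message ("5005 <= cudnn.version > 6000") is ambiguous, so I read it as 5005 <= version < 6000.

```shell
#!/bin/sh
# is_cudnn5_compatible VERSION (hypothetical helper): succeed (exit status 0)
# when VERSION, written in cuDNN's integer form such as 5005 or 7500, lies in
# the assumed range 5005 <= VERSION < 6000.
is_cudnn5_compatible() {
    version="$1"
    [ "$version" -ge 5005 ] && [ "$version" -lt 6000 ]
}

# Example: the version reported in the error above is rejected.
if is_cudnn5_compatible 7500; then
    echo "7500: compatible with the R5 bindings"
else
    echo "7500: not compatible with the R5 bindings"
fi
```

This also shows why renaming the library in ffi.lua cannot work: the file is found, but its version number still fails the range check.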
If the cuDNN version you have is incompatible, you can switch cuDNN versions on top of the existing CUDA install. Only a few files need to be replaced.
Download the cuDNN version you want from the NVIDIA website. The extracted folder, cuda, contains two subfolders, include and lib64.
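Before copying anything, it may be worth confirming which version the download really is. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL with #define lines, as the cudnn.h of cuDNN 5.x and 7.x does; the function name cudnn_version_from_header is my own.

```shell
#!/bin/sh
# cudnn_version_from_header FILE (hypothetical helper): read CUDNN_MAJOR,
# CUDNN_MINOR and CUDNN_PATCHLEVEL from a cudnn.h-style header and print
# MAJOR*1000 + MINOR*100 + PATCHLEVEL, the integer form used in the error log.
cudnn_version_from_header() {
    header="$1"
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" { print $3; exit }' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" { print $3; exit }' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { print $3; exit }' "$header")
    if [ -z "$major" ] || [ -z "$minor" ] || [ -z "$patch" ]; then
        echo "could not find the version macros in $header" >&2
        return 1
    fi
    echo $((major * 1000 + minor * 100 + patch))
}

# Example call, run from inside the extracted cuda folder:
#   cudnn_version_from_header include/cudnn.h
```

For a cuDNN 5.0.5 download this should print 5005, and for the 7.5.0 install that caused the error it should print 7500.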
    # Remove the existing cuDNN version (check that these paths match your install)
    sudo rm -rf /usr/local/cuda/include/cudnn.h
    sudo rm -rf /usr/local/cuda/lib64/libcudnn*
    # Copy the downloaded cuDNN files to where the deleted files were
    sudo cp include/cudnn.h /usr/local/cuda/include/
    sudo cp lib64/lib* /usr/local/cuda/lib64/
    # cd to /usr/local/cuda/lib64/ and create the symlinks (replace the version number with your own)
    cd /usr/local/cuda/lib64/
    sudo chmod +r libcudnn.so.5.0.5
    sudo ln -sf libcudnn.so.5.0.5 libcudnn.so.5
    sudo ln -sf libcudnn.so.5 libcudnn.so
    sudo ldconfig
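A quick sanity check on the symlinks can save a confusing second round of errors. This is a minimal sketch, not part of the original steps: the function name check_cudnn_links is hypothetical, and it only verifies that the two links created above resolve to a real file.

```shell
#!/bin/sh
# check_cudnn_links DIR MAJOR (hypothetical helper): succeed when both
# DIR/libcudnn.so.MAJOR and DIR/libcudnn.so resolve to an existing file.
check_cudnn_links() {
    dir="$1"
    major="$2"
    for name in "libcudnn.so.$major" "libcudnn.so"; do
        # -e follows symlinks, so a dangling link fails this test.
        if [ ! -e "$dir/$name" ]; then
            echo "missing or dangling: $dir/$name" >&2
            return 1
        fi
    done
    return 0
}

# Example call for the install described in this post:
#   check_cudnn_links /usr/local/cuda/lib64 5
```

If the check passes, running ldconfig -p and looking for libcudnn in the output is another way to see whether the dynamic linker cache picked up the new files.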
After that, everything works.