GpuArrayException: No cuda device available (troubleshooting attempts)
Problem:
Running import keras or import theano produces the following error:
>>> import keras
Using Theano backend.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
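I ran pip uninstall theano and reinstalled with conda install theano, which produced even stranger errors. Searching suggested they came from an incompatibility between theano 1.0.4 and numpy 1.16, so I uninstalled that build.
After reinstalling with pip install theano, the original error was still there: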
>>> import theano
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/ib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
The rest of the configuration is as follows.

## .theanorc file
[global]
floatX = float32
device = cuda
[cuda]
root = /usr/local/cuda-8.0

echo $PATH
/data_d/old_home/home/.conda/envs/bin:/usr/local/cuda-8.0/bin:/data_d/public/miniconda2/bin:/usr/local/cuda-9.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/s:/usr/local/cuda-8.0/bin/local/games:/snap/bin:/usr/local/cuda-8.0/bin

# .bashrc file
CUDA_VISIBLE_DEVICES=1
CUDA_HOME=/usr/local/cuda-8.0
PATH="$PATH:/usr/local/cuda-8.0/bin"
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21

The theano version in use is 1.0.4, with pygpu 0.7.6.
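Since the device name comes from more than one place, it can help to confirm which value Theano is actually asked to use. The sketch below is only an illustration: the helper name effective_device is made up for this post, and it models just two sources, the THEANO_FLAGS environment variable (which takes precedence) and the [global] section of .theanorc. It is written for Python 3; the Python 2.7 environment used here would need the older ConfigParser module instead.

```python
import configparser
import os


def effective_device(theanorc_text, environ=None):
    """Return the device name Theano would be asked to use.

    Hypothetical helper: only THEANO_FLAGS and .theanorc are modelled.
    THEANO_FLAGS is a comma-separated list of key=value pairs and
    overrides the [global] section of .theanorc.
    """
    environ = os.environ if environ is None else environ
    for item in environ.get("THEANO_FLAGS", "").split(","):
        key, sep, value = item.partition("=")
        if sep and key.strip() == "device":
            return value.strip()
    parser = configparser.ConfigParser()
    parser.read_string(theanorc_text)
    # Theano falls back to the CPU when no device is configured.
    return parser.get("global", "device", fallback="cpu")


# The .theanorc content shown above.
sample = """
[global]
floatX = float32
device = cuda
[cuda]
root = /usr/local/cuda-8.0
"""
print(effective_device(sample, environ={}))
```

Passing environ={} ignores the real environment, so the result reflects the file alone.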
Could the owner of the cuda-8.0 folder have been changed? Checking that did not help.
Running a test program gives the same error:
Using Theano backend.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
Training -----------
('train cost: ', array(4.1908903, dtype=float32))
('train cost: ', array(0.10415509, dtype=float32))
('train cost: ', array(0.01151281, dtype=float32))
('train cost: ', array(0.00458441, dtype=float32))
Testing ------------
40/40 [==============================] - 0s 5us/step
('test cost:', 0.005374030210077763)
('Weights=', array([[0.56634265]], dtype=float32), '\nbiases=', array([2.001063], dtype=float32))
Attempt 1:
I changed the device in the configuration file to cuda0 and ran import theano again:

[global]
floatX = float32
device = cuda0
[cuda]
root = /usr/local/cuda-8.0
>>> import theano
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/env/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: GPU is too old for CUDA version
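According to https://blog.csdn.net/qq_33200967/article/details/80689543, the next step is to check whether CUDA itself was installed correctly. Running make directly on the CUDA samples failed (see https://devtalk.nvidia.com/default/topic/1048902/cuda-setup-and-installation/cuda-samples-ubuntu-make-file-errors/),
so I used sudo make -k instead. The deviceQuery output was: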
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: ""
  CUDA Driver Version / Runtime Version: 9.0 / 8.0
  CUDA Capability Major/Minor version number: 2.1
  Total amount of global memory: 963 MBytes (1010040832 bytes)
  ( 1) Multiprocessors, ( 48) CUDA Cores/MP: 48 CUDA Cores
  GPU Max Clock rate: 1046 MHz (1.05 GHz)
  Memory Clock rate: 875 Mhz
  Memory Bus Width: 64-bit
  L2 Cache Size: 65536 bytes
  Maximum Texture Dimension Size (x,y,z): 1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers: 1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers: 2D=(16384, 16384), 2048 layers
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total number of registers available per block: 32768
  Warp size: 32
  Maximum number of threads per multiprocessor: 1536
  Maximum number of threads per block: 1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 1 copy engine(s)
  Run time limit on kernels: No
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Disabled
  Device supports Unified Addressing (UVA): Yes
  Device PCI Domain ID / Bus ID / location ID: 0 / 2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = NVS 315
Result = PASS
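The line that matters most in this output is the compute capability, 2.1, which marks the NVS 315 as a Fermi-generation card; the CUDA 9.0 toolkit no longer targets 2.x devices. As a rough sketch, the capability can be pulled out of saved deviceQuery output and compared with a minimum. The threshold of 3.0 below is an assumption chosen for illustration, not a value read from the libgpuarray source, and the function name is made up.

```python
import re

# ASSUMPTION: illustrative minimum, not taken from libgpuarray itself.
ASSUMED_MIN_CAPABILITY = (3, 0)


def compute_capability(device_query_output):
    """Extract (major, minor) from deviceQuery's capability line.

    Returns None when the line is not present in the text.
    """
    match = re.search(
        r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)",
        device_query_output,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The capability line reported for the NVS 315 above.
line = "CUDA Capability Major/Minor version number: 2.1"
cap = compute_capability(line)
print(cap, cap >= ASSUMED_MIN_CAPABILITY)
```

If the check fails, as it does for this card, the "GPU is too old for CUDA version" message is at least consistent with the hardware rather than with a broken install.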
Check the NVIDIA driver version (see https://blog.csdn.net/s_sunnyy/article/details/64121826):

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.130 Wed Mar 21 03:37:26 PDT 2018
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)

List the NVIDIA device nodes on this machine:

:/dev$ ls -l nvidia*
crw-rw-rw- 1 root root 195, 0 5月 17 12:53 nvidia0
crw-rw-rw- 1 root root 195, 1 5月 17 12:53 nvidia1
crw-rw-rw- 1 root root 195, 255 5月 17 12:53 nvidiactl
crw-rw-rw- 1 root root 195, 254 5月 17 12:53 nvidia-modeset
crw-rw-rw- 1 root root 240, 0 5月 17 12:53 nvidia-uvm
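Two device nodes (nvidia0 and nvidia1) exist, yet deviceQuery reported a single device, and the .bashrc sets CUDA_VISIBLE_DEVICES=1. That variable limits which GPUs a CUDA process can see and renumbers the visible ones from 0, so it may be worth checking which card cuda0 really refers to. Whether this explains the errors in this post is not established; the sketch below, with a made-up helper name, only shows how the variable is interpreted for plain integer indices.

```python
import os


def visible_device_indices(environ=None):
    """Interpret CUDA_VISIBLE_DEVICES as a list of device indices.

    Returns None when the variable is unset (every device is visible).
    Only integer indices are handled here; UUID entries are not.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [int(tok) for tok in raw.split(",") if tok.strip()]


# Value taken from the .bashrc shown earlier.
indices = visible_device_indices({"CUDA_VISIBLE_DEVICES": "1"})

# Inside the process, the exposed devices are renumbered from 0,
# so Theano's cuda0 is the first entry of this list.
for logical, original in enumerate(indices):
    print("cuda%d -> device %d in CUDA's enumeration order" % (logical, original))
```

Running python with the variable unset, or set to 0, and repeating import theano would show whether another card becomes visible.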
Check the cudnn version with conda list -n username:
cudatoolkit 10.0.130 0
cudnn 7.3.1 cuda10.0_0
These versions look too new (see https://blog.csdn.net/li57681522/article/details/82491617).
The installed cudatoolkit and cudnn packages are built for CUDA 10.0,
but CUDA 10.0 was never installed on this machine.
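This kind of mismatch is easy to miss by eye, so here is a small sketch that scans conda list output for CUDA packages built against a different release than the system toolkit. It assumes, based only on the two lines shown above, that cudatoolkit's version column and the cudaX.Y prefix of cudnn's build string identify the CUDA release; the function name is hypothetical.

```python
def cuda_build_mismatches(conda_list_text, system_cuda="8.0"):
    """Return package names that target a CUDA release other than system_cuda.

    ASSUMPTION: cudatoolkit's version column and cudnn's build string
    (e.g. 'cuda10.0_0') name the CUDA release, as in the listing above.
    """
    mismatched = []
    for line in conda_list_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        name, version, build = fields[:3]
        if name == "cudatoolkit":
            targets = version
        elif name == "cudnn" and build.startswith("cuda"):
            # 'cuda10.0_0' -> '10.0'
            targets = build[len("cuda"):].split("_")[0]
        else:
            continue
        if not targets.startswith(system_cuda):
            mismatched.append(name)
    return mismatched


listing = """cudatoolkit 10.0.130 0
cudnn 7.3.1 cuda10.0_0"""
print(cuda_build_mismatches(listing))
```

For the listing above both packages are flagged, since the system toolkit is /usr/local/cuda-8.0.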
So I tried uninstalling them:

conda uninstall cudnn
Fetching package metadata ...........
Solving package specifications: .

Package plan for package removal in environment /data_d/old_home/home/.conda/envs:

The following packages will be REMOVED:

    cudnn: 7.3.1-cuda10.0_0

Proceed ([y]/n)? y

conda uninstall cudatoolkit
Fetching package metadata ...........
Solving package specifications: .

Package plan for package removal in environment /data_d/old_home/home/.conda/envs:

The following packages will be REMOVED:

    cudatoolkit: 10.0.130-0
    cupti: 10.0.130-0

Proceed ([y]/n)? y

Then I installed the versions matching CUDA 8.0:

conda install cudatoolkit=8.0
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /data_d/old_home/home/.conda/envs:

The following NEW packages will be INSTALLED:

    cudatoolkit: 8.0-3

Proceed ([y]/n)? y

conda install cudnn=6.0
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /data_d/old_home/home/.conda/env:

The following NEW packages will be INSTALLED:

    cudnn: 6.0.21-cuda8.0_0

Proceed ([y]/n)? y
cudatoolkit 8.0 3
cudnn 6.0.21 cuda8.0_0
The two lines above are what conda list reports after the reinstall.
The error is still the same:
GpuArrayException: No cuda device available
Next I tried reinstalling CUDA and the other packages in a new environment (https://blog.csdn.net/lyy14011305/article/details/59500819).
While installing numpy, theano and the other packages following http://deeplearning.net/software/theano/install_ubuntu.html, this error came up:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/__init__.py", line 156, in <module>
    import theano.gpuarray
  ...
AttributeError: ('The following error happened while compiling the node', DnnVersion(), '\n', "'module' object has no attribute '_get_ndarray_c_version'")

The fix suggested in https://github.com/pymc-devs/pymc3/issues/3340 is to upgrade theano to 1.0.4 (conda had installed 1.0.3), but the upgrade failed:

conda install theano=1.0.4
Fetching package metadata ...........

PackageNotFoundError: Packages missing in current channels:

  - theano 1.0.4*

We have searched for the packages in the following channels:

  - https://repo.continuum.io/pkgs/main/linux-64
  - https://repo.continuum.io/pkgs/main/noarch
  - https://repo.continuum.io/pkgs/free/linux-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/linux-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/linux-64
  - https://repo.continuum.io/pkgs/pro/noarch

So I tried downgrading numpy to 1.15 instead:
conda install numpy=1.15
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /data_d/old_home/home/.conda/envs/xhs2:

The following NEW packages will be INSTALLED:

    mkl_fft: 1.0.12-py27ha843d7b_0
    numpy: 1.15.4-py27h7e9f1db_0

The following packages will be DOWNGRADED:

    numpy-base: 1.16.4-py27hde5b4d6_0 --> 1.15.4-py27hde5b4d6_0

Proceed ([y]/n)? y
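To avoid stumbling into the same pairing again, the installed versions can be compared before importing theano. The sketch below encodes only what this post observed: theano 1.0.3 together with numpy 1.16.x raised the _get_ndarray_c_version error, while theano 1.0.4 or numpy 1.15.x did not. The boundaries and both function names are assumptions drawn from that one experience, not documented compatibility rules.

```python
def version_tuple(version):
    """Turn '1.16.4' into (1, 16, 4); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def hits_known_conflict(theano_version, numpy_version):
    """True for the theano/numpy pairing that failed in this post.

    ASSUMPTION: boundaries inferred from this post only
    (theano < 1.0.4 combined with numpy >= 1.16).
    """
    return (version_tuple(theano_version) < (1, 0, 4)
            and version_tuple(numpy_version) >= (1, 16))


print(hits_known_conflict("1.0.3", "1.16.4"))
print(hits_known_conflict("1.0.3", "1.15.4"))
```

In a live environment the two arguments would come from theano.__version__ and numpy.__version__.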
The AttributeError above is gone, but the errors that follow are exactly the same as before. With device = cuda0 in .theanorc:
GpuArrayException: GPU is too old for CUDA version
With device = cuda:
GpuArrayException: No cuda device available