Configuring the environment for TensorFlow, CUDA, and cuDNN

https://zhuanlan.zhihu.com/p/351787834

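The graphics driver is the most basic prerequisite for using a GPU: without a driver, the GPU cannot be used at all. When setting up a machine learning or deep learning environment, however, the driver needs little attention. The only thing to check is whether the driver version is too old, because every CUDA Toolkit release requires a minimum driver version. [If the driver is too old, or is not an NVIDIA driver, note that the CUDA Toolkit local installer bundles a specific NVIDIA driver version, so you can choose to update the driver during the CUDA installation described below.]

The driver version must not be too low. Newer drivers remain compatible with older CUDA releases, so the newer the driver, the better.
The minimum driver version required by each CUDA release is listed at: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions

In general, as long as the driver version meets this requirement, the driver needs no further attention.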
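The comparison against the minimum driver version can be scripted. What follows is a minimal sketch rather than an official tool: the names `parse_version` and `meets_minimum` are made up for illustration, and the minimum has to be read from the NVIDIA release notes linked above. The value 411.31 is used here only because it is the driver version that appears in the installer name cuda_10.0.130_411.31_win10 used in this post.

```python
def parse_version(text):
    """Turn a dotted version string such as '411.31' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True when the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Assumption: 411.31 comes from the installer name used in this post;
# look up the real minimum for your CUDA release in the release notes.
print(meets_minimum("411.31", "411.31"))
print(meets_minimum("398.36", "411.31"))
```

Comparing tuples of integers avoids the pitfalls of comparing version numbers as strings or floats.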

Use the nvidia-smi command to view the GPU information on the current machine:

Current Driver Version on this machine: 411.31
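The same information can be read from a script. The sketch below assumes that nvidia-smi is on the PATH and uses its `--query-gpu` and `--format=csv,noheader` options. The helper names `parse_smi_csv` and `query_gpus` are hypothetical, chosen only for this example.

```python
import subprocess


def parse_smi_csv(text):
    """Parse 'driver_version, name' CSV rows into (driver, name) tuples."""
    rows = []
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        driver, name = (field.strip() for field in line.split(",", 1))
        rows.append((driver, name))
    return rows


def query_gpus():
    """Run nvidia-smi and return parsed rows, or None if it cannot be run."""
    cmd = ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"]
    try:
        out = subprocess.check_output(cmd, universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_smi_csv(out)


if __name__ == "__main__":
    print(query_gpus())
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser on a machine without a GPU.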

The component versions installed on this machine are therefore:

tensorflow-gpu 1.15.0

cuda_10.0.130_411.31_win10

cudnn-10.0-windows10-x64-v7.6.5.32
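The package names themselves encode the versions, so a quick consistency check is possible before installing anything. This is a rough sketch that assumes the file names follow the pattern of the two names listed above. The function names `cuda_installer_info` and `cudnn_package_info` are invented for illustration.

```python
import re


def cuda_installer_info(name):
    """Extract (toolkit version, bundled driver version) from a CUDA installer name."""
    match = re.match(r"cuda_(\d+\.\d+\.\d+)_(\d+\.\d+)_", name)
    return match.groups() if match else None


def cudnn_package_info(name):
    """Extract (target CUDA version, cuDNN version) from a cuDNN archive name."""
    match = re.match(r"cudnn-(\d+\.\d+)-.*-v(\d+(?:\.\d+)*)", name)
    return match.groups() if match else None


toolkit, driver = cuda_installer_info("cuda_10.0.130_411.31_win10")
cuda_target, cudnn = cudnn_package_info("cudnn-10.0-windows10-x64-v7.6.5.32")

# The cuDNN archive should target the same CUDA major.minor as the toolkit.
print(toolkit, driver, cuda_target, cudnn)
print(toolkit.startswith(cuda_target + "."))
```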

1: Install tensorflow 1.15.0 in the activated environment

If the above command does not work, you can also try one of the following commands

conda install --channel https://conda.anaconda.org/anaconda tensorflow-gpu=1.15.0
or
pip install tensorflow-gpu==1.15.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

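2: Install CUDA

CUDA (Compute Unified Device Architecture) is a computing platform and programming model introduced by the GPU vendor NVIDIA.

CUDA is a general-purpose parallel computing architecture from NVIDIA that enables GPUs to solve complex computational problems. It lets developers use the graphics processing unit (GPU) for computation, which can significantly improve computing performance.

If an older version was installed previously, uninstall it first. The version installed previously on this machine was 9.0.

First check the version correspondence at: https://tensorflow.google.cn/install/source_windows#gpu

CUDA download page: https://developer.nvidia.com/cuda-downloads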

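Download a previous release; version 10.1 update 2 was downloaded here. Note that the component list above and the TensorFlow log below both refer to CUDA 10.0 (cuda_10.0.130, cudart64_100.dll), which is the version that tensorflow-gpu 1.15.0 loads.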

Extract and install, then check the version once installation completes:

C:\Users\Administrator>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
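To check the toolkit version from a script, the release can be extracted from this text. The snippet below is a small sketch that assumes the output has the same shape as the lines above. `nvcc_release` is a hypothetical helper name.

```python
import re


def nvcc_release(output):
    """Return (release, full_version) from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+\.\d+), V(\d+\.\d+\.\d+)", output)
    return match.groups() if match else None


sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
"""

print(nvcc_release(sample))
```

On a real machine, the text could be obtained with `subprocess.check_output(["nvcc", "-V"])` instead of the sample string.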

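3: Install cuDNN

NVIDIA cuDNN is a GPU-accelerated library for deep neural networks, designed specifically for deep learning networks such as CNNs; frameworks that depend on it need it to run deep learning computations on the GPU. It emphasizes performance, ease of use, and low memory overhead.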

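GPU training in general does not strictly require cuDNN, but frameworks normally use this acceleration library, and the TensorFlow GPU install guide lists it as a required component.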

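NVIDIA cuDNN can be integrated into higher-level machine learning frameworks, such as Google's TensorFlow and UC Berkeley's popular Caffe.

Its simple drop-in design lets developers focus on designing and implementing neural network models rather than on performance tuning, while still achieving high-performance modern parallel computing on the GPU.

cuDNN depends on CUDA, so CUDA must be installed first.

Download page: https://developer.nvidia.cn/rdp/cudnn-archive. Based on the version information, we download 7.6.5.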

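After the download finishes, extract the archive and copy the three folders inside the extracted cuda folder (bin, include, and lib) into the CUDA installation directory.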
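The copy step can also be done with a short script. This is only a sketch under stated assumptions: the three folders are assumed to be named bin, include, and lib, the function name `merge_cudnn_into_cuda` is invented for this example, and the paths in the comment at the end are placeholders to adapt to your machine.

```python
import shutil
from pathlib import Path


def merge_cudnn_into_cuda(cudnn_root, cuda_root, folders=("bin", "include", "lib")):
    """Copy every file under cudnn_root/<folder> into cuda_root/<folder>."""
    cudnn_root, cuda_root = Path(cudnn_root), Path(cuda_root)
    copied = []
    for folder in folders:
        src_dir = cudnn_root / folder
        if not src_dir.is_dir():
            continue  # assumption: a missing folder is simply skipped
        for src in src_dir.rglob("*"):
            if src.is_file():
                # Keep the same relative path, e.g. lib/x64/<file>.
                dst = cuda_root / src.relative_to(cudnn_root)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(str(src), str(dst))
                copied.append(dst)
    return copied


# Example call (placeholder paths, not taken from a real machine):
# merge_cudnn_into_cuda(
#     r"D:\downloads\cudnn\cuda",
#     r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0",
# )
```

Existing files with the same name are overwritten, and writing under Program Files typically requires an administrator prompt.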

Configure PyCharm and run the test code:

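import os

import tensorflow as tf

if __name__ == '__main__':
    # Set TensorFlow's C++ minimum log level to 2.
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
    # Graph mode is needed for the Session used below.
    tf.compat.v1.disable_eager_execution()

    print(tf.__version__)
    # True when TensorFlow can see a usable GPU.
    print('GPU:', tf.test.is_gpu_available())

    # Build a tiny graph and evaluate it in a session.
    a = tf.constant(1.)
    b = tf.constant(2.)
    sess = tf.compat.v1.Session()

    print(sess.run(a + b))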

The output is as follows; the lines 1.15.0, GPU: True, and 3.0 are the results printed by the program:

D:\ProgramData\Anaconda3\envs\tensorflow\python.exe E:/python/anaconda/main.py
2022-01-05 17:20:50.200423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
1.15.0
2022-01-05 17:20:52.432109: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-01-05 17:20:52.435774: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2022-01-05 17:20:52.574292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce 930M major: 5 minor: 0 memoryClockRate(GHz): 0.941
pciBusID: 0000:01:00.0
2022-01-05 17:20:52.574466: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2022-01-05 17:20:52.577610: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2022-01-05 17:20:52.580471: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2022-01-05 17:20:52.582051: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2022-01-05 17:20:52.586049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2022-01-05 17:20:52.588763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2022-01-05 17:20:52.598200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-01-05 17:20:52.600780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
GPU: True
2022-01-05 17:20:54.022492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-01-05 17:20:54.022637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2022-01-05 17:20:54.022717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2022-01-05 17:20:54.025966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 3048 MB memory) -> physical GPU (device: 0, name: GeForce 930M, pci bus id: 0000:01:00.0, compute capability: 5.0)
2022-01-05 17:20:54.032587: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce 930M major: 5 minor: 0 memoryClockRate(GHz): 0.941
pciBusID: 0000:01:00.0
2022-01-05 17:20:54.032763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2022-01-05 17:20:54.032872: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2022-01-05 17:20:54.032979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2022-01-05 17:20:54.033083: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2022-01-05 17:20:54.033189: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2022-01-05 17:20:54.033298: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2022-01-05 17:20:54.033405: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-01-05 17:20:54.034719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2022-01-05 17:20:54.036856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce 930M major: 5 minor: 0 memoryClockRate(GHz): 0.941
pciBusID: 0000:01:00.0
2022-01-05 17:20:54.037077: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2022-01-05 17:20:54.037245: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2022-01-05 17:20:54.037369: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2022-01-05 17:20:54.037493: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2022-01-05 17:20:54.037610: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2022-01-05 17:20:54.037730: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2022-01-05 17:20:54.037853: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-01-05 17:20:54.045311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2022-01-05 17:20:54.045536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-01-05 17:20:54.045677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2022-01-05 17:20:54.045764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2022-01-05 17:20:54.048339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3048 MB memory) -> physical GPU (device: 0, name: GeForce 930M, pci bus id: 0000:01:00.0, compute capability: 5.0)
3.0

Process finished with exit code 0
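A log like this can be checked automatically to confirm that every CUDA and cuDNN library was loaded. The sketch below takes the expected library names from the log above; for a different CUDA or cuDNN version the names would differ. `loaded_libraries` and `missing_libraries` are hypothetical helper names.

```python
import re

# Library names taken from the log printed in this post (CUDA 10.0, cuDNN 7).
EXPECTED = {
    "cudart64_100.dll", "cublas64_100.dll", "cufft64_100.dll",
    "curand64_100.dll", "cusolver64_100.dll", "cusparse64_100.dll",
    "cudnn64_7.dll",
}


def loaded_libraries(log_text):
    """Collect the names after 'Successfully opened dynamic library'."""
    return set(re.findall(r"Successfully opened dynamic library (\S+)", log_text))


def missing_libraries(log_text, expected=EXPECTED):
    """Return the expected libraries that never appear as loaded in the log."""
    return sorted(set(expected) - loaded_libraries(log_text))


sample = (
    "I tensorflow/stream_executor/platform/default/dso_loader.cc:44] "
    "Successfully opened dynamic library cudart64_100.dll\n"
    "I tensorflow/stream_executor/platform/default/dso_loader.cc:44] "
    "Successfully opened dynamic library cudnn64_7.dll\n"
)
print(missing_libraries(sample))
```

An empty list means that all expected libraries were opened. For the full log above that is the case, while the shortened sample here reports the five libraries it leaves out.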

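Errors encountered and their solutions:

1: Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error

Cause: the driver does not match CUDA; updating the NVIDIA driver resolves it.
Solution: go to the CUDA GPU compatibility page: http://developer.nvidia.com/cuda-gpus
My graphics card is a GeForce 930M

Click through and select the 930M,

then click the Drivers link to go to the driver download page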

=================================

CUDA Device Query, compute capability: 5.0

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\demo_suite>deviceQuery.exe
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce 930M"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 4096 MBytes (4294967296 bytes)
  ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            941 MHz (0.94 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 1048576 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 4 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1, Device0 = GeForce 930M
Result = PASS
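Because the error above comes from a driver that does not match CUDA, it can help to compare the two versions that deviceQuery reports. The following is a minimal sketch that assumes the output contains a "CUDA Driver Version / Runtime Version" line like the one above. The function names are made up for illustration.

```python
import re


def driver_runtime_versions(output):
    """Return (driver, runtime) CUDA versions from deviceQuery output as tuples."""
    match = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+)\.(\d+) / (\d+)\.(\d+)", output
    )
    if not match:
        return None
    d_major, d_minor, r_major, r_minor = (int(g) for g in match.groups())
    return (d_major, d_minor), (r_major, r_minor)


def driver_supports_runtime(output):
    """True when the driver's CUDA version is at least the runtime's version."""
    versions = driver_runtime_versions(output)
    if versions is None:
        return False
    driver, runtime = versions
    return driver >= runtime


sample = "  CUDA Driver Version / Runtime Version          10.0 / 10.0"
print(driver_runtime_versions(sample))
print(driver_supports_runtime(sample))
```

If the driver's CUDA version is lower than the runtime version, updating the NVIDIA driver as described above is the likely fix.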

posted @ 2022-01-03 17:00 南极山