nvidia docker Cannot load libnvcuvid.so.1
Problem
The runtime is packaged with docker. The program needs the GPU and runs fine on the host, but inside the container it fails with:
Cannot load libnvcuvid.so.1
[hevc_cuvid @ 0x559da3fbd80] Failed loading nvcuvid.
terminate called after throwing an instance of 'std::runtime_error'
what(): failed to open avcodec
Aborted (core dumped)
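Investigation
My first reaction was to check the dependencies with ldd, but the program does not depend on libnvcuvid.so.1 directly (presumably it is loaded at run time, which ldd does not show).
The host has /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1.
Inside the container, however, the library file cannot be found, even though the container was started with docker run --gpus all.
Running nvidia-smi inside the container shows the GPU correctly, and CUDA calls work as well.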
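The library check above can be repeated on the host and inside the container with a small helper. This is only a sketch: `lib_visible` is a hypothetical name introduced here, and it assumes `ldconfig` is on the PATH (if it is not, the helper reports the library as missing).

```shell
#!/bin/sh
# lib_visible NAME
# Hypothetical helper: reports whether NAME appears in the dynamic
# linker cache. Run it on the host and again inside the container
# and compare the two results.
lib_visible() {
    # ldconfig -p prints every library known to the linker cache
    if ldconfig -p 2>/dev/null | grep -q -F "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# "|| true" keeps the script going when the library is absent
lib_visible libnvcuvid.so.1 || true
```

If the host prints "found" and the container prints "missing", the driver library is not being mounted into the container, which matches the symptom described here.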
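Analysis
Since CUDA works, a problem with the container-toolkit itself can be ruled out. Digging deeper, the avcodec library that the application node depends on needs the GPU's video codec.
Codec support is not enabled by default. The Driver Capability list is as follows: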
Driver Capability | Description |
---|---|
compute | required for CUDA and OpenCL applications. |
compat32 | required for running 32-bit applications. |
graphics | required for running OpenGL and Vulkan applications. |
utility | required for using nvidia-smi and NVML. |
video | required for using the Video Codec SDK. |
display | required for leveraging X11 display. |
empty or unset | default driver capability: utility, compute |
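To make the table concrete, here is a minimal sketch that checks whether a capability list enables `video`. `has_capability` is a hypothetical helper, not part of any NVIDIA tool. It assumes a comma-separated list without spaces, treats an empty value as `utility,compute` as in the last table row, and also accepts the value `all`, which is not listed in the table.

```shell
#!/bin/sh
# has_capability LIST NAME
# Hypothetical helper: succeeds if NAME is enabled by the
# comma-separated LIST. An empty LIST falls back to
# "utility,compute" (the default from the table); "all" enables
# every capability.
has_capability() {
    list=${1:-utility,compute}
    # Wrap the list in commas so each entry can be matched whole
    case ",$list," in
        *,all,*|*,"$2",*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_capability "${NVIDIA_DRIVER_CAPABILITIES:-}" video; then
    echo "video capability enabled"
else
    echo "video capability missing: libnvcuvid.so.1 will not be available"
fi
```

Run inside the container, this only reflects the environment-variable route. Capabilities passed through `--gpus` may not show up in the container's environment, so checking the linker cache for libnvcuvid remains the more direct test.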
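Solution
Enable the video capability when starting the container, either through the NVIDIA_DRIVER_CAPABILITIES environment variable or through the capabilities option of --gpus: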
$ docker run --rm --runtime=nvidia -e NVIDIA_DRIVER_CAPABILITIES=compute,video,utility nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu18.04 ldconfig -p | grep libnvcuvid
libnvcuvid.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
$ docker run --rm --gpus 'all,"capabilities=compute,video,utility"' nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu18.04 ldconfig -p | grep libnvcuvid
libnvcuvid.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1