nvidia docker Cannot load libnvcuvid.so.1

Problem

The runtime is packaged with Docker. The program needs the GPU and runs fine on the host, but inside the container it fails with:

Cannot load libnvcuvid.so.1
[hevc_cuvid @ 0x559da3fbd80] Failed loading nvcuvid.
terminate called after throwing an instance of 'std::runtime_error'
  what(): failed to open avcodec
Aborted (core dumped)

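Troubleshooting

My first instinct was to check the dependencies with ldd: the program does not link directly against libnvcuvid.so.1.
The host has /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1.
Inside the container, however, the library file cannot be found, even though the container was started with docker run --gpus all.
Running nvidia-smi in the container shows the GPU correctly, and CUDA calls work as well.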
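
The library check above can be repeated on both sides with a small helper. This is only a sketch: check_lib is a hypothetical name, and it assumes ldconfig is on the PATH. Since ldd shows no direct dependency, the library is presumably loaded at runtime (for example via dlopen), so looking it up in the linker cache is more telling than ldd.

```shell
#!/bin/sh
# check_lib NAME: report whether the dynamic linker cache lists NAME.
# Hypothetical helper for illustration; prints "found: NAME" or "missing: NAME".
check_lib() {
    # grep -F matches the name literally; -q suppresses the matching lines.
    if ldconfig -p 2>/dev/null | grep -q -F -- "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Run the same check on the host and inside the container.
check_lib libnvcuvid.so.1
```

On the host in this case it reports the library as found, while the same check inside a container started with only --gpus all reports it as missing.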

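Analysis

Since CUDA works, a problem with the container-toolkit itself can be ruled out. Digging deeper, the avcodec library that the application node depends on needs the GPU's video codec.
Codec support is not enabled by default. The driver capabilities are listed below: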

Driver Capability    Description
compute              required for CUDA and OpenCL applications.
compat32             required for running 32-bit applications.
graphics             required for running OpenGL and Vulkan applications.
utility              required for using nvidia-smi and NVML.
video                required for using the Video Codec SDK.
display              required for leveraging X11 display.
(empty or unset)     default driver capability: utility, compute
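
To make the default explicit, here is a minimal sketch of a check that could run at container start to see whether a capability is enabled. has_capability is a hypothetical helper, not part of any NVIDIA tooling. It assumes the capability list is a comma-separated value in NVIDIA_DRIVER_CAPABILITIES, that empty or unset means utility,compute as in the table, and that the value all enables every capability.

```shell
#!/bin/sh
# has_capability CAP [LIST]: succeed if CAP is enabled by LIST.
# LIST defaults to $NVIDIA_DRIVER_CAPABILITIES (hypothetical helper).
has_capability() {
    cap=$1
    list=${2-${NVIDIA_DRIVER_CAPABILITIES-}}
    # Empty or unset falls back to the documented default.
    [ -n "$list" ] || list="utility,compute"
    # Wrap in commas so "video" does not match a longer name by accident.
    case ",$list," in
        *",all,"* | *",$cap,"*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_capability video; then
    echo "video capability enabled"
else
    echo "video capability not enabled; libnvcuvid.so.1 will not be mounted"
fi
```

With the variable unset, the check takes the second branch, which matches the failure seen here: the default list contains no video entry.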

Solution

$ docker run --rm --runtime=nvidia -e NVIDIA_DRIVER_CAPABILITIES=compute,video,utility nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu18.04 ldconfig -p | grep libnvcuvid
	libnvcuvid.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1

$ docker run --rm --gpus 'all,"capabilities=compute,video,utility"' nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu18.04 ldconfig -p | grep libnvcuvid
	libnvcuvid.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1

References

User Guide — container-toolkit 1.10.0 documentation

posted @ 2024-07-09 10:42  azureology