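Fixing CuPy-Related Errors

Background

This post is just a record of some errors you may run into when installing and using CuPy on its own, along with their fixes. Some of them are environment problems encountered during setup.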

libnvrtc issues

Error message:

RuntimeError: CuPy failed to load libnvrtc.so.11.2: OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory

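Solution:

$ sudo find / -name "libnvrtc.so.11.2"
$ export LD_LIBRARY_PATH=xxx/lib/:$LD_LIBRARY_PATH

Here xxx/lib/ stands for the directory reported by find. Once LD_LIBRARY_PATH includes the directory that actually contains libnvrtc, the error goes away.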
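The search-then-export steps above can also be sketched in Python. This is a minimal illustration, not part of CuPy: the helper names find_library_dirs and missing_from_ld_library_path are hypothetical, and /usr/local/cuda is only an assumed search root that you should replace with a location that makes sense on your machine.

```python
import os
from pathlib import Path


def find_library_dirs(lib_name, roots):
    """Return the directories under `roots` that contain a file named `lib_name`."""
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if lib_name in filenames:
                found.append(Path(dirpath))
    return found


def missing_from_ld_library_path(lib_dirs, env=None):
    """Return the entries of `lib_dirs` that are not listed in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    raw = env.get("LD_LIBRARY_PATH", "")
    current = {Path(p) for p in raw.split(os.pathsep) if p}
    return [d for d in lib_dirs if d not in current]


if __name__ == "__main__":
    # "/usr/local/cuda" is an assumed search root, not taken from the post.
    dirs = find_library_dirs("libnvrtc.so.11.2", ["/usr/local/cuda"])
    for d in missing_from_ld_library_path(dirs):
        # Print the export line to run in the shell before starting Python.
        print(f"export LD_LIBRARY_PATH={d}:$LD_LIBRARY_PATH")
```

The script only prints the export command instead of modifying the environment itself, since the variable has to be set in the shell before the Python process that imports CuPy is started.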

cuVS issues

Error message:

RuntimeError: cuVS >= 24.12 or pylibraft < 24.12 should be installed to use this feature

Solution:

$ python3 -m pip install --upgrade cuvs-cu11==24.12.*
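Before or after upgrading, it can help to see which versions are actually installed. The following is a small sketch using only the standard library. The helper name installed_versions is hypothetical. The distribution name cuvs-cu11 comes from the pip command above, while pylibraft-cu11 and cupy-cuda11x are assumed names for a CUDA 11 environment and may differ on your system.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "pylibraft-cu11" and "cupy-cuda11x" are assumptions; adjust as needed.
    names = ["cuvs-cu11", "pylibraft-cu11", "cupy-cuda11x"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

If the output shows pylibraft at 24.12 or later while cuVS is missing or older, that matches the condition named in the error message.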

cdist computation error

Error message:

Traceback (most recent call last):
    dis = cdist(middle, grid)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/cupyx/scipy/spatial/distance.py", line 630, in cdist
    pairwise_distance(XA, XB, output_arr, metric, p)
  File "resources.pyx", line 110, in cuvs.common.resources.auto_sync_resources.wrapper
  File "/root/miniconda3/lib/python3.12/site-packages/pylibraft/common/outputs.py", line 83, in wrapper
    ret_value = f(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^
  File "distance.pyx", line 140, in cuvs.distance.distance.pairwise_distance
  File "exceptions.pyx", line 37, in cuvs.common.exceptions.check_cuvs
cuvs.common.exceptions.CuvsException: CUDA error encountered at: file=/pyenv/versions/3.12.9/lib/python3.12/site-packages/libraft/include/raft/linalg/detail/coalesced_reduction-inl.cuh line=271: call='cudaPeekAtLastError()', Reason=cudaErrorInvalidConfiguration:invalid configuration argument
Obtained 26 stack frames
#1 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: raft::cuda_error::cuda_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) +0xbd [0x7ff1f5dfca1d]
#2 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0x4310f5) [0x7ff1f5c450f5]
#3 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0xf383ac) [0x7ff1f674c3ac]
#4 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0xf3865f) [0x7ff1f674c65f]
#5 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: void cuvs::distance::pairwise_distance<float, int, float>(raft::resources const&, float const*, float const*, float*, int, int, int, cuvsDistanceType, bool, float) +0x3b6 [0x7ff1f679d426]
#6 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: void cuvs::distance::pairwise_distance<float, std::experimental::layout_right, int, float>(raft::resources const&, std::experimental::mdspan<float const, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float const>, (raft::memory_type)2> >, std::experimental::mdspan<float const, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float const>, (raft::memory_type)2> >, std::experimental::mdspan<float, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float>, (raft::memory_type)2> >, cuvsDistanceType, float) +0x3c7 [0x7ff1f679da87]
#7 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs_c.so(+0x7a1a3) [0x7ff29d8e91a3]
#8 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs_c.so: cuvsPairwiseDistance +0x37 [0x7ff29d8e9837]
#9 in /root/miniconda3/lib/python3.12/site-packages/cuvs/distance/distance.cpython-312-x86_64-linux-gnu.so(+0x48237) [0x7ff29e4e6237]
#10 in python3: _PyObject_Call +0x122 [0x55e162]
#11 in python3: _PyEval_EvalFrameDefault +0x503a [0x52d84a]
#12 in python3: PyVectorcall_Call +0xe1 [0x4dcf63]
#13 in /root/miniconda3/lib/python3.12/site-packages/cuvs/common/resources.cpython-312-x86_64-linux-gnu.so(+0x4850e) [0x7ff29e48850e]
#14 in python3: _PyObject_MakeTpCall +0x2fb [0x51e38b]
#15 in python3: _PyEval_EvalFrameDefault +0x6ce [0x528ede]
#16 in python3: PyEval_EvalCode +0xae [0x5e581e]
#17 in python3() [0x60bfd7]
#18 in python3() [0x6071c7]
#19 in python3() [0x61f452]
#20 in python3: _PyRun_SimpleFileObject +0x1b0 [0x61ed90]
#21 in python3: _PyRun_AnyFileObject +0x43 [0x61eb83]
#22 in python3: Py_RunMain +0x303 [0x617c93]
#23 in python3: Py_BytesMain +0x39 [0x5d03c9]
#24 in /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff2f3fc3d90]
#25 in /lib/x86_64-linux-gnu/libc.so.6: __libc_start_main +0x80 [0x7ff2f3fc3e40]
#26 in python3() [0x5d01f9]

Cause: a 0-dimensional array was passed to cdist. Make sure the inputs have the expected shape before calling it.
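Because the CUDA-level error gives no hint about the shapes, a cheap check on the Python side can surface the problem earlier. The sketch below is a hypothetical helper (check_cdist_inputs is not part of CuPy or cuVS). It assumes cdist follows the SciPy convention of two 2-D inputs with the same number of columns, and it only relies on the .ndim and .shape attributes that both NumPy and CuPy arrays provide.

```python
def check_cdist_inputs(xa, xb):
    """Raise ValueError unless both inputs are 2-D with matching column counts.

    Works with any array type exposing `.ndim` and `.shape`.
    """
    for name, arr in (("XA", xa), ("XB", xb)):
        if arr.ndim != 2:
            raise ValueError(
                f"{name} must be 2-D (n_points, n_features), "
                f"got ndim={arr.ndim}, shape={tuple(arr.shape)}"
            )
    if xa.shape[1] != xb.shape[1]:
        raise ValueError(
            f"XA and XB must have the same number of columns, "
            f"got {xa.shape[1]} and {xb.shape[1]}"
        )
```

Calling check_cdist_inputs(middle, grid) right before cdist(middle, grid) would turn the long stack trace above into a one-line ValueError that names the offending argument.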

libffi issues

Error message:

$ git pull
/usr/lib/git-core/git-remote-https: symbol lookup error: /lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0

Solution:

$ sudo find / -name "libffi*"
xxx/lib/libffi.so.7

Once you have located the shared library file, back it up first:

$ mv xxx/lib/libffi.so.7 xxx/lib/libffi.so.7.bak

Then create a symbolic link to the system copy:

$ sudo ln -s /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0 xxx/lib/libffi.so.7

Make sure the directory containing that shared library is on the library search path:

$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
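The backup and symlink steps above can be combined into one small function. This is only a sketch of the same mv and ln -s sequence: the helper name backup_and_symlink is hypothetical, and the paths you pass in are whatever find reported on your system. It refuses to run if a backup already exists, so that an earlier backup is not overwritten.

```python
import os
from pathlib import Path


def backup_and_symlink(link_path, target, suffix=".bak"):
    """Rename `link_path` to `link_path + suffix`, then symlink it to `target`.

    Returns the path of the backup file.
    """
    link_path = Path(link_path)
    backup = link_path.with_name(link_path.name + suffix)
    if backup.exists() or backup.is_symlink():
        raise FileExistsError(f"backup already exists: {backup}")
    link_path.rename(backup)       # same as: mv link_path link_path.bak
    os.symlink(target, link_path)  # same as: ln -s target link_path
    return backup
```

Writing into a system or environment directory may require elevated permissions, just as the shell commands do.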

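Summary

This post records some problems that may come up when using CuPy from Python. Some are environment configuration issues, and others are caused by invalid input at runtime.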

Copyright notice

This post was first published at: https://www.cnblogs.com/dechinphy/p/cupy-error.html

Author ID: DechinPhy

More original articles: https://www.cnblogs.com/dechinphy/

Buy the author a coffee: https://www.cnblogs.com/dechinphy/gallery/image/379634.html

Posted on 2025-04-25 15:53 by DECHIN