nvidia errors in TensorFlow

Go to check GPU memory usage, when encountering nvidia or cuda error.

I’ve been worked with tensorflow-gpu for a while. When run the codes, nvidia reports kinds of errors without clear instruction some time.

As far as my knowledge, met a nvidia error, one should check whether the gpu memory is occupied by other process by

nvidia-smi -l

As the following screenshot shows, the GPU-util is 0%, but the memory is nearly used out. If one run a new tensorflow process, errors are likely to be reported.
If you are coding with pycharm, there is scitific mode. In this mode, the process will not exit automatically, so the tensorflow session remains consuming the GPU memrory.

这里写图片描述

error example

E tensorflow/stream_executor/cuda/cuda_blas.cc:366] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED

posted on   yusisc  阅读(23)  评论(0编辑  收藏  举报

编辑推荐:
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 记一次.NET内存居高不下排查解决与启示
阅读排行:
· 阿里最新开源QwQ-32B,效果媲美deepseek-r1满血版,部署成本又又又降低了!
· 开源Multi-agent AI智能体框架aevatar.ai,欢迎大家贡献代码
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· AI技术革命,工作效率10个最佳AI工具

导航

< 2025年3月 >
23 24 25 26 27 28 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5
点击右上角即可分享
微信分享提示