重置GPU显存 Reset GPU memory after CUDA errors
Sometimes CUDA program crashed during execution, before memory was flushed. As a result, device memory remained occupied.
There are some solutions:
1.
Try using:
nvidia-smi --gpu-reset
nvidia-smi -r
2.
Although it should be unecessary to do this in anything other than exceptional circumstances, the recommended way to do this on linux hosts is to unload the nvidia driver by doing
sudo rmmod nvidia
with suitable root privileges and then reloading it with
sudo modprobe nvidia
If the machine is running X11, you will need to stop this manually beforehand, and restart it afterwards. The driver intialisation processes should eliminate any prior state on the device.
This answer has been assembled from comments and posted as a community wiki to get this question off the unanswered list for the CUDA tag
3.
This methods working for me:
check what is using your GPU memory with
sudo fuser -v /dev/nvidia*
Your output will look something like this:
USER PID ACCESS COMMAND /dev/nvidia0: root 1256 F...m Xorg username 2057 F...m compiz username 2759 F...m chrome username 2777 F...m chrome username 20450 F...m python username 20699 F...m python
Then kill the PID that you no longer need on htop
or with
sudo kill -9 PID.
4.
Or simply reboot:
sudo reboot
欢迎转载,转载请保留页面地址。帮助到你的请点个推荐。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 别再用vector<bool>了!Google高级工程师:这可能是STL最大的设计失误
· 单元测试从入门到精通