1.0 Block I/O
Because Flask handles each request in a blocking way, calling torch.cuda.empty_cache() inside the request handler cannot free the GPU memory used by the pipeline: the handler is still running and still holds references to it.
Eventually, I found the solution: create a new thread for the SD pipeline.
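To make the problem concrete, here is a minimal sketch (hypothetical, not code from the project) of the blocking pattern. The pipeline is created inside the request handler, so its CUDA allocations are still referenced when empty_cache() runs and nothing can be freed; the model id and prompt are placeholders.

import torch
from diffusers import StableDiffusionPipeline
from flask import Flask

app = Flask(__name__)

@app.route("/", methods=["POST"])
def index():
    # The handler blocks here; pipe and image keep their CUDA memory alive.
    pipe = StableDiffusionPipeline.from_pretrained(
        "YOUR_MODEL_ID", torch_dtype=torch.float16).to('cuda')
    image = pipe("YOUR_PROMPT").images[0]
    # Ineffective: empty_cache() only frees cached blocks that nothing
    # references, and everything above is still referenced in this frame.
    torch.cuda.empty_cache()
    return 'done'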
2.0 Create new thread
import gc
import json
import threading

import torch
from PIL import Image
from diffusers import StableDiffusionPipeline
from flask import Flask, request, render_template

app = Flask(__name__, static_url_path='', static_folder='', template_folder='')


@app.route("/", methods=["POST"])
def index():
    # 1.0 thread: the SD pipeline lives entirely inside this thread, so all
    # of its CUDA references are dropped when the thread exits
    class MyThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)

        # 1.1 pipeline
        def run(self):
            # 1.2 clean the CUDA cache before running the pipeline
            if torch.cuda.is_available():
                gc.collect()
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
            model_id, prompt = "YOUR_MODEL_ID", "YOUR_PROMPT"
            pipe = StableDiffusionPipeline.from_pretrained(
                model_id, torch_dtype=torch.float16).to('cuda')
            image = pipe(prompt).images[0]

    # 2.0 create and start 5 pipeline threads
    threads = []
    for i in range(5):
        threads.append(MyThread())
        threads[i].start()

    return app.response_class(response=json.dumps({'status': 'success'}),
                              status=200, mimetype='application/json')


if __name__ == '__main__':
    app.debug = True
    app.run(host='0.0.0.0', port=82)
PS: This code is simplified from my project and has not been tested.
First (1.0), define the pipeline thread class and create instances of it.
Second (1.2), clean the CUDA cache before running the pipeline.
Finally (2.0), start the pipeline threads.
If the pipeline runs in its own thread, torch.cuda.empty_cache() can actually release the CUDA memory once that thread finishes.
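As a quick sanity check, a sketch along these lines (hypothetical helper, assuming a CUDA device) compares allocated memory before and after the worker thread exits; once the thread is joined, its locals are gone and empty_cache() can return the cached blocks to the driver.

import gc
import threading
import torch

def generate():
    # ... load the pipeline and run inference here, as in the snippet above ...
    pass

mib = lambda: torch.cuda.memory_allocated() // 1024 ** 2

before = mib()
worker = threading.Thread(target=generate)
worker.start()
worker.join()                 # thread frame and its locals are released here
gc.collect()
torch.cuda.empty_cache()
print(f"allocated before: {before} MiB, after: {mib()} MiB")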
3.0 Project Code
https://github.com/kenny-chen/ai.diffusers