1.0 Blocking I/O
Because Flask handles each request as blocking I/O, even calling torch.cuda.empty_cache() inside the request handler could not release the GPU memory once the Stable Diffusion pipeline had run there.
Eventually, I found that the solution is to create a new thread for the SD pipeline. The failing pattern is sketched below; the fix follows in 2.0.
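For reference, this is roughly the pattern that did not work for me (a simplified, untested sketch; the model ID, prompt, and port are placeholders): the pipeline runs directly in the Flask request thread, and the memory stays allocated even after cleanup.

import gc
import torch
from diffusers import StableDiffusionPipeline
from flask import Flask

app = Flask(__name__)

@app.route("/", methods=["POST"])
def index():
    # the pipeline runs directly in the request thread
    pipe = StableDiffusionPipeline.from_pretrained(
        "YOUR_MODEL_ID", torch_dtype=torch.float16).to('cuda')
    image = pipe("YOUR_PROMPT").images[0]
    del pipe
    gc.collect()
    torch.cuda.empty_cache()  # in my case, GPU memory was still not released here
    return "done"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=82)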
2.0 Create a new thread
import threading
import torch
import json
import gc
from PIL import Image
from diffusers import StableDiffusionPipeline
from flask import Flask, request, render_template

app = Flask(__name__, static_url_path='', static_folder='', template_folder='')

@app.route("/", methods=["POST"])
def index():
    # 1.0 thread class for the pipeline
    class MyThread(threading.Thread):
        def __init__(self, thread_id):
            threading.Thread.__init__(self)
            self.thread_id = thread_id

        # 1.1 run the pipeline
        def run(self):
            # 1.2 clean the CUDA space before running the pipeline
            if torch.cuda.is_available():
                gc.collect()
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
            model_id, prompt = "YOUR_MODEL_ID", "YOUR_PROMPT"
            pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to('cuda')
            image = pipe(prompt).images[0]
            image.save(f"output_{self.thread_id}.png")  # persist the result (path is a placeholder)

    # 2.0 create and start 5 pipeline threads
    threads = []
    for i in range(5):
        threads.append(MyThread(i))
        threads[i].start()

    return app.response_class(response=json.dumps({'status': 'success'}), status=200, mimetype='application/json')

if __name__ == '__main__':
    app.debug = True
    app.run(host='0.0.0.0', port=82)
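Once the server is running, a plain POST triggers the pipeline threads. A hypothetical client sketch (host and port match the app.run() call above):

import requests

# hypothetical client: POST to the Flask endpoint started above
resp = requests.post("http://localhost:82/")
print(resp.json())  # expected: {'status': 'success'}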
PS: This code is simplified from my project and has not been tested.
First, 1.0 defines the pipeline thread class.
Second, 1.2 cleans the CUDA space before running the pipeline.
Finally, 2.0 creates and starts the pipeline threads.
If the pipeline runs in a new thread, the CUDA space can be released by torch.cuda.empty_cache() after the thread exits.
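To check this, compare torch.cuda.memory_allocated() after the pipeline thread has finished. A standalone sketch to illustrate, not taken from the project (YOUR_MODEL_ID and YOUR_PROMPT are placeholders):

import gc
import threading
import torch
from diffusers import StableDiffusionPipeline

def run_pipeline():
    # all pipeline references live only inside this thread
    pipe = StableDiffusionPipeline.from_pretrained(
        "YOUR_MODEL_ID", torch_dtype=torch.float16).to('cuda')
    pipe("YOUR_PROMPT").images[0]

t = threading.Thread(target=run_pipeline)
t.start()
t.join()  # wait for the pipeline thread to exit

gc.collect()
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
print(torch.cuda.memory_allocated())  # should drop back close to zero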
3.0 Project Code
https://github.com/kenny-chen/ai.diffusers