First Hands-On with the Latent Diffusion Model
This walkthrough installs Stable Diffusion on an Ubuntu server inside a miniconda virtual environment, and uses VSCode remote development from Windows 10 for debugging and running.
Note: when connecting to the server with VSCode remote development, relative paths in the code are resolved against the folder you opened, not the directory of the .py file itself.
Copying the Source Code
GitHub repository address: simply download the compressed archive directly from the repository page.
Extract it into a directory of your choice; the project file structure looks like this:
Virtual Environment
Following the repository's README.md, the commands to create the virtual environment are:
conda env create -f environment.yaml
conda activate ldm
However, quite a few problems came up along the way, summarized below:
- The environment setup clones projects from GitHub, so make sure git is installed and on the PATH before you start.
- Run the commands above from the project root; before running, check that the current directory contains both environment.yaml and setup.py (see the command sketch after this list).
- During environment creation, two projects, clip and taming-transformers, are cloned from GitHub, installed, and placed under the project directory's /src. This takes quite a while, so be patient.
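Putting the notes above together, a typical setup session looks roughly like the following. This is only a sketch: the directory name latent-diffusion-main is an assumption based on the unpacked archive.

```bash
# git must be installed and on PATH, because environment.yaml pulls
# clip and taming-transformers from GitHub into ./src during creation
git --version

cd latent-diffusion-main            # assumed project root from the unpacked zip

# both files must be present in the current directory
ls environment.yaml setup.py

# create and activate the conda environment (cloning and installing the
# two sub-projects can take quite a while)
conda env create -f environment.yaml
conda activate ldm
```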
Running a .py File with Command-Line Arguments in VSCode
Create a launch.json file in the Run and Debug panel and specify args inside configurations; the concrete parameters are given in the repository's README.md. The format is:
"--arg1", "value_1",
"--arg2", "value_2",
Then open the .py file you want to execute and click the green run button, or simply press F5. The command-line arguments you just configured are echoed in the Terminal.
The run output looks like this:
txt2img
The generated results are saved under the output/txt2img-samples directory.
Looking at the generated images, the prompt was: a happy bear reading a newspaper, oil on canvas
Trying a different prompt: A german shepherd dog sits on the edge of a vast wheat field at sunset, realistic style. The generated results are shown below:
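For comparison, the equivalent direct command line (run from the project root with the ldm environment active) looks roughly like this; the sampling flags follow the README's txt2img example and are assumptions, not necessarily the exact values used for the images above.

```bash
conda activate ldm
python scripts/txt2img.py \
    --prompt "A german shepherd dog sits on the edge of a vast wheat field at sunset, realistic style" \
    --ddim_eta 0.0 \
    --n_samples 4 \
    --n_iter 4 \
    --scale 5.0 \
    --ddim_steps 50
```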
img2img
Given an initial image, modify it according to the prompt.
The command-line arguments are as follows:
"--prompt", "Teddy bear sitting on a park bench",
"--init-img", "/data4/stable_diffusion_reconstruction/decoded/image-cvpr/org_img/paper_imgs/00095_org.png",
"--n_iter", "2",
"--n_samples", "2",
Initial image:
Run output:
Resulting images:
Problem Log
Problem 1: packaging.version.InvalidVersion: Invalid version: '0.10.1,<0.11'
Issue: https://github.com/CompVis/latent-diffusion/issues/207
Fix:
pip install packaging==21.3
pip install 'torchmetrics<0.8'
Problem 2: ValueError: Connection error, and we cannot find the requested files in the cached path (raised by BertTokenizerFast.from_pretrained("bert-base-uncased"))
Reference: https://github.com/huggingface/transformers/issues/25111
Fix: download the following files from https://huggingface.co/bert-base-uncased/tree/main:
- config.json
- vocab.txt
- pytorch_model.bin
tokenizer = BertTokenizerFast.from_pretrained(args.text_encoder)
is changed to
tokenizer = BertTokenizerFast.from_pretrained("./bert_localpath/")
./bert_localpath/ is the directory where I put the files listed above. (Note: I used an absolute path here; a relative path still raised the error.)
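Putting the fix together, the loading code ends up looking roughly like this minimal sketch (bert_localpath is simply the folder holding the three downloaded files):

```python
import os
from transformers import BertTokenizerFast

# Folder containing config.json, vocab.txt and pytorch_model.bin
# downloaded from https://huggingface.co/bert-base-uncased/tree/main.
# Resolve it to an absolute path -- a relative path still raised the error for me.
local_bert = os.path.abspath("./bert_localpath/")
tokenizer = BertTokenizerFast.from_pretrained(local_bert)
```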
Problem 3: CUDA error: out of memory
Exception has occurred: RuntimeError
CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
File "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/scripts/txt2img.py", line 28, in load_model_from_config
model.cuda()
File "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/scripts/txt2img.py", line 108, in <module>
model = load_model_from_config(config, "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/models/ldm/text2img-large/model.ckpt") # TODO: check path
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Reference: https://www.codetd.com/ru/article/14935168
By default GPU 0 is used, but GPU 0 was already occupied, so the default GPU index has to be changed in the code. This change must be made before import torch:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = '1'
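A minimal sketch of how this looks in the script; the key point is that the environment variable must be set before torch is imported for the first time:

```python
import os

# Use GPU 1 instead of the (occupied) default GPU 0.
# This must run before the first `import torch`.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch

# Only one device is now visible, and it appears to PyTorch as cuda:0
print(torch.cuda.device_count())      # expected: 1
print(torch.cuda.get_device_name(0))
```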