First Experience with the Latent Diffusion Model

The setup here installs Stable Diffusion on an Ubuntu server inside a miniconda virtual environment; debugging and running are done from Windows 10 through VSCode remote development.

Note: when developing and debugging on the server remotely through VSCode, relative paths in the code are resolved against the folder opened in VSCode, not against the directory of the current .py file.
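A simple way to sidestep this pitfall is to build paths from the script's own location instead. A minimal sketch (the file names here are just for illustration):

import os

# Directory that contains this .py file, regardless of which folder VSCode opened
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))

# Hypothetical example: a config file that lives next to this script
config_path = os.path.join(SCRIPT_DIR, "configs", "example.yaml")
print(config_path)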

Getting the Source Code

From the GitHub repository page, simply download the zip archive.

(screenshot: downloading the zip from the GitHub repository page)

Extract it to a directory of your choice; the project file structure looks like this:

(screenshot: project file structure)

Virtual Environment

Following the method in the repository's README.md, the virtual environment is created with the following commands:

conda env create -f environment.yaml
conda activate ldm

But quite a few problems came up along the way, summarized below:

  1. Environment setup clones projects from GitHub and therefore needs git, so first make sure git is installed and on the PATH.

    (screenshot: checking the git installation)

  2. The commands above must be run inside the project directory; before running them, check that the directory contains both environment.yaml and setup.py.

    (screenshot: project directory listing)

  3. During setup, the clip and taming-transformers projects are cloned from GitHub and installed under src/ in the project directory; this takes quite a while, so be patient.

    (screenshot: cloning clip and taming-transformers)
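Once the environment is finally created, a quick import check confirms the key packages are usable. A minimal sketch (the package list is my assumption of what environment.yaml installs):

# Run inside the activated `ldm` environment
import torch
import pytorch_lightning as pl
import omegaconf
import taming   # installed from taming-transformers under src/
import clip     # installed from OpenAI CLIP under src/

print("torch", torch.__version__, "cuda available:", torch.cuda.is_available())
print("pytorch-lightning", pl.__version__)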

Running a .py File with Command-Line Arguments in VSCode

Create a launch.json file from the Run and Debug panel and specify args under configurations; the specific arguments are listed in the repository's README.md.

The format is:

"--arg1", "value_1",
"--arg2", "value_2",
(screenshot: launch.json configuration)

Then open the .py file you want to run and click the green run button, or just press F5. The command-line arguments you configured are echoed in the Terminal.

(screenshot: arguments echoed in the Terminal)

The run output looks like this:

(screenshot: run output)
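To see concretely how the args in launch.json reach the program, a tiny test script is enough. A minimal sketch (check_args.py is a hypothetical name, and --arg1/--arg2 are the placeholder names from the format above; the real scripts parse their options with argparse in the same way):

# check_args.py -- run it under the same launch.json configuration
import sys
import argparse

print("sys.argv:", sys.argv)

parser = argparse.ArgumentParser()
parser.add_argument("--arg1", type=str, default=None)
parser.add_argument("--arg2", type=str, default=None)
opt = parser.parse_args()
print("parsed:", opt.arg1, opt.arg2)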

txt2img

The generated results are saved under the output/txt2img-samples directory.

The generated image for the prompt a happy bear reading a newspaper, oil on canvas:

(generated image)

Trying a different prompt: A german shepherd dog sits on the edge of a vast wheat field at sunset, realistic style. The result:

(generated image)
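Each run writes several sample PNGs, and I found it convenient to stitch them into a single contact sheet for quick browsing. A minimal PIL sketch (the directory follows the output path mentioned above; the grid layout is arbitrary):

import glob
from PIL import Image

# Collect all generated PNGs under the output directory
paths = sorted(glob.glob("output/txt2img-samples/**/*.png", recursive=True))
assert paths, "no samples found -- check the output directory"

imgs = [Image.open(p).convert("RGB") for p in paths]
w, h = imgs[0].size          # assume all samples share one resolution
cols = 4
rows = (len(imgs) + cols - 1) // cols

sheet = Image.new("RGB", (cols * w, rows * h), "white")
for i, img in enumerate(imgs):
    sheet.paste(img.resize((w, h)), ((i % cols) * w, (i // cols) * h))
sheet.save("samples_sheet.png")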

img2img

img2img takes a given initial image and modifies it according to a prompt.

The command-line arguments are:

"--prompt", "Teddy bear sitting on a park bench",
"--init-img", "/data4/stable_diffusion_reconstruction/decoded/image-cvpr/org_img/paper_imgs/00095_org.png",
"--n_iter", "2",
"--n_samples", "2",

Initial image

(initial image)

Run output

(screenshot: run output)

Result

(generated image)
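One thing I paid attention to was the init image itself: I cropped and resized it to a square before passing it to --init-img. This preprocessing is my own habit rather than something the README requires (the script may well resize internally). A minimal PIL sketch:

from PIL import Image

TARGET = 512   # assumed working resolution; adjust to whatever the checkpoint expects

img = Image.open("00095_org.png").convert("RGB")   # file name taken from --init-img above

# Center-crop to a square, then resize to the target resolution
side = min(img.size)
left = (img.width - side) // 2
top = (img.height - side) // 2
img = img.crop((left, top, left + side, top + side)).resize((TARGET, TARGET))
img.save("00095_init.png")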

Problem Log

Problem 1: packaging.version.InvalidVersion: Invalid version: '0.10.1,<0.11'

Issue: https://github.com/CompVis/latent-diffusion/issues/207

Solution:

pip install packaging==21.3
pip install 'torchmetrics<0.8'
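After downgrading, a quick check that the installed versions actually match (a minimal sketch):

from importlib.metadata import version

print(version("packaging"))     # expect 21.3
print(version("torchmetrics"))  # expect something below 0.8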

Problem 2: ValueError: Connection error, and we cannot find the requested files in the cached path (raised by BertTokenizerFast.from_pretrained("bert-base-uncased"))

Reference: https://github.com/huggingface/transformers/issues/25111

Solution:

Download the following files from https://huggingface.co/bert-base-uncased/tree/main:

  • config.json
  • vocab.txt
  • pytorch_model.bin

Then change

tokenizer = BertTokenizerFast.from_pretrained(args.text_encoder)

to

tokenizer = BertTokenizerFast.from_pretrained("./bert_localpath/")

./bert_localpath/ is the directory where I put the files above. (Note: I actually used an absolute path here; a relative path still raised the error.)
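Putting the workaround together, resolving the directory to an absolute path as noted above (the directory name is just the one I happened to use):

import os
from transformers import BertTokenizerFast

# Local directory holding config.json, vocab.txt and pytorch_model.bin
bert_dir = os.path.abspath("./bert_localpath/")
tokenizer = BertTokenizerFast.from_pretrained(bert_dir)
print(tokenizer.tokenize("a happy bear reading a newspaper"))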

Problem 3: CUDA error: out of memory

Exception has occurred: RuntimeError
CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
  File "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/scripts/txt2img.py", line 28, in load_model_from_config
    model.cuda()
  File "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/scripts/txt2img.py", line 108, in <module>
    model = load_model_from_config(config, "/mnt/disk6_brain/latent_diffusion/latent-diffusion-main/models/ldm/text2img-large/model.ckpt")  # TODO: check path
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Reference: https://www.codetd.com/ru/article/14935168

By default GPU 0 is used, but GPU 0 was already occupied, so the default GPU index has to be changed in the code. This change must come before import torch:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = '1'