[GenerativeAI] How to train both?

Preamble

[labelbox] The leading training data platform for data labelling

The end is nigh~

Labelbox CEO discusses breakthroughs in AI training data

Building vs. buying a training data platform

🎙 Manu Sharma, CEO of Labelbox, on the future of data labeling automation

TheSequence interviews ML practitioners to merge you into the real world of machine learning and artificial intelligence

 

Synthetic data won't fully replace human supervision in AI

Synthetic data creation has become more popular and more similar to real datasets over recent years. GAN models can generate augmented data to diversify datasets where real data is thin, or to mitigate issues caused by changes in the camera sensor or lighting conditions. But at Labelbox, we're seeing our customers build complex AI systems where synthetic data generation is completely unnecessary, and at best used only for augmentation. "I think we humans will continue to supervise AI systems for a while. Don't underestimate human ingenuity," said Manu.

 

 

Main content


CLIP: Connecting text and images

OpenAI's multimodal neural network (part 2): CLIP: Connecting Text and Images

Ref: Five training methods for Diffusion models; AI image-generation training and fine-tuning (Bilibili)

    • scheduler: not part of the network model itself; it controls how noise is added and removed.
    • encoder: encodes the text prompt.
    • VAE: encodes the image, producing two feature maps: a mean and a standard deviation.
    • U-Net: the generative (denoising) network (a minimal loading sketch of these four components follows below).
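To make the division of labor concrete, here is a minimal sketch (my addition, not from the referenced video) of loading the four pieces with the diffusers library; the model ID is an illustrative choice.

from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # assumed base checkpoint

# Scheduler: holds no trainable weights; it only controls how noise is added/removed.
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Text encoder: turns the prompt into embeddings that condition the U-Net.
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")

# VAE: encodes an image into a latent distribution (mean and standard deviation).
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")

# U-Net: the denoising network that actually gets trained or fine-tuned.
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")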

 

(Training) A comparison of Stable Diffusion training methods

Stable Diffusion Quick Kit hands-on: fine-tuning a model with Dreambooth, optimized for SageMaker [AWS SageMaker approach]

 

ubuntu@ip-10-0-0-148:~/stable-diffusion-webui/models/dreambooth/my_model$ ls
db_config.json  my_model.yaml  working

ubuntu@ip-10-0-0-148:~/stable-diffusion-webui/models/dreambooth/my_model$ pwd
/home/ubuntu/stable-diffusion-webui/models/dreambooth/my_model

(4) Set parameters

 

  • Training output: a .pt file

Ref: Textual Inversion

The result of the training is a .pt or a .bin file (the former is the format used by the original author, the latter by the diffusers library) with the embedding in it.
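A minimal sketch (my addition) of loading such an embedding for inference with diffusers; the checkpoint, file path, and placeholder token are illustrative assumptions.

import torch
from diffusers import StableDiffusionPipeline

# Assumed base model; use whichever SD checkpoint the embedding was trained against.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# load_textual_inversion accepts both the .bin produced by the diffusers training
# script and the .pt produced by the original author's code.
pipe.load_textual_inversion("./learned_embeds.bin", token="<my-concept>")

# The placeholder token can now be used inside prompts.
image = pipe("a photo of <my-concept> in the style of an oil painting").images[0]
image.save("textual_inversion_result.png")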

 

 

  

Ref: A complete set of tutorials for the Diffusers framework: from T2I-Adapter to the wildly popular ControlNet

# todo list.

 

Ref: Must-read | ControlNet explained in depth: a controllable AIGC image-generation algorithm

The paper proposes several ways to condition generation on different kinds of image inputs.

Using Canny edge detection with random thresholds, 3M edge-image-caption pairs were collected from the internet. The model was trained for 600 GPU-hours on an Nvidia A100 80G, with Stable Diffusion 1.5 as the base model. In addition, the Canny edge dataset above was sorted by image resolution and subsets of 1k, 10k, 50k, and 500k samples were drawn; the same experimental setup was used to test the effect of dataset scale.

 

Ref: https://huggingface.co/blog/train-your-controlnet

Getting started with training your ControlNet for Stable Diffusion

Training your own ControlNet requires 3 steps:

    1. Planning your condition: ControlNet is flexible enough to tame Stable Diffusion towards many tasks. The pre-trained models showcase a wide range of conditions, and the community has built others, such as conditioning on pixelated color palettes.

    2. Building your dataset: Once a condition is decided, it is time to build your dataset. For that, you can either construct a dataset from scratch or use a subset of an existing dataset (see the dataset-building sketch after this list). You need three columns in your dataset to train the model:

      • a ground truth image,

      • a conditioning_image, and

      • a prompt.

    3. Training the model: Once your dataset is ready, it is time to train the model. This is the easiest part thanks to the diffusers training script. You'll need a GPU with at least 8GB of VRAM.
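As an illustration of step 2, here is a hedged sketch (my addition, not from the Hugging Face post) of assembling a tiny Canny-conditioned dataset with the datasets library; the file paths, prompts, and the make_canny helper are hypothetical.

import cv2
from datasets import Dataset, Image

def make_canny(path, low=100, high=200):
    """Build a Canny edge map to serve as the conditioning image."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    out_path = path.replace(".jpg", "_canny.png")
    cv2.imwrite(out_path, cv2.Canny(gray, low, high))
    return out_path

# Column names follow the defaults expected by diffusers' train_controlnet.py:
# "image" (ground truth), "conditioning_image" (condition), "text" (prompt).
records = {
    "image": ["photos/cat.jpg", "photos/dog.jpg"],
    "conditioning_image": [make_canny("photos/cat.jpg"), make_canny("photos/dog.jpg")],
    "text": ["a photo of a cat", "a photo of a dog"],
}

ds = (Dataset.from_dict(records)
      .cast_column("image", Image())
      .cast_column("conditioning_image", Image()))
ds.save_to_disk("my_controlnet_dataset")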

 

 

"Four girls" image goes viral abroad: ControlNet combos deliver stunning results and upend the rules of the AI art game

Giving the AI painting model a buff

The principle of ControlNet is essentially to add an extra input to a pretrained diffusion model, which controls the details of what it generates.

This input can take many forms; the authors provide 8 kinds, including sketches, edge maps, semantic segmentation maps, human keypoint features, Hough-transform line detection, depth maps, human skeletons, and so on.

So what is the mechanism that lets the large model learn to "generate images according to an input condition"?

The overall idea of ControlNet and the division of labor in its architecture are as follows:

 

The "locked model" and the "trainable copy" are connected through a 1×1 convolution layer called the "zero convolution".

The weights and biases of the zero convolution are initialized to 0, so training is very fast, close to the speed of fine-tuning a diffusion model, and it can even be done on personal hardware.

For example, on a single Nvidia RTX 3090 Ti, training with 200,000 images takes less than a week.
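To make the zero-convolution idea concrete, here is a minimal PyTorch sketch (my addition, a schematic rather than the paper's actual implementation; ControlledBlock and its shapes are illustrative assumptions).

import copy
import torch.nn as nn

def zero_conv(channels):
    """1x1 convolution whose weights and bias start at exactly zero."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ControlledBlock(nn.Module):
    """Locked block plus a trainable copy, joined by zero convolutions (schematic)."""
    def __init__(self, pretrained_block, channels):
        super().__init__()
        self.locked = pretrained_block                    # frozen, never updated
        for p in self.locked.parameters():
            p.requires_grad_(False)
        self.trainable = copy.deepcopy(pretrained_block)  # trainable copy
        self.zero_in = zero_conv(channels)                # injects the condition signal
        self.zero_out = zero_conv(channels)               # adds the control residual

    def forward(self, x, condition):
        out = self.locked(x)
        ctrl = self.trainable(x + self.zero_in(condition))
        # At initialization zero_out returns 0, so the output equals the original
        # model's output and training starts from a safe, unmodified point.
        return out + self.zero_out(ctrl)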

 
