How to use Alpaca-LoRA to fine-tune a model like ChatGPT

Original article: https://replicate.com/blog/fine-tune-alpaca-with-lora?continueFlag=4ecae39885197a5c008faabbefb5c824

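Low-rank adaptation (LoRA) is a technique for fine-tuning models that has some advantages over previous methods:

It is faster and uses less memory, which means it can run on consumer hardware.
The output is much smaller (megabytes, not gigabytes).
You can combine multiple fine-tuned models together at runtime.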
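
As a rough illustration of why the output is so much smaller, the sketch below counts trainable parameters for a single weight matrix. LoRA keeps the original matrix W frozen and trains two small factors, B (d x r) and A (r x d), whose product is added to W. The matrix size of 4096 x 4096 and the rank of 8 are assumed values chosen for illustration; they are not figures from this post.

```python
# Illustrative sketch: trainable parameters for one weight matrix,
# full fine-tuning vs. a LoRA update W + B @ A.
# d and r below are assumed values, not figures from the article.

def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Parameters in the low-rank factors B (d_out x r) and A (r x d_in)."""
    return d_out * r + r * d_in


d, r = 4096, 8
full = full_params(d, d)
lora = lora_params(d, d, r)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA (rank {r}):    {lora:,} parameters")
print(f"reduction:        {full // lora}x")
```

Under these assumptions only the two small factors need to be saved, which is why a LoRA checkpoint can be measured in megabytes.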

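Last month we blogged about using LoRA to quickly fine-tune Stable Diffusion. Our friend Simon Ryu (aka @cloneofsimo) applied the LoRA technique to Stable Diffusion, allowing people to create custom trained styles from just a handful of training images, then mix and match those styles at prediction time to create highly customized images.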

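One month later, we are seeing LoRA applied elsewhere. It is now being used to fine-tune large language models like LLaMA. Earlier this month, Eric J. Wang released Alpaca-LoRA, a project containing code that reproduces the Stanford Alpaca results using PEFT, a library that lets you take various transformer-based language models and fine-tune them with LoRA. The appeal is that you can fine-tune models cheaply and efficiently on modest hardware, with smaller (and perhaps composable) outputs.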

In this blog post, we'll show you how to use LoRA to fine-tune LLaMA using the Alpaca training data.

Prerequisites

GPU machine. Thanks to LoRA you can do this on low-spec GPUs like an NVIDIA T4 or consumer GPUs like a 4090. If you don't already have access to a machine with a GPU, check out our guide to getting a GPU machine.
LLaMA weights. The weights for LLaMA have not yet been released publicly. To apply for access, fill out this Meta Research form.

In short, you need a GPU machine and the LLaMA weights.

Step 1: Clone the Alpaca-LoRA repo

We’ve created a fork of the original Alpaca-LoRA repo that adds support for Cog. Cog is a tool to package machine learning models in containers and we're using it to install the dependencies to fine-tune and run the model.

Clone the repository using Git:

git clone https://github.com/daanelson/alpaca-lora
cd alpaca-lora

Step 2: Get LLaMA weights

Put your downloaded weights in a folder called unconverted-weights. The folder hierarchy should look something like this:

unconverted-weights
├── 7B
│   ├── checklist.chk
│   ├── consolidated.00.pth
│   └── params.json
├── tokenizer.model
└── tokenizer_checklist.chk
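
Before running the conversion, it can help to confirm that the folder matches the tree above. The following is a minimal sketch of such a check; `missing_weight_files` is a hypothetical helper written for this example and is not part of the Alpaca-LoRA repo. The file names are copied from the tree above.

```python
from pathlib import Path

# File names copied from the directory tree above. The checker itself is a
# hypothetical helper, not something shipped with Alpaca-LoRA.
EXPECTED_FILES = [
    "7B/checklist.chk",
    "7B/consolidated.00.pth",
    "7B/params.json",
    "tokenizer.model",
    "tokenizer_checklist.chk",
]


def missing_weight_files(root: str = "unconverted-weights") -> list[str]:
    """Return the expected files that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_weight_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run it from the root of the cloned repository, where the unconverted-weights folder lives.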

Convert the weights from a PyTorch checkpoint to a transformers-compatible format using this command:

cog run python -m transformers.models.llama.convert_llama_weights_to_hf \
  --input_dir unconverted-weights \
  --model_size 7B \
  --output_dir weights

Your final directory structure should look like this:

weights
├── llama-7b
└── tokenizer

Step 3: Install Cog

sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

Step 4: Fine-tune the model

The fine-tuning script is configured by default to work on less powerful GPUs, but if you have a GPU with more memory, you can increase MICRO_BATCH_SIZE to 32 or 64 in finetune.py.
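
To illustrate what changing MICRO_BATCH_SIZE does, here is a small sketch. It assumes that the script keeps a fixed effective batch size and derives the number of gradient-accumulation steps from it, so a larger micro-batch means fewer accumulation steps per optimizer update. The value 128 for the effective batch size and the `accumulation_steps` helper are assumptions made for this example; check your copy of finetune.py for the actual values.

```python
# Assumption: the effective batch size is fixed, and the micro-batch size only
# decides how many gradient-accumulation steps are needed to reach it.
# BATCH_SIZE = 128 is an assumed value; verify it against finetune.py.
BATCH_SIZE = 128


def accumulation_steps(micro_batch_size: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of micro-batches accumulated before each optimizer step."""
    if micro_batch_size <= 0 or batch_size % micro_batch_size != 0:
        raise ValueError("batch size should be a positive multiple of the micro-batch size")
    return batch_size // micro_batch_size


for micro in (4, 32, 64):
    steps = accumulation_steps(micro)
    print(f"MICRO_BATCH_SIZE={micro:>2} -> {steps} accumulation steps")
```

Each micro-batch has to fit in GPU memory, which is why larger values are only practical on GPUs with more memory.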

If you have your own instruction tuning dataset, edit DATA_PATH in finetune.py to point to your own dataset. Make sure it has the same format as alpaca_data_cleaned.json.
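
If you bring your own dataset, a quick format check can catch problems before a multi-hour training run. The sketch below assumes the Alpaca layout: a JSON list of objects, each with the string fields "instruction", "input" (which may be empty), and "output". `find_bad_records` is a hypothetical helper for this example, and the sample records are made up.

```python
import json

# Assumed record layout of alpaca_data_cleaned.json: a JSON list of objects,
# each with string fields "instruction", "input" (may be empty) and "output".
REQUIRED_KEYS = ("instruction", "input", "output")


def find_bad_records(records: list) -> list[int]:
    """Return the indices of records that do not match the assumed layout."""
    bad = []
    for i, rec in enumerate(records):
        ok = isinstance(rec, dict) and all(
            isinstance(rec.get(key), str) for key in REQUIRED_KEYS
        )
        if not ok:
            bad.append(i)
    return bad


# Made-up sample data; the third record is missing its "output" field.
sample = json.loads("""
[
  {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
  {"instruction": "Name a South American camelid.", "input": "", "output": "Alpaca"},
  {"instruction": "Missing output field", "input": ""}
]
""")
print(find_bad_records(sample))  # prints [2]
```

To check a real file, load it with json.load and pass the resulting list to find_bad_records.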

Run the fine-tuning script:

cog run python finetune.py

This takes 3.5 hours on a 40GB A100 GPU, and more than that for GPUs with less processing power.

Step 5: Run the model with Cog

$ cog predict -i prompt="Tell me something about alpacas."
Alpacas are domesticated animals from South America. They are closely related to llamas and guanacos and have a long, dense, woolly fleece that is used to make textiles. They are herd animals and live in small groups in the Andes mountains. They have a wide variety of sounds, including whistles, snorts, and barks. They are intelligent and social animals and can be trained to perform certain tasks.

Next steps

Here are some ideas for what you could do next:

Bring your own dataset and fine-tune your own LoRA, like Cabrita: A portuguese finetuned instruction LLaMA, or Fine-tune LLaMA to speak like Homer Simpson.
Push the model to Replicate to run it in the cloud. This is handy if you want an API to build interfaces, or to run large-scale evaluation in parallel. You'll need to keep it private so the weights aren't public.
Combine LoRAs. It is possible to combine different Stable Diffusion LoRAs to have a fine-tuned style and fine-tuned object in the same image. What could be possible if this was done with language models?
Fine-tune the larger LLaMA models with the Alpaca dataset (or other datasets) and see how they perform. This should be possible with PEFT and LoRA, although it will need larger GPUs.
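
As a toy illustration of the "combine LoRAs" idea, the sketch below merges two adapters into one weight matrix with a weighted sum of their low-rank updates, W' = W + s1 * (B1 A1) + s2 * (B2 A2). This is one simple way to combine adapters, shown under the assumption that both were trained against the same frozen base weight. The matrices, scales, and helper functions are made up for this example.

```python
# Toy sketch: combine two LoRA adapters by a weighted sum of their updates,
#   W' = W + s1 * (B1 @ A1) + s2 * (B2 @ A2)
# All matrices and scales are made-up values for illustration.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, element by element."""
    return [[wv + scale * dv for wv, dv in zip(wr, dr)] for wr, dr in zip(w, delta)]


W = [[1.0, 0.0], [0.0, 1.0]]           # frozen base weight (2 x 2)
B1, A1 = [[1.0], [0.0]], [[0.0, 2.0]]  # adapter 1, rank 1
B2, A2 = [[0.0], [1.0]], [[4.0, 0.0]]  # adapter 2, rank 1

combined = add_scaled(W, matmul(B1, A1), 0.5)
combined = add_scaled(combined, matmul(B2, A2), 0.25)
print(combined)  # prints [[1.0, 1.0], [1.0, 1.0]]
```

The scales control how strongly each adapter influences the result. Whether such a blend gives useful behavior in a language model is the open question raised above.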

Posted on 2023-04-11 16:46 by 宋岳庭