Trying out Mistral-7B-Instruct-v0.2: running it from the raw weights and from the PyTorch/safetensors checkpoints

https://docs.mistral.ai/models/

Mistral-7B-Instruct-v0.2 raw_weights: https://models.mistralcdn.com/mistral-7b-v0-2/Mistral-7B-v0.2-Instruct.tar
md5sum: fbae55bc038f12f010b4251326e73d39

mistral-7B-v0_2: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar

https://github.com/mistralai-sf24/hackathon

The hackathon repository provides code to run the 7B model and to fine-tune it.

Reference article: "Mistral 7B v0.2 base model open-sourced: ModelScope community fine-tuning tutorial and evaluation", by ModelScope小助理, 2024-03-26

Mistral 7B v0.2 is a base model and is not well suited to direct use for inference; the Instruct version is recommended instead.

Quick start with raw_weights, hackathon

Download the raw model weights and run them

# download the model
$ wget -c https://models.mistralcdn.com/mistral-7b-v0-2/Mistral-7B-v0.2-Instruct.tar
$ md5sum Mistral-7B-v0.2-Instruct.tar

# extract; this yields consolidated.00.pth, params.json and tokenizer.model. Put these three files into a folder named Mistral-7B-v0.2-Instruct-raw
$ tar -xf Mistral-7B-v0.2-Instruct.tar

$ git clone https://github.com/mistralai-sf24/hackathon.git
$ cd hackathon
$ pip install -r requirements_hackathon.txt
$ python -m main demo ../Mistral-7B-v0.2-Instruct-raw

$ python -m main interactive ../Mistral-7B-v0.2-Instruct-raw
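Before launching the demo, it can help to confirm that the folder really contains the three files named in the extraction step. The following is a minimal sketch; the helper name missing_raw_files is hypothetical and is not part of the hackathon repository.

```python
from pathlib import Path

# File names taken from the extraction step above.
REQUIRED_FILES = ("consolidated.00.pth", "params.json", "tokenizer.model")


def missing_raw_files(model_dir):
    """Return the required raw-weight files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files("Mistral-7B-v0.2-Instruct-raw")
    print("missing:", missing if missing else "none")
```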

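The run fails with TypeError: ModelArgs.__init__() missing 1 required positional argument: 'sliding_window'. The cause is that Mistral-7B-Instruct-v0.2 dropped the sliding window, so sliding_window has to be commented out in the code. The run succeeds with the modified code.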
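An alternative to commenting the field out is to make it optional, so that a params.json with or without the key loads the same way. This is only a sketch under assumptions: the ModelArgs below is a hypothetical three-field stand-in for the class in the hackathon repository, and load_model_args is an illustrative helper, not the repository's API.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ModelArgs:
    # Hypothetical subset of fields; the real class in the repository has more.
    dim: int
    n_layers: int
    n_heads: int
    # Optional with a default, so a params.json without this key still loads.
    sliding_window: Optional[int] = None


def load_model_args(params_text: str) -> ModelArgs:
    """Build ModelArgs from params.json text, ignoring keys this sketch does not model."""
    raw = json.loads(params_text)
    known = {f.name for f in fields(ModelArgs)}
    return ModelArgs(**{k: v for k, v in raw.items() if k in known})


# Illustrative values only, not copied from the real params.json.
args = load_model_args('{"dim": 4096, "n_layers": 32, "n_heads": 32}')
print(args.sliding_window)  # None
```

Any code that later reads sliding_window would also have to handle None; that part is not shown here.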

Chat template

<s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]

Example: "[INST] What is your favourite condiment? [/INST]Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen! [INST] Do you have mayonnaise recipes? [/INST]"
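To make the layout concrete, here is a small sketch that renders a message list into the template line shown above. The function build_prompt is hypothetical, and the exact whitespace around the model answer is an assumption taken from that template line; when a tokenizer is available, its own apply_chat_template should be treated as authoritative.

```python
def build_prompt(messages, bos="<s>", eos="</s>"):
    """Render alternating user/assistant messages into the [INST] layout."""
    parts = [bos]
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        else:
            # Assumption: one space before the answer, as in the template line.
            parts.append(f" {msg['content']}{eos}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Instruction"},
    {"role": "assistant", "content": "Model answer"},
    {"role": "user", "content": "Follow-up instruction"},
]
print(build_prompt(messages))
# <s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]
```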

apply_chat_template example

Converting the raw model weights to Hugging Face format

Conversion script

$ python convert_mistral_weights_to_hf.py --input_dir Mistral-7B-v0.2-Instruct-raw --model_size 7B --output_dir Mistral-7B-v0.2-Instruct-hf
Traceback (most recent call last):
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 276, in <module>
    main()
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 264, in main
    write_model(
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 92, in write_model
    sliding_window = int(params["sliding_window"])
KeyError: 'sliding_window'

I gave up on the conversion and downloaded the Hugging Face weights directly; either a mirror site or ModelScope works.
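For reference, the KeyError in the traceback comes from indexing params["sliding_window"] directly. A tolerant read might look like the sketch below; read_sliding_window is a hypothetical helper, and whether the rest of the conversion script accepts None was not verified here.

```python
from typing import Optional


def read_sliding_window(params: dict) -> Optional[int]:
    """Return sliding_window as an int, or None when the key is absent or null."""
    value = params.get("sliding_window")
    return int(value) if value is not None else None


print(read_sliding_window({"dim": 4096}))             # None
print(read_sliding_window({"sliding_window": 4096}))  # 4096
```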

model-00001-of-00003.safetensors SHA256: 63654d601820b88b1fa8b4a98df5714f700fbc5b3df2cc4ecbabdced35096d31
model-00002-of-00003.safetensors SHA256: a42716540ecb2385d371f2109835921ff535406cac8fe8ff28f2f0b5fc7895bd
model-00003-of-00003.safetensors SHA256: 5f86e15cb3ed9078e30ae6e72445e109d0e337d9cde59b9aeea4ce8e44e54a5d

pytorch_model-00001-of-00003.bin SHA256: d8836f675fe1c4c43f3ff4e93f4cc0e97ef7a13e8c240fb39ad02d37ff303ef5
pytorch_model-00002-of-00003.bin SHA256: 58a7ddffb463397de5dbe1f1e2ec1ccf6aae2b549565f83f3ded124e0b4c5069
pytorch_model-00003-of-00003.bin SHA256: 75824d68dcf82d02b731b2bdfd3a9711acb7c58b8d566f4c0d3e9efac52f9a21
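The checksums above can be checked after download with a few lines of Python. This is a minimal sketch: file_digest and verify are hypothetical helper names, the directory argument is an assumption about where the shards were saved, and only the safetensors entries are listed to keep it short.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte shards never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected, algorithm="sha256"):
    """Map each file name to True/False (digest matches) or None (file missing)."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = file_digest(path, algorithm) == want.lower()
        else:
            results[name] = None
    return results


# SHA256 values copied from the list above.
EXPECTED_SHA256 = {
    "model-00001-of-00003.safetensors": "63654d601820b88b1fa8b4a98df5714f700fbc5b3df2cc4ecbabdced35096d31",
    "model-00002-of-00003.safetensors": "a42716540ecb2385d371f2109835921ff535406cac8fe8ff28f2f0b5fc7895bd",
    "model-00003-of-00003.safetensors": "5f86e15cb3ed9078e30ae6e72445e109d0e337d9cde59b9aeea4ce8e44e54a5d",
}

if __name__ == "__main__":
    for name, ok in verify(".", EXPECTED_SHA256).items():
        print(name, {True: "OK", False: "MISMATCH", None: "missing"}[ok])
```

The same file_digest helper covers the md5sum of the raw tar archive by passing algorithm="md5".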

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda:7" # the device to load the model onto

# safetensors weights take priority; the pytorch_model-*.bin shards were only loaded after deleting model.safetensors.index.json
model = AutoModelForCausalLM.from_pretrained("./")
tokenizer = AutoTokenizer.from_pretrained("./")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
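The comment on the from_pretrained call notes that the safetensors files win over the .bin files when both are present. The sketch below guesses which weights file a local directory would be loaded from. The lookup order is an assumption based on that observation and on the standard checkpoint file names; it is not a copy of the transformers loading logic, and likely_weights_entry is a hypothetical helper.

```python
from pathlib import Path

# Assumed lookup order: safetensors first, then PyTorch .bin checkpoints.
CANDIDATES = (
    "model.safetensors",
    "model.safetensors.index.json",
    "pytorch_model.bin",
    "pytorch_model.bin.index.json",
)


def likely_weights_entry(model_dir):
    """Return the first candidate file found in model_dir, or None if none exist."""
    root = Path(model_dir)
    for name in CANDIDATES:
        if (root / name).is_file():
            return name
    return None


if __name__ == "__main__":
    print(likely_weights_entry("./"))
```

Rather than deleting the index file, passing use_safetensors=False to from_pretrained may be a less destructive way to force the .bin checkpoint; that was not tried in this post.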
posted @ 2024-03-28 14:29  沙滩炒花蛤