
Different models need different amounts of GPU memory (VRAM); before downloading, check which models your GPU can actually run.
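
As a rough rule of thumb (my own estimate, not from any model card), the weights alone take about "number of parameters × bytes per weight" of VRAM, plus overhead for activations and the KV cache. A minimal Python sketch:

# Rough VRAM estimate for the weights alone (ignores activations / KV cache)
def estimate_vram_gb(params_billion, bits_per_weight):
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

print(estimate_vram_gb(7, 16))  # 7B model in fp16: ~13 GB
print(estimate_vram_gb(7, 4))   # 7B model in 4-bit: ~3.3 GB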

1. The script below can download the various models on Hugging Face: https://huggingface.co/models

download.py

# coding=utf-8
import time
from huggingface_hub import snapshot_download

# Model name on Hugging Face
repo_id = "LinkSoul/Chinese-Llama-2-7b-4bit"
# Local storage path
local_dir = "E:\\work\\AI\\GPT\\llama_model_7b_4bit"
cache_dir = local_dir + "\\cache"

# Network errors are common on large downloads, so retry in a loop
# until the snapshot completes; resume_download picks up partial files.
while True:
    try:
        snapshot_download(
            cache_dir=cache_dir,
            local_dir=local_dir,
            repo_id=repo_id,
            local_dir_use_symlinks=False,
            resume_download=True,
            # Fetch the PyTorch weights plus tokenizer/config files
            allow_patterns=["*.model", "*.json", "*.bin",
                            "*.py", "*.md", "*.txt"],
            # Skip alternative weight formats we don't need
            ignore_patterns=["*.safetensors", "*.msgpack",
                             "*.h5", "*.ot"],
        )
    except Exception as e:
        print(e)
        # time.sleep(5)
    else:
        print("Download finished")
        break

2. Local environment

To run the downloaded llama model, first create a conda virtual environment. I installed Anaconda on a Windows machine; to create the environment, enter the following on the command line:

conda create -n LLM_env python=3.10

Python 3.10 is chosen here; it has to match the PyTorch version installed later.
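
After it is created, activate the environment before installing anything into it:

conda activate LLM_env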

3. Check the CUDA version
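
For example, on the command line:

nvidia-smi

The "CUDA Version" field in the top-right corner of the output is the highest CUDA version the installed driver supports; the PyTorch build chosen in the next step should not exceed it.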

4. Install PyTorch. Reference pages:

 https://blog.csdn.net/threestooegs/article/details/119531414 

https://pytorch.org/get-started/previous-versions/

I installed version 1.13.1.
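
For reference, the 1.13.1 command from the previous-versions page looks like this (assuming CUDA 11.7; pick the variant matching your own CUDA version from that page):

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia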

Of course, some other packages may still be missing; just install whatever turns out to be needed. One possible pitfall here is installing the bitsandbytes library on Windows:

importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes

This happens when the installed bitsandbytes(-windows) version is too old; reinstall a newer build:

pip install --trusted-host github.com --trusted-host objects.githubusercontent.com https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.0-py3-none-win_amd64.whl
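
After reinstalling, a quick sanity check that the package metadata can now be found:

python -c "import bitsandbytes"
pip show bitsandbytes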

 

5. PyCharm configuration

Create a new project in PyCharm and set its interpreter to the virtual environment created above.

6. Write the test code

test.py

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

# Local model path
# model_path = "E:\\work\\AI\\GPT\\llama_model"
model_path = "E:\\work\\AI\\GPT\\llama_model_4bit"
# model_path = "E:\\work\\AI\\GPT\\llama_model_7b_8bit"
print(torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.device_count())
    device = torch.device("cuda:0")
    print(device)
else:
    print("No GPU available")

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Pick the load mode from the path suffix: 4-bit and 8-bit quantized
# loading require bitsandbytes; otherwise load in fp16 on the GPU.
if model_path.endswith("4bit"):
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_4bit=True,
        torch_dtype=torch.float16,
        device_map='auto'
    )
elif model_path.endswith("8bit"):
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_8bit=True,
        torch_dtype=torch.float16,
        device_map='auto'
    )
else:
    model = AutoModelForCausalLM.from_pretrained(model_path).half().cuda()

# Stream generated tokens to stdout as they are produced
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Llama-2 chat prompt template with a system prompt
instruction = """[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

            If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n\n{} [/INST]"""

prompt = instruction.format("Hello, what is the meaning of life?")
generate_ids = model.generate(tokenizer(prompt, return_tensors='pt').input_ids.cuda(), max_new_tokens=4096, streamer=streamer)
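
The TextStreamer already prints tokens to stdout as they are generated. If the complete reply is also wanted as a string afterwards, the generated ids can be decoded with the same tokenizer (a minimal sketch):

# Decode the full sequence (prompt + reply) back into text
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0]
print(output)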

 

WebUI

There is another way to run the model locally, through a web page.

References:

https://www.cnblogs.com/zhizhixiaoxia/p/17414798.html

https://github.com/oobabooga/text-generation-webui/tree/main
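
Roughly, the web UI is cloned, its requirements installed, and the server started; the exact steps change between versions, so follow the repo's README. A sketch:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py

Downloaded model folders go under its models directory and can then be selected in the web page.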

 

 