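A quantization method for calling the Qwen2 model through transformers
GitHub repository: Qwen2
Notes on some problems encountered while debugging Qwen 7B
Official example code: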
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-7B-Instruct"
device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
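The official example handles a single turn. A multi-turn wrapper around the same calls (apply_chat_template, generate, decode) could look like the sketch below. The helper names build_messages and chat are hypothetical and not part of transformers or the Qwen2 repository; model and tokenizer are assumed to be loaded as in the official example.

```python
def build_messages(system, history, question):
    """Assemble the chat-template message list from earlier turns.

    history is a list of (user_text, assistant_text) pairs.
    """
    messages = [{"role": "system", "content": system}] if system else []
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": question})
    return messages


def chat(model, tokenizer, question,
         system="You are a helpful assistant.",
         history=None, max_new_tokens=512):
    """Hypothetical helper: one chat turn, returning (response, new_history)."""
    history = list(history or [])
    messages = build_messages(system, history, question)
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt
    new_tokens = generated_ids[0][model_inputs.input_ids.shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)
    history.append((question, response))
    return response, history
```

Calling response, history = chat(model, tokenizer, "Hello") and passing the returned history into the next call carries the conversation forward.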
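Configuration parameters
Specifying the quantization method
"""
int4 quantization script
"""
import sys

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig
)

model_name_or_path = sys.argv[1]

# 4-bit NF4 quantization with double quantization (requires bitsandbytes)
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=None,  # None keeps the library default compute dtype
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
)

tokenizer = AutoTokenizer.from_pretrained(
    model_name_or_path,
    trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    quantization_config=quantization_config,
    trust_remote_code=True).eval()

# NOTE: model.chat() is provided by the remote code of the original Qwen chat
# checkpoints (hence trust_remote_code=True). Qwen2 checkpoints loaded through
# the native transformers class are driven with apply_chat_template + generate,
# as in the official example above.
system = input('system:')
history = None
while True:
    question = input('user:')
    if question == 'clear':
        # Reset the conversation and read a new system prompt
        system = input('system:')
        history = None
        continue
    # Fixed: the original passed an undefined name `query` here
    response, history = model.chat(tokenizer=tokenizer, query=question,
                                   system=system, history=history)
    print(response)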
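Problem
This approach depends on the bitsandbytes library, which I had not installed before, so loading the model failed with 'importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes'. Solution: run pip install bitsandbytes
Installing the missing library fixes the error.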
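To surface this kind of error before the model starts loading, the dependencies can be checked up front. The following is a minimal sketch using the standard-library importlib.metadata; the helper name missing_packages is hypothetical, and accelerate is listed only because device_map="auto" relies on it.

```python
from importlib import metadata


def missing_packages(names):
    """Return the subset of distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Packages assumed to be needed by the 4-bit loading path in this post
    missing = missing_packages(["bitsandbytes", "accelerate"])
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All quantization dependencies are installed.")
```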