Deploying DeepSeek-R1 (14B) on an RTX 4090

1. Installing the WSL virtual machine

1.1. Installing WSL

Open a command prompt / PowerShell window and run:

wsl --install

The installation runs automatically; reboot when it finishes.

1.2. Moving the WSL install location

By default WSL installs to the C: drive; you can move it to D:.

  1. Check the WSL status:
wsl -l -v
  2. Make sure WSL is shut down:
wsl --shutdown
  3. Export the image:
wsl --export Ubuntu d:\image_ubuntu.tar
  4. Unregister the old distribution:
wsl --unregister Ubuntu
  5. Check the WSL status again:
wsl -l -v

Windows Subsystem for Linux has no installed distributions.
Distributions can be installed by visiting the Microsoft Store:
https://aka.ms/wslstore

That confirms the old distribution was removed.

  6. Re-register WSL:
# Syntax
# wsl --import <distro name> <folder to install WSL into> <path to image>
wsl --import Ubuntu d:\WSL-Ubuntu-22.04 d:\image_ubuntu.tar

  7. Check the WSL status once more:
wsl -l -v

NAME            STATE           VERSION
* Ubuntu    Stopped         2

All set.

1.3. Setting up Anaconda

sudo apt update

# Install Anaconda

# 1. cd to the home directory
cd

# 2. Download the installer: find the link at https://www.anaconda.com/download/success
wget https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh

# 3. Install Anaconda
bash Anaconda3-2024.02-1-Linux-x86_64.sh

# 4. Follow the installer prompts; by default it installs to /home/<username>/anaconda3

# 5. Configure the environment variable
vim ~/.bashrc 
# Add the Anaconda environment variable
export PATH="/home/<username>/anaconda3/bin:$PATH"

# Reload the environment
source ~/.bashrc

# Verify the installation
conda --version

2. Setting up the model

2.1. Setting up huggingface-cli

Requires Python 3.8 or later.

pip install -U huggingface_hub

2.2. Downloading the model

mkdir deepseek_r1_14B
cd ./deepseek_r1_14B
huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Qwen-14B --local-dir ./models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

Create a requirements.txt:

accelerate==1.3.0
aiofiles==23.2.1
aiohappyeyeballs==2.4.4
aiohttp==3.11.11
aiohttp-cors==0.7.0
aiosignal==1.3.2
airportsdata==20241001
annotated-types==0.7.0
anyio==4.8.0
astor==0.8.1
asttokens==3.0.0
attrs==25.1.0
backcall==0.2.0
beautifulsoup4==4.12.3
bitsandbytes==0.45.1
blake3==1.0.4
bleach==6.2.0
cachetools==5.5.1
certifi==2024.12.14
charset-normalizer==3.4.1
click==8.1.8
cloudpickle==3.1.1
colorful==0.5.6
compressed-tensors==0.9.0
decorator==5.1.1
defusedxml==0.7.1
depyf==0.18.0
dill==0.3.9
diskcache==5.6.3
distlib==0.3.9
distro==1.9.0
docopt==0.6.2
einops==0.8.0
executing==2.2.0
fastapi==0.115.7
fastjsonschema==2.21.1
ffmpy==0.5.0
filelock==3.17.0
frozenlist==1.5.0
fsspec==2024.12.0
gguf==0.10.0
google-api-core==2.24.1
google-auth==2.38.0
googleapis-common-protos==1.66.0
gradio==5.13.1
gradio_client==1.6.0
grpcio==1.70.0
h11==0.14.0
httpcore==1.0.7
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.27.1
idna==3.10
importlib_metadata==8.6.1
iniconfig==2.0.0
interegular==0.3.3
ipython==8.12.3
jedi==0.19.2
Jinja2==3.1.5
jiter==0.8.2
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyterlab_pygments==0.3.0
lark==1.2.2
lm-format-enforcer==0.10.9
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib-inline==0.1.7
mdurl==0.1.2
mistral_common==1.5.2
mistune==3.1.1
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.19.0
multidict==6.1.0
nbclient==0.10.2
nbconvert==7.16.6
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.4.2
numpy==1.26.4
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-ml-py==12.570.86
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
openai==1.60.2
opencensus==0.11.4
opencensus-context==0.1.3
opencv-python-headless==4.11.0.86
orjson==3.10.15
outlines==0.1.11
outlines_core==0.1.26
packaging==24.2
pandas==2.2.3
pandocfilters==1.5.1
parso==0.8.4
partial-json-parser==0.2.1.1.post5
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.4.0
pip-autoremove==0.10.0
platformdirs==4.3.6
pluggy==1.5.0
prometheus-fastapi-instrumentator==7.0.2
prometheus_client==0.21.1
prompt_toolkit==3.0.50
propcache==0.2.1
proto-plus==1.26.0
protobuf==5.29.3
psutil==6.1.1
ptyprocess==0.7.0
pure_eval==0.2.3
py-cpuinfo==9.0.0
py-spy==0.4.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pybind11==2.13.6
pycountry==24.6.1
pydantic==2.10.6
pydantic_core==2.27.2
pydub==0.25.1
Pygments==2.19.1
pytest==8.3.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.20
pytz==2024.2
PyYAML==6.0.2
pyzmq==26.2.0
ray==2.41.0
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
rpds-py==0.22.3
rsa==4.9
ruff==0.9.3
safehttpx==0.1.6
safetensors==0.5.2
semantic-version==2.10.0
sentencepiece==0.2.0
setuptools==75.8.0
shellingham==1.5.4
six==1.17.0
smart-open==7.1.0
sniffio==1.3.1
soupsieve==2.6
stack-data==0.6.3
starlette==0.45.3
sympy==1.13.1
tiktoken==0.7.0
tinycss2==1.4.0
tokenizers==0.21.0
tomlkit==0.13.2
torch==2.5.1
torchaudio==2.5.1
torchvision==0.20.1
tornado==6.4.2
tqdm==4.67.1
traitlets==5.14.3
transformers==4.48.1
triton==3.1.0
typer==0.15.1
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
uvicorn==0.34.0
uvloop==0.21.0
virtualenv==20.29.1
watchfiles==1.0.4
wcwidth==0.2.13
webencodings==0.5.1
websockets==14.2
wrapt==1.17.2
xformers==0.0.28.post3
xgrammar==0.1.11
yarg==0.1.9
yarl==1.18.3
zipp==3.21.0

Then create the main program, deepseek_r1_14B.py:

import gradio as gr
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig
)
import torch

# Quantization config to fit 24 GB of VRAM
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the model; change this path to the relative path of your models directory
model_path = './models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B'
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=model_path,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_path)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Set pad_token to eos_token

def generate_response(prompt, temperature=0.7, max_tokens=512):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=max_tokens,
        temperature=temperature,
        top_p=0.9,
        do_sample=True,
        use_cache=True,
        pad_token_id=tokenizer.eos_token_id,
    )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Build the Gradio UI
with gr.Blocks(theme=gr.themes.Soft(), title="DeepSeek-R1-14B Chat") as demo:
    gr.Markdown("# DeepSeek-R1-14B Chat")

    with gr.Row():
        with gr.Column():
            input_prompt = gr.Textbox(label="Prompt", lines=5)
            temp_slider = gr.Slider(0.1, 1.0, value=0.7, label="Temperature")
            max_token_slider = gr.Slider(128, 2048, value=512, step=128, label="Max new tokens")
            submit_btn = gr.Button('Generate', variant="primary")

        with gr.Column():
            output_text = gr.Textbox(label="Model response", interactive=False, lines=20)

    submit_btn.click(
        fn=generate_response,
        inputs=[input_prompt, temp_slider, max_token_slider],
        outputs=output_text
    )

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=8888, share=False)
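The `temperature`/`top_p` pair passed to `generate()` controls sampling: logits are divided by the temperature, then sampling is restricted to the smallest set of tokens whose cumulative probability reaches `top_p`. A toy, stdlib-only sketch of that logic (my own illustration, not the Hugging Face implementation):

```python
import math
import random

def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=None):
    """Toy temperature + nucleus (top-p) sampling over raw logits."""
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]   # temperature scaling
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]     # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:                              # smallest set whose
        kept.append(i)                           # cumulative prob >= top_p
        cum += probs[i]
        if cum >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass                      # sample within the nucleus
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a peaked distribution, a small `top_p` degenerates to greedy decoding, which is why lowering either knob makes the output more deterministic.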

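As a sanity check that the 4-bit config fits a 24 GB card, here is a back-of-envelope estimate of the weight memory (my own arithmetic; the KV cache, activations, and CUDA overhead come on top of this):

```python
# Rough weight-memory estimate for a ~14B-parameter model in 4-bit NF4.
params = 14_000_000_000          # ~14B parameters
bytes_per_param = 0.5            # 4 bits per weight
quant_overhead = 1.05            # rough allowance for quantization scales
weights_gib = params * bytes_per_param * quant_overhead / 2**30
print(f"~{weights_gib:.1f} GiB of weights")  # prints "~6.8 GiB of weights"
```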
Now you can run it:

conda create -n deepseek python=3.12.3
conda init

# Open a new terminal window, then:
conda activate deepseek

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

CUDA_VISIBLE_DEVICES=0 python deepseek_r1_14B.py
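The `CUDA_VISIBLE_DEVICES=0` prefix sets an environment variable for that one command, exposing only GPU 0 to the process. The same effect can be had from inside Python, as long as it happens before CUDA is initialized (a minimal illustration):

```python
import os

# Must be set before CUDA initializes (in practice, before importing
# torch); the single visible GPU then appears in the process as cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints "0"
```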

# Since this runs inside a WSL VM, open the app via the WSL IP address
ip addr show eth0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1
# Then browse to http://<wsl-ip>:8888
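The `ip addr` pipeline prints the eth0 address. If you prefer to stay in Python, a stdlib sketch of my own achieves the same by asking the kernel which local address would route outward (the UDP "connect" sends no packets):

```python
import socket

def wsl_ip():
    # connect() on a UDP socket only selects a route; no traffic is sent.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no default route; fall back to loopback
    finally:
        s.close()

print(wsl_ip())
```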

References:
Sharing code for deploying DeepSeek-R1-14B/32B on a 4090 with 24 GB of VRAM - 掘金
Installing and configuring WSL2 (Anaconda virtual environments, package updates, PyTorch, VSCode) - CSDN博客

posted @ 2025-02-07 16:17  肥仓鼠大魔王