本地部署DeepSeek-R1-AWQ

一、部署环境准备

系统信息：

主机名为 10-200-3-23
IP 地址为 10.200.3.23
操作系统为 ubuntu 22.04
配备 8 卡 A100。

二、驱动与桥接器安装

安装 gcc

执行命令
apt-get update -y 
apt install build-essential -y

安装驱动

下载驱动 
wget https://us.download.nvidia.com/tesla/560.35.03/NVIDIA-Linux-x86_64-560.35.03.r un
运行安装命令 
#注意在交互式安装时确认安装 32 位兼容库并 Rebuild initramfs。
sh NVIDIA-Linux-x86_64-560.35.03.run

安装桥接器

确保桥接器版本与驱动版本完全一致（包括次版本），下载桥接器 
# 确保桥接器版本与驱动版本完全⼀致（包括次版本）
wget https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2204/x86_64/nvidia-fabricmanager-560_560.35.03-1_amd64.deb
dpkg -i nvidia-fabricmanager-560_560.35.03-1_amd64.deb
systemctl enable nvidia-fabricmanager --now
systemctl status nvidia-fabricmanager
#重启服务器，配置持久模式
nvidia-smi -pm 1

三、docker 安装与配置

安装 docker

# step 1: 安装必要的⼀些系统⼯具
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
# step 2: 信任 Docker 的 GPG 公钥
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Step 3: 写入软件源信息
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg]
https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Step 4: 安装Docker
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# 安装指定版本的Docker-CE:
# Step 1: 查找Docker-CE的版本:
# apt-cache madison docker-ce
# docker-ce | 17.03.1~ce-0~ubuntu-xenial | https://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages
# docker-ce | 17.03.0~ce-0~ubuntu-xenial | https://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages
# Step 2: 安装指定版本的Docker-CE: (VERSION例如上⾯的17.03.1~ce-0~ubuntu-xenial)
# sudo apt-get -y install docker-ce=[VERSION]

配置 docker 使⽤ nvidia-runtime-toolkit：

curl -fsSL https://mirrors.ustc.edu.cn/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://mirrors.ustc.edu.cn/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://nvidia.github.io#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://mirrors.ustc.edu.cn#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt update -y
apt install -y nvidia-container-toolkit nvidia-container-runtime

在/etc/docker/daemon.json 中添加如下配置：

{
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
},
}

重启docker:

systemctl restart docker

四、镜像与模型获取

1. 获取镜像

⽣成 vllm 镜像的 dockerfile 示例如下：

FROM docker.m.daocloud.io/nvidia/cuda:12.6.3-runtime-ubuntu22.04
RUN set -ex; \
apt update -y; \
apt install -y python3.10 python3-pip; \
pip install vllm==v0.7.2

构建镜像

docker build -t vllm:v0.7.2

获取open-webui镜像：

docker pull ghcr.m.daocloud.io/open-webui/open-webui:main

2. 获取模型

deepseek-r1-awq 是 671b 的参数，但是精度只有 int4，单台 8 卡 A100 可以完成部署。
下载模型：

mkdir /data
cd /data
git lfs install
git clone https://www.modelscope.cn/cognitivecomputations/DeepSeek-R1-awq.git

五、项目部署

部署 deepseek

docker run -d --runtime nvidia \
--gpus all \
-v /data:/mnt/models \
-p 12345:12345 \
--ipc=host \
hub.wanjiedata.com/models/vllm:v0.7.2 \
python3 -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 12345 --max-model-len 65536 --trust-remote-code --tensor-parallel-size 8 --quantization moe_wna16
--gpu-memory-utilization 0.97 --kv-cache-dtype fp8_e5m2 --calculate-kv-scales --served-model-name deepseek-reasoner --model /mnt/models/DeepSeek-R1-AWQ

部署 open-webui

docker run -d -p 3030:8080 \
-e ENABLE_OLLAMA_API=false \
-e OPENAI_API_KEY=NULL \
-e OPENAI_API_BASE_URL=http://10.200.3.23:12345/v1 \
-e ENABLE_RAG_WEB_LOADER_SSL_VERIFICATION=false \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always ghcr.m.daocloud.io/open-webui/open-webui:main

posted @ 2025-02-22 13:38 西门运维阅读(795) 评论(0) 收藏举报

刷新页面返回顶部

Jack He

本地部署DeepSeek-R1-AWQ

一、部署环境准备

二、驱动与桥接器安装

三、docker 安装与配置

四、镜像与模型获取

1. 获取镜像

2. 获取模型

五、项目部署

公告