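Large AI Model ChatGLM2-6B, Part 1: Setting Up the Base Environment
Hardware environment
CPU: i5-13600K
Memory: 64 GB
GPU: RTX 3090
Software environment
Windows 11 Pro 22H2
NVIDIA driver: 526.47
WSL2 Ubuntu 22.04
Installing nvidia-cuda-toolkit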
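Open Ubuntu in WSL2 and install the NVIDIA driver. (NVIDIA's CUDA on WSL guide recommends relying on the Windows driver alone and not installing a Linux display driver inside WSL2, so the driver commands below may be unnecessary there.)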
sudo apt update
sudo apt upgrade
sudo ubuntu-drivers devices
sudo apt install nvidia-driver-515
Check the GPU driver
nvidia-smi
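Install nvidia-cuda-toolkit
Official site: https://developer.nvidia.com/cuda-toolkit-archive
For the correspondence between CUDA Toolkit versions and NVIDIA driver versions, refer to the compatibility table in NVIDIA's CUDA Toolkit release notes.
Selected version: nvidia-cuda-toolkit 11.7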
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
sudo sh cuda_11.7.0_515.43.04_linux.run
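The driver/toolkit pairing can be checked with a few lines of shell. This is only a sketch: the helper name version_ge and the MIN_DRIVER variable are made up for illustration, and the threshold 515.43.04 is an assumption taken from the driver version embedded in the installer file name above. Inside WSL2, nvidia-smi reports the Windows driver version, so treat the result as a rough sanity check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Relies on GNU sort -V (available on Ubuntu) to order dotted versions.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the driver version bundled with the CUDA 11.7.0 Linux installer.
MIN_DRIVER="515.43.04"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask nvidia-smi for the driver version only (first GPU).
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$MIN_DRIVER"; then
        echo "driver $driver is at least $MIN_DRIVER"
    else
        echo "driver $driver is older than $MIN_DRIVER"
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

With the driver listed at the top of this post, version_ge 526.47 515.43.04 succeeds.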
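After the installation finishes, set the environment variables
sudo vim /etc/profile
# Add the following lines
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=$PATH:/usr/local/cuda/bin
# Save and exit vim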
source /etc/profile
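Because /etc/profile is sourced on every login, it can help to add the CUDA directories only when they are not already present. The following is a minimal sketch of that idea; path_append is a hypothetical helper name, and the directories are the ones used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the colon-separated list $1 with directory $2
# appended, unless $2 is already one of its entries.
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s' "$1" ;;          # already present, keep as is
        *)
            if [ -n "$1" ]; then
                printf '%s:%s' "$1" "$2"       # append to a non-empty list
            else
                printf '%s' "$2"               # list was empty
            fi
            ;;
    esac
}

export PATH="$(path_append "$PATH" /usr/local/cuda/bin)"
export LD_LIBRARY_PATH="$(path_append "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)"
```

Sourcing this repeatedly leaves PATH and LD_LIBRARY_PATH unchanged after the first time.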
Finally, check the version
nvcc -V
## Output like the following means nvidia-cuda-toolkit was installed successfully
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
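If this check needs to be scripted, the release number can be pulled out of the nvcc -V output. A small sketch, assuming the "release X.Y" wording shown above; nvcc_release and EXPECTED are names invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `nvcc -V` output on stdin and print the
# "release X.Y" number, e.g. 11.7.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

EXPECTED="11.7"   # the toolkit version chosen in this post

if command -v nvcc >/dev/null 2>&1; then
    release="$(nvcc -V | nvcc_release)"
    if [ "$release" = "$EXPECTED" ]; then
        echo "nvcc reports CUDA $release as expected"
    else
        echo "nvcc reports CUDA '$release', expected $EXPECTED"
    fi
else
    echo "nvcc not found; check that /usr/local/cuda/bin is on PATH"
fi
```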
Installing cuDNN
Official site: https://developer.nvidia.com/rdp/cudnn-download
Because nvidia-cuda-toolkit 11.7 above dates from around May 2022, choose a cuDNN release reasonably close to that date: cudnn-local-repo-ubuntu2204-8.5.0.96_1.0-1_amd64.deb
Install the cuDNN deb package
sudo dpkg -i cudnn-local-repo-ubuntu2204-8.5.0.96_1.0-1_amd64.deb
# Add the key as prompted
sudo apt-key add ...
sudo apt-get update
# List the installable packages
apt-cache search cudnn
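The output lists the available cuDNN packages.
# Then, based on the libcudnn8 package and version 8.5.0.96 listed there, combined with CUDA version 11.7, install the following packages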
sudo dpkg -i libcudnn8_8.5.0.96-1+cuda11.7_amd64.deb
sudo dpkg -i libcudnn8-dev_8.5.0.96-1+cuda11.7_amd64.deb
sudo dpkg -i libcudnn8-samples_8.5.0.96-1+cuda11.7_amd64.deb
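Verifying that cuDNN works
When cuDNN is installed from deb packages, sample programs are placed in /usr/src/cudnn_samples_v8; build the mnistCUDNN sample to verify the installation.
# Copy the cuDNN samples to the home directory
cp -r /usr/src/cudnn_samples_v8 $HOME/
# Change into the sample directory
cd $HOME/cudnn_samples_v8/mnistCUDNN/
# Build mnistCUDNN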
sudo make clean
sudo make
# Run mnistCUDNN
# If "Test passed!" appears, cuDNN was installed successfully
sudo ./mnistCUDNN
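[Figure: mnistCUDNN output ending with "Test passed!"]
The build may fail with an error because the system is missing some components (the FreeImage library). Install them with the following command, then run make again.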
sudo apt-get install libfreeimage3 libfreeimage-dev
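Besides running the sample, the installed cuDNN version can be read from its version header. This is a sketch under assumptions: cuDNN 8 defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL in cudnn_version.h, and the path /usr/include/cudnn_version.h used here is a guess at where the deb packages expose it on this system. The helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a cudnn_version.h file on stdin and print
# the version as MAJOR.MINOR.PATCHLEVEL.
cudnn_version_from_header() {
    awk '
        /^#define CUDNN_MAJOR /      { major = $3 }
        /^#define CUDNN_MINOR /      { minor = $3 }
        /^#define CUDNN_PATCHLEVEL / { patch = $3 }
        END { if (major != "") printf "%s.%s.%s\n", major, minor, patch }
    '
}

# Assumption: header location after the deb install; adjust if it differs.
HEADER="/usr/include/cudnn_version.h"

if [ -f "$HEADER" ]; then
    echo "cuDNN version: $(cudnn_version_from_header < "$HEADER")"
else
    echo "$HEADER not found; the header may live elsewhere on this system"
fi
```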
Installing conda
Download the conda installer
wget https://mirrors.bfsu.edu.cn/anaconda/archive/Anaconda3-2022.10-Linux-x86_64.sh --no-check-certificate
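Install Anaconda
sudo bash Anaconda3-2022.10-Linux-x86_64.sh
Then keep pressing Enter and type yes whenever the installer asks for it. The final install location is /root/anaconda3, and the installer automatically adds the following block to ~/.bashrc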
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/root/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/root/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/root/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/root/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
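After that, open a new shell (or run source ~/.bashrc); if conda list prints the installed packages, the installation succeeded.
Installing PyTorch
PyTorch version compatibility: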
https://pytorch.org/get-started/previous-versions/
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7 -c pytorch -c nvidia
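Once the install finishes, it is worth confirming that PyTorch actually sees the GPU. A minimal sketch: torch_report is a hypothetical wrapper name, python3 is assumed to be the interpreter of the active conda environment, and the three values printed come from torch.__version__, torch.version.cuda and torch.cuda.is_available().

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print the PyTorch version, the CUDA version it was
# built for, and whether a CUDA device is usable.
torch_report() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
        return 0
    fi
    python3 - <<'PY'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    print("torch version:", torch.__version__)
    print("built for CUDA:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
PY
}

torch_report
```

On a working setup the last line should read "CUDA available: True".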
Miscellaneous
Reference: versions in the Alibaba Cloud image for large-model development and training
dsw-registry-vpc.cn-shanghai.cr.aliyuncs.com/pai/pytorch:1.12-gpu-py39-cu113-ubuntu20.04
/mnt/workspace/ChatGLM2-6B> python --version
Python 3.9.15
/mnt/workspace/ChatGLM2-6B> nvidia-smi
Thu Jan 18 09:22:02 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01    Driver Version: 470.82.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:09.0 Off |                    0 |
| N/A   32C    P0    24W / 300W |      0MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
/mnt/workspace/ChatGLM2-6B> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
/mnt/workspace/ChatGLM2-6B> conda --version
conda 23.3.1
/mnt/workspace/ChatGLM2-6B> pytorch --version
bash: pytorch: command not found
/mnt/workspace/ChatGLM2-6B> pip show torch
Name: torch
Version: 1.12.1+cu113
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/pai/lib/python3.9/site-packages
Requires: typing-extensions
Required-by: accelerate, fairscale, pytorch-metric-learning, sailfish, thop, timm, torchaudio, torchvision
Running ChatGLM2-6B on the Alibaba Cloud server
List the modules installed with pip
/mnt/workspace/ChatGLM2-6B> pip list
Package Version
------------------------------ --------------------
absl-py 1.4.0
accelerate 0.26.1
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.8.4
aiosignal 1.3.1
albumentations 1.3.0
alibabacloud-credentials 0.3.2
alibabacloud-endpoint-util 0.0.3
alibabacloud-gateway-spi 0.0.1
alibabacloud-openapi-util 0.2.1
alibabacloud-pai-dlc20201203 1.0.0
alibabacloud-paistudio20220112 1.0.9
alibabacloud-tea 0.3.1
alibabacloud-tea-openapi 0.3.7
alibabacloud-tea-util 0.3.8
alibabacloud-tea-xml 0.0.2
alipai 0.1.7
aliyun-log-python-sdk 0.8.5
aliyun-python-sdk-core 2.13.36
aliyun-python-sdk-kms 2.16.0
aliyun-python-sdk-sts 3.1.1
altair 5.2.0
annotated-types 0.6.0
anyio 3.6.2
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
astor 0.8.1
astroid 2.15.2
asttokens 2.2.1
astunparse 1.6.3
async-timeout 4.0.2
attrs 22.2.0
autopep8 1.7.0
backcall 0.2.0
beautifulsoup4 4.12.2
bleach 6.0.0
blinker 1.7.0
boltons 23.0.0
brotlipy 0.7.0
cachetools 5.3.0
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.0.4
click 8.1.7
cloudpickle 2.2.1
colorama 0.4.6
comm 0.1.3
common-io 0.4.0+tunnel
conda 23.3.1
conda-content-trust 0.1.3
conda-package-handling 1.9.0
configparser 5.3.0
contextlib2 21.6.0
contourpy 1.0.7
cpm-kernels 1.0.11
crcmod 1.7
cryptography 37.0.4
cvxopt 1.2.6
cycler 0.11.0
Cython 0.29.33
dataclasses 0.6
dateparser 1.1.8
debugpy 1.6.7
decorator 5.1.1
decord 0.6.0
defusedxml 0.7.1
descartes 1.1.0
dill 0.3.6
dnspython 2.3.0
eas-prediction 0.12
easy-rec 0.1.6
einops 0.6.0
elastic-transport 8.4.0
elasticsearch 8.7.0
executing 1.2.0
fairscale 0.4.13
fastapi 0.109.0
fastjsonschema 2.16.3
ffmpy 0.3.1
filelock 3.11.0
fire 0.5.0
flake8 6.0.0
fonttools 4.39.3
fqdn 1.5.1
frozenlist 1.3.3
fsspec 2023.12.2
future 0.18.3
fvcore 0.1.5.post20221221
gast 0.4.0
gitdb 4.0.11
GitPython 3.1.41
google-auth 2.17.2
google-auth-oauthlib 1.0.0
googleapis-common-protos 1.59.0
gradio 3.39.0
gradio_client 0.6.1
graphviz 0.20.1
grpcio 1.53.0
h11 0.14.0
h5py 3.8.0
httpcore 1.0.2
httpx 0.26.0
huggingface-hub 0.20.2
hyperopt 0.1.2
idna 3.4
imageio 2.27.0
imgaug 0.4.0
importlib-metadata 6.2.0
importlib-resources 5.12.0
iopath 0.1.10
ipykernel 6.22.0
ipython 8.12.0
ipython-genutils 0.2.0
ipywidgets 8.0.6
isoduration 20.11.0
isort 5.12.0
jedi 0.18.2
Jinja2 3.1.2
jmespath 0.10.0
joblib 1.2.0
json-tricks 3.16.1
jsonpatch 1.32
jsonpointer 2.1
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 8.1.0
jupyter-console 6.6.3
jupyter_core 5.3.0
jupyter-events 0.6.3
jupyter_server 2.5.0
jupyter_server_terminals 0.4.4
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
kiwisolver 1.4.4
latex2mathml 3.77.0
lazy_loader 0.2
lazy-object-proxy 1.6.0
linkify-it-py 2.0.2
llvmlite 0.39.1
lmdb 1.4.1
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.5.2
matplotlib-inline 0.1.6
mccabe 0.7.0
mdit-py-plugins 0.3.3
mdtex2html 1.2.0
mdurl 0.1.2
mistune 2.0.5
mmcv-full 1.7.0
mpmath 1.3.0
multidict 6.0.4
mysql-connector-python 8.0.32
nbclassic 0.5.5
nbclient 0.7.3
nbconvert 7.3.0
nbformat 5.8.0
nest-asyncio 1.5.6
networkx 3.1
notebook 6.5.4
notebook_shim 0.2.2
numba 0.56.4
numpy 1.23.5
nuscenes-devkit 1.1.10
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-dali-cuda110 1.24.0
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.2
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.6.0.66
opt-einsum 3.3.0
orjson 3.9.10
oss2 2.17.0
packaging 23.0
pai-automl 0.0.4
pai-easycv 0.9.0
pai-nni 2.6
pandas 2.0.0
pandocfilters 1.5.0
parso 0.8.3
patsy 0.5.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 23.0.1
platformdirs 3.2.0
plotly 5.14.1
pluggy 1.0.0
portalocker 2.7.0
prettytable 3.6.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pyaes 1.6.1
pyarrow 14.0.2
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.10.1
pybind11-global 2.10.1
pyclipper 1.3.0.post4
pycocotools 2.0.6
pycodestyle 2.10.0
pycosat 0.6.4
pycparser 2.21
pycryptodome 3.17
pydantic 2.5.3
pydantic_core 2.14.6
pydeck 0.8.1b0
pydub 0.25.1
pyflakes 3.0.1
Pygments 2.14.0
pylint 2.17.2
pymongo 4.3.3
pyodps 0.11.3.1
pyOpenSSL 22.0.0
pyparsing 3.0.9
pyquaternion 0.9.9
pyrsistent 0.19.3
PySocks 1.7.1
python-dateutil 2.8.2
python-json-logger 2.0.7
python-multipart 0.0.6
PythonWebHDFS 0.2.3
pytorch-metric-learning 2.1.0
pytz 2023.3
pytz-deprecation-shim 0.1.0.post0
PyWavelets 1.4.1
PyYAML 6.0
pyzmq 25.0.2
qtconsole 5.4.2
QtPy 2.3.1
qudida 0.0.4
rapidfuzz 2.15.0
regex 2023.3.23
requests 2.28.1
requests-oauthlib 1.3.1
responses 0.23.1
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.0
rsa 4.9
ruamel.yaml 0.17.21
ruamel.yaml.clib 0.2.6
safetensors 0.4.1
sailfish 1.0.1
schema 0.7.5
scikit-image 0.20.0
scikit-learn 1.2.2
scipy 1.9.1
seaborn 0.12.2
semantic-version 2.10.0
Send2Trash 1.8.0
sentencepiece 0.1.99
setuptools 65.6.3
Shapely 1.8.5
shellingham 1.5.4
simplejson 3.19.1
six 1.16.0
sklearn 0.0.post1
smmap 5.0.1
sniffio 1.3.0
soupsieve 2.4
sse-starlette 1.8.2
stack-data 0.6.2
starlette 0.35.1
statsmodels 0.13.5
streamlit 1.30.0
sympy 1.12
tabulate 0.9.0
tenacity 8.2.2
tensorboard 2.12.1
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
termcolor 2.2.0
terminado 0.17.1
thop 0.1.1.post2209072238
threadpoolctl 3.1.0
tifffile 2023.3.21
timm 0.5.4
tinycss2 1.2.1
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.0
torch 1.12.1+cu113
torchaudio 0.12.1
torchvision 0.13.1+cu113
tornado 6.2
tqdm 4.65.0
training-utils 1.0.6
traitlets 5.9.0
transformers 4.30.2
triton 2.1.0
typer 0.9.0
types-PyYAML 6.0.12.9
typing_extensions 4.9.0
tzdata 2023.3
tzlocal 4.3
uc-micro-py 1.0.2
uri-template 1.2.0
urllib3 1.26.13
uvicorn 0.25.0
validators 0.22.0
watchdog 3.0.0
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.5.1
websockets 11.0.1
Werkzeug 2.2.3
wget 3.2
wheel 0.38.4
widgetsnbextension 4.0.7
wrapt 1.15.0
xgboost 1.7.5
xlrd 2.0.1
xtcocotools 1.13
yacs 0.1.8
yapf 0.32.0
yarl 1.8.2
zipp 3.15.0
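With a listing this long, it is easier to look up just the packages of interest. The sketch below filters pip list output by name. pkg_version is a hypothetical helper, and the package names in the loop are an assumption about what ChatGLM2-6B relies on, picked from the listing above; check them against the project's requirements file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `pip list` output on stdin and print the version
# of the package named $1 (case-insensitive); prints nothing if absent.
pkg_version() {
    awk -v name="$1" 'tolower($1) == tolower(name) { print $2; exit }'
}

if command -v pip >/dev/null 2>&1; then
    listing="$(pip list 2>/dev/null)"
    # Assumption: packages relevant to running ChatGLM2-6B.
    for pkg in torch transformers sentencepiece cpm-kernels accelerate gradio mdtex2html; do
        version="$(printf '%s\n' "$listing" | pkg_version "$pkg")"
        echo "$pkg: ${version:-not installed}"
    done
else
    echo "pip not found"
fi
```

Run against the environment above, this reports for example transformers 4.30.2 and torch 1.12.1+cu113.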