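[Machine Learning] Deep Learning with Caffe, Part 1: Software Setup and Testing
Building and configuring Caffe is a real headache; I've lost count of how many times I've tried~~~
I reinstalled the system seven or eight times, which at least made me quite familiar with the common Linux commands~~~
I'm a perfectionist~~~ When something goes wrong at one step, I have to fix it and then start over from scratch, because I worry that one small mistake will affect other important parts later... (If you feel the same, quietly raise your paw~~~^_^~~~)
After several more rounds of tinkering and reading many blog posts, I've put together a complete installation and configuration procedure!
Let's begin:
- A working network connection is all you need here; no need to agonize over it.
- You need to replace the default driver and install CUDA. However, if your GPU's CUDA compute capability is below 3.0, skip this part.
The driver installation may fail with an error saying that the nouveau kernel driver has not been disabled. To fix it, edit the blacklist file: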
sudo gedit /etc/modprobe.d/blacklist.conf
Append two lines at the end:
blacklist nouveau
options nouveau modeset=0
Then run:
sudo update-initramfs -u
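Then reboot. After rebooting you will notice that the fonts look larger,
which means the default driver has been disabled. Now retry the installation.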
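Before retrying, it can help to confirm that nouveau is really gone. Here is a minimal Python sketch (not part of the original procedure) that scans text in the format of /proc/modules, where the first column is the module name. The helper name nouveau_loaded and the sample text are made up for illustration.

```python
def nouveau_loaded(modules_text):
    """Return True if a module named 'nouveau' appears in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the module name.
        if fields and fields[0] == "nouveau":
            return True
    return False


# Illustrative sample only; on a real machine, pass the contents of /proc/modules.
sample = "nvidia 11489280 40 - Live 0x0000000000000000\n"
print("nouveau loaded:", nouveau_loaded(sample))
```

On a real system you would pass in open("/proc/modules").read() instead of the sample string.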
sudo chmod +x NVIDIA-Linux-x86_64-367.44.run
sudo ./NVIDIA-Linux-x86_64-367.44.run
sudo dpkg -i cuda-repo-ubuntu1604-8-0-rc_8.0.27-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo dpkg -i cuda-misc-headers-8-0_8.0.27.1-1_amd64.deb
1. Declare the environment variables:
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
2. Edit the profile file:
sudo gedit /etc/profile
3. Append to the end of the file:
export PATH=/usr/local/cuda/bin:$PATH
4. Create the linker configuration file:
sudo gedit /etc/ld.so.conf.d/cuda.conf
5. Add the following to the opened file:
/usr/local/cuda/lib64
6. Finally, run:
sudo ldconfig
7. Run the test sample:
cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
sudo make
sudo ./deviceQuery
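The GPU information is then displayed, which means the installation succeeded.
8. In addition, the nvidia-smi command directly lists the GPU devices that support CUDA.
- cuDNN acceleration is set up here. Keep in mind the compute capability issue mentioned earlier; it comes up again below!!!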
cd cuda
sudo cp ./include/cudnn.h /usr/local/cuda/include/ # copy the header file
sudo cp ./lib64/lib* /usr/local/cuda/lib64/ # copy the shared libraries
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5 # remove the existing links
sudo ln -s libcudnn.so.5.0.5 libcudnn.so.5 # create the symlink
sudo ln -s libcudnn.so.5 libcudnn.so # create the symlink
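The two ln -s commands build a chain: libcudnn.so points to libcudnn.so.5, which points to the real file libcudnn.so.5.0.5. As a rough illustration, the Python sketch below recreates that layout in a scratch directory and walks the chain one hop at a time. It assumes a filesystem with symlink support; link_chain and demo_chain are made-up helper names, and the library file is just an empty stand-in.

```python
import os
import tempfile


def link_chain(path):
    """Follow symlinks from path one hop at a time and return every name visited."""
    chain = [os.path.basename(path)]
    while os.path.islink(path):
        target = os.readlink(path)
        # Relative link targets are resolved against the link's own directory.
        path = os.path.join(os.path.dirname(path), target)
        chain.append(os.path.basename(path))
    return chain


def demo_chain():
    """Recreate the three-name layout in a scratch directory and return the chain."""
    with tempfile.TemporaryDirectory() as lib_dir:
        real = os.path.join(lib_dir, "libcudnn.so.5.0.5")
        open(real, "w").close()  # empty stand-in for the real library file
        os.symlink("libcudnn.so.5.0.5", os.path.join(lib_dir, "libcudnn.so.5"))
        os.symlink("libcudnn.so.5", os.path.join(lib_dir, "libcudnn.so"))
        return link_chain(os.path.join(lib_dir, "libcudnn.so"))


print(demo_chain())
```

Calling link_chain("/usr/local/cuda/lib64/libcudnn.so") on the real installation should list the same three names if the links were created as above.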
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
Check the CUDA compute capability:
sudo /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
2.1
In Caffe's Makefile.config, find and modify:
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_21,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_50,code=compute_50
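I'm not sure whether this change actually makes a difference, but cuDNN acceleration reportedly requires a CUDA compute capability of 3.0 or higher!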
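To make the 3.0 threshold concrete, here is a small Python sketch that pulls the capability out of deviceQuery-style output and compares it with the threshold. The helper names and the sample line are illustrative; the regular expression assumes the "CUDA Capability Major/Minor version number" line that deviceQuery prints.

```python
import re

CUDNN_MIN_CAPABILITY = (3, 0)  # threshold quoted in the text above


def parse_capability(device_query_output):
    """Extract (major, minor) from deviceQuery's 'CUDA Capability' line, or None."""
    match = re.search(
        r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)",
        device_query_output,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def can_use_cudnn(capability):
    """True when the (major, minor) capability meets the 3.0 threshold."""
    return capability >= CUDNN_MIN_CAPABILITY


# Illustrative sample line with the 2.1 value reported above.
sample = "  CUDA Capability Major/Minor version number:    2.1\n"
capability = parse_capability(sample)
print(capability, can_use_cudnn(capability))
```

With a capability of 2.1, as on my card, the check comes out False, which matches the cuDNN failure described later.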
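- Pay attention to the OpenCV version here: 2.4.13 is the safest choice; other versions produced errors for me!!!
Make sure numpy is installed before compiling, otherwise cv2.so will not be generated at the end.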
sudo apt-get install python-numpy python3-numpy
You may hit this error:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
The cause is a g++ version that is too new; add the following within the first few lines of CMakeLists.txt:
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_FORCE_INLINES")
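Then rerun the cmake command from step 2.
Linux distributions usually split a library's header files and its pkg-config files into a separate xxx-dev(el) package.
Taking Python as an example, you need python-dev in the following cases:
- You are installing a Python library from outside the distribution's repositories, and it contains C/C++ files that call the Python API and have to be compiled.
- You are compiling a program of your own that links against libpythonXX.(a|so).
(Note: the cases above do not include calling libpython.so directly through ctypes/ffi or bare dlsym.)
If you only use Python normally, or install Python libraries from the distribution's repositories, you do not need python-dev.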
cython>=0.19.2
numpy>=1.7.1
scipy>=0.13.2
scikit-image>=0.9.3
matplotlib>=1.3.1
ipython>=3.0.0
h5py>=2.2.0
leveldb>=0.191
networkx>=1.8.1
nose>=1.3.0
pandas>=0.12.0
python-dateutil>=1.4,<2
protobuf>=2.5.0
python-gflags>=2.0
pyyaml>=3.10
Pillow>=2.3.0
six>=1.1.0
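Each line in this list is a package name followed by one or more version constraints. As a rough illustration of how to read them, the Python sketch below parses a line and checks a version string against it. It is a simplification, not a real resolver: it only handles purely numeric dotted versions and compares them as integer tuples, and the function names are made up. pip's own version handling is more thorough.

```python
import re


def parse_requirement(line):
    """Split 'name>=1.4,<2' into ('name', [('>=', (1, 4)), ('<', (2,))])."""
    name, rest = re.match(r"^([A-Za-z0-9_.-]+)(.*)$", line.strip()).groups()
    constraints = []
    for part in rest.split(","):
        part = part.strip()
        if not part:
            continue
        operator, version = re.match(r"^(>=|<=|==|>|<)\s*(.+)$", part).groups()
        constraints.append((operator, tuple(int(piece) for piece in version.split("."))))
    return name, constraints


def satisfies(installed, constraints):
    """Check a purely numeric version string such as '1.11.0' against the constraints."""
    have = tuple(int(piece) for piece in installed.split("."))
    checks = {
        ">=": lambda want: have >= want,
        "<=": lambda want: have <= want,
        "==": lambda want: have == want,
        ">": lambda want: have > want,
        "<": lambda want: have < want,
    }
    return all(checks[operator](want) for operator, want in constraints)


name, constraints = parse_requirement("python-dateutil>=1.4,<2")
print(name, satisfies("1.5", constraints), satisfies("2.0", constraints))
```

The python-dateutil line is the only one in the list with an upper bound, so it is the one most likely to be violated by a newer preinstalled package.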
- Setting up the Matlab engine is a very important step here
PASS: protobuf-test
PASS: protobuf-lazy-descriptor-test
PASS: protobuf-lite-test
PASS: google/protobuf/compiler/zip_output_unittest.sh
PASS: google/protobuf/io/gzip_stream_unittest.sh
=================================
Testsuite summary for Protocol Buffers 2.5.0
=================================
# TOTAL: 5
# PASS: 5
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
=================================
- The Makefile settings here are very important
CPU_ONLY := 1
USE_OPENCV := 0
USE_LEVELDB := 0
USE_LMDB := 0
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1
CUSTOM_CXX := g++
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
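You may hit this error: Check failed: status == CUDNN_STATUS_SUCCESS (6 vs. 0)
Status 6 is CUDNN_STATUS_ARCH_MISMATCH in cudnn.h, meaning the GPU is not capable enough. cuDNN only supports GPUs with CUDA Capability 3.0 or higher, so cuDNN acceleration cannot be used on this card, and USE_CUDNN := 1 has to be commented out in Makefile.config.
Be sure to check the compute capability of your own GPU hardware!!!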
[----------] Global test environment tear-down
[==========] 996 tests from 141 test cases ran. (45874 ms total)
[ PASSED ] 996 tests.
- The pycaffe interface is very important; make sure it is configured and tested properly!!! (Build Caffe first, then build the pycaffe interface.)
LD -o .build_release/lib/libcaffe.so.1.0.0-rc3
CXX/LD -o python/caffe/_caffe.so python/caffe/_caffe.cpp
touch python/caffe/proto/__init__.py
PROTOC (python) src/caffe/proto/caffe.proto
Downloading...
--2016-10-07 23:44:11-- http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Resolving yann.lecun.com (yann.lecun.com)... 128.122.47.89
Connecting to yann.lecun.com (yann.lecun.com)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: ‘train-images-idx3-ubyte.gz’
train-images-idx3-ubyte.gz 100%[=====================================>] 9.45M 39.5KB/s in 2m 42s
2016-10-07 23:46:53 (59.9 KB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]
Creating lmdb...
I1007 23:47:04.655964 18706 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I1007 23:47:04.656126 18706 convert_mnist_data.cpp:88] A total of 60000 items.
I1007 23:47:04.656134 18706 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I1007 23:47:09.992278 18706 convert_mnist_data.cpp:108] Processed 60000 files.
I1007 23:47:10.043660 18708 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I1007 23:47:10.043848 18708 convert_mnist_data.cpp:88] A total of 10000 items.
I1007 23:47:10.043862 18708 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I1007 23:47:10.859005 18708 convert_mnist_data.cpp:108] Processed 10000 files.
Done.
USE_LEVELDB := 1
USE_LMDB := 1
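2. Modify the configuration
Edit the .prototxt configuration files in this directory.
Edit ./examples/mnist/lenet_solver.prototxt
Go to the last line, solver_mode: GPU, and change GPU to CPU, so that the first test runs on the CPU.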
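If you would rather not edit the file by hand, the same change can be sketched in a few lines of Python. This is only an illustration working on the text of the solver file; use_cpu_solver is a made-up helper name, and the sample string stands in for the real file contents.

```python
import re


def use_cpu_solver(prototxt_text):
    """Return the solver text with 'solver_mode: GPU' switched to CPU."""
    return re.sub(
        r"^(\s*solver_mode:\s*)GPU\b",
        r"\g<1>CPU",
        prototxt_text,
        flags=re.MULTILINE,
    )


# Stand-in for the last lines of lenet_solver.prototxt.
sample = 'snapshot_prefix: "examples/mnist/lenet"\nsolver_mode: GPU\n'
print(use_cpu_solver(sample))
```

To apply it for real, read ./examples/mnist/lenet_solver.prototxt, pass its contents through the function, and write the result back. A line that already says solver_mode: CPU is left untouched.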
3. Run
I1007 23:53:09.915892 18795 caffe.cpp:210] Use CPU.
I1007 23:53:09.916203 18795 solver.cpp:48] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: CPU
net: "examples/mnist/lenet_train_test.prototxt"
train_state {
level: 0
stage: ""
}
I1008 00:12:51.708220 18795 sgd_solver.cpp:106] Iteration 9800, lr = 0.00599102
I1008 00:13:02.717388 18795 solver.cpp:228] Iteration 9900, loss = 0.00611393
I1008 00:13:02.717483 18795 solver.cpp:244] Train net output #0: loss = 0.00611391 (* 1 = 0.00611391 loss)
I1008 00:13:02.717496 18795 sgd_solver.cpp:106] Iteration 9900, lr = 0.00596843
I1008 00:13:14.016697 18795 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
I1008 00:13:14.025446 18795 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I1008 00:13:14.084300 18795 solver.cpp:317] Iteration 10000, loss = 0.00241856
I1008 00:13:14.084349 18795 solver.cpp:337] Iteration 10000, Testing net (#0)
I1008 00:13:21.108484 18795 solver.cpp:404] Test net output #0: accuracy = 0.9905
I1008 00:13:21.108542 18795 solver.cpp:404] Test net output #1: loss = 0.0295916 (* 1 = 0.0295916 loss)
I1008 00:13:21.108553 18795 solver.cpp:322] Optimization Done.
I1008 00:13:21.108559 18795 caffe.cpp:254] Optimization Done.
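The lr values in the log follow from the solver parameters printed at the start. With lr_policy "inv", Caffe computes the rate as base_lr * (1 + gamma * iter) ^ (-power). The short Python sketch below recomputes the two logged rates as a sanity check; the function name is made up, and the parameters and logged values are copied from the output above.

```python
def inv_learning_rate(base_lr, gamma, power, iteration):
    """Caffe's "inv" policy: base_lr * (1 + gamma * iteration) ** (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Solver values (base_lr, gamma, power) and logged rates copied from the log above.
for iteration, logged in [(9800, 0.00599102), (9900, 0.00596843)]:
    computed = inv_learning_rate(0.01, 0.0001, 0.75, iteration)
    print(iteration, computed, logged)
```

The computed values should agree with the logged ones to about six significant digits, which is the precision the log prints.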