caffe学习--使用caffe中的imagenet对自己的图片进行分类训练(超级详细版) -----linux
版权声明:本文为博主原创文章,未经博主允许不得转载。
因为自己在网络上查到的资料对于一个新手来说虽然指明了方向,但是在细节上没有给出很好的实例,因此我把自己训练的过程记录下来。
【实验环境】
物理内存:64G Free:7.5G CPU个数:3,单个CPU物理核数:8
备注:具有GPU运算能力
【实验目标】
使用自己的图片集,以及caffe框架,对imagenet进行训练,得到自己的model。
【前期准备】
1. 安装并配置caffe环境
【实验过程】
1. 数据集准备
获取训练图片集与验证图片集,并产生train.txt与val.txt,内容为图片路径与分类标签;将图片进行大小重设,设置为256*256大小;使用create_imagenet.sh脚本将2组图片集转换为lmbp格式。
2. 计算图像均值
使用make_imagenet_mean.sh计算图像均值,产生imagenet_mean.binaryproto文件。
3. 设置网络参数
拷贝caffe-master/model/bvlc_reference_caffenet中的文件,修改train_val.prototxt,solver.prototxt中的运行参数,并进行路径的修改;拷贝caffe_master/examples/imagenet中的train_caffnet.sh文件,对路径进行修改。
4. 运行train_caffnet.sh
【实验过程详细版】
备注一下目录的情况,这样比较调理啦:
Caffe根目录:caffe_root=/home/james/caffe/
图片类数据:caffe_root/data/mydata
命令参数类数据:caffe_root/examples/mytask
注:默认我们手动添加的除图片以及.txt之外的文件都属于命令参数类数据,运行的时候注意路径就好,另外,我门在实验的时候换了别人的电脑,因此存在caffe根路径前后不一致的状况,大家注意一下就好。
1. 数据集准备
a. 准备训练图片集以及验证图片集
新建caffe_root/data/mydata,分别将图片集放置于caffe_root/data/mydata/train与caffe_root/data/mydata/val下面
b. 准备图片清单
在caffe_root/data/mydata下面新建两个文件train.txt与val.txt,train.txt中的内容为:
1.jpg 7
2.jpg7
3.jpg 7
…
以上格式为图片名称+空格+类标(数字)的格式,val.txt的格式也是一样的(同样需要类标)。
此步可以使用create_filelist.sh进行批量添加图片路径至train.txt。create_filelist.sh内容需要按照自身图片的名称与类标情况进行修改,并持续运行(因为是在文件后面追加)内容如下:
#!/usr/bin/env sh
#!/bin/bash
DATA=/home/james/caffe/data/mydata/val
MY=/home/james/caffe/data/mydata
for i in {3122..3221}
do
echo $i.jpg 3 >> $MY/val.txt
done
echo "All done"
以上命令意思是,在val文件夹下面的图片中,名称为3122.jpg至3221.jpg的图片都是第3类,因此就会在val.txt写入:
3122.jpg 3
3123.jpg 3
…
注意:此时可能会报出bad loop variable的错误,这是由于Ubuntu bash的版本的原因,可以自行查看如何解决。
c. 调整图片大小至256*256
因为之前没有仔细看caffe的相关文件,后来才知道可以使用之自动调整大小,因此此步采用的是自己调用命令进行调整大小。如果不调整图片大小的话,在运行后面命令的时候是会报错的。
可以使用convert256.sh进行转换。注意,该命令中用到了imagemagick工具,因此如果自己没有安装的话,还需要安装该工具(命令为:sudo apt-get install imagemagick)。convert256.sh内容如下:
for name in/home/james/caffe/data/mydata/train/*.jpg; do
convert -resize 256x256\! $name $name
done
d. 构建图片数据库
要让Caffe进行图片的训练,必须有图片数据库,并且也是使用其作为输入,而非直接使用图片作为输入。使用create_imagenet.sh脚本将train与val的2组图片集转换为lmbp格式。create_imagenet.sh内容如下:
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train +val data dirs
EXAMPLE=/home/james/caffe/examples/mytask
DATA=/home/james/caffe/data/mydata
TOOLS=/home/james/caffe/build/tools
TRAIN_DATA_ROOT=/home/james/caffe/data/mydata/train/
VAL_DATA_ROOT=/home/james/caffe/data/mydata/val/
# Set RESIZE=true to resize the images to256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
RESIZE_HEIGHT=256
RESIZE_WIDTH=256
else
RESIZE_HEIGHT=0
RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ];then
echo "Error: TRAIN_DATA_ROOT is not a path to a directory:$TRAIN_DATA_ROOT"
echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to thepath" \
"where the ImageNet training data is stored."
exit 1
fi
if [ ! -d "$VAL_DATA_ROOT" ]; then
echo "Error: VAL_DATA_ROOT is not a path to a directory:$VAL_DATA_ROOT"
echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to thepath" \
"where the ImageNet validation data is stored."
exit 1
fi
echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset\
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$TRAIN_DATA_ROOT \
$DATA/train.txt \
$EXAMPLE/ilsvrc12_train_lmdb
echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset\
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$VAL_DATA_ROOT \
$DATA/val.txt \
$EXAMPLE/ilsvrc12_val_lmdb
echo "Done."
注:将其中的地址均修改为自己的对应地址,不是地址的就不要强行修改啦。
2. 计算图像均值
据说计算图像均值之后的训练效果会更好,使用make_imagenet_mean.sh计算图像均值,产生imagenet_mean.binaryproto文件。make_imagenet_mean.sh文件内容如下:
#!/usr/bin/env sh
# Compute the mean image from the imagenettraining lmdb
# N.B. this is available in data/ilsvrc12
EXAMPLE=/home/james/caffe/examples/mytask
DATA=/home/james/caffe/data/mydata/
TOOLS=/home/james/caffe/build/tools
$TOOLS/compute_image_mean$EXAMPLE/ilsvrc12_train_lmdb \
$DATA/imagenet_mean.binaryproto
echo "Done."
注:将其中的地址修改为自己的地址,并且产生的imagenet_mean.binaryproto文件在data/mydata文件夹下,稍后设置的时候注意该路径。
3. 设置训练参数
拷贝caffe-master/model/bvlc_reference_caffenet中的文件,修改train_val.prototxt,solver.prototxt中的运行参数,并进行路径的修改;拷贝caffe_master/examples/imagenet中的train_caffnet.sh文件,对路径进行修改。
train_val.prototxt是网络的结构,内容如下:
name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file:"/home/dina/caffe/examples/mytask/imagenet_mean.binaryproto"
}
# mean pixel / channel-wise mean instead ofmean image
# transform_param {
# crop_size: 227
# mean_value: 104
# mean_value: 117
# mean_value: 123
# mirror: true
# }
data_param {
source: "/home/dina/caffe/examples/mytask/ilsvrc12_train_lmdb"
batch_size: 256
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file:"/home/dina/caffe/examples/mytask/imagenet_mean.binaryproto"
}
# mean pixel / channel-wise mean instead ofmean image
# transform_param {
# crop_size: 227
# mean_value: 104
# mean_value: 117
# mean_value: 123
# mirror: false
# }
data_param {
source: "/home/dina/caffe/examples/mytask/ilsvrc12_val_lmdb"
batch_size: 50
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult:1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult:1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1000
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
solver.prototxt是网络参数的设置,内容如下:
net:"/home/dina/caffe/examples/mytask/train_val.prototxt"
test_iter: 2
test_interval: 50
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100
display: 20
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 500
snapshot_prefix:"models/bvlc_reference_caffenet/caffenet_train"
solver_mode: GPU
train_caffnet.sh是运行网络的命令,内容如下:
#!/usr/bin/env sh
./build/tools/caffe train \
--solver=./examples/mytask/solver.prototxt
好了,可以等待训练过程了,我们的训练图片是2000个训练图片,1000个验证图片,大约过了3-4个小时,就训练好了。
每一个不曾起舞的日子,都是对生命的辜负。
But it is the same with man as with the tree. The more he seeks to rise into the height and light, the more vigorously do his roots struggle earthward, downward, into the dark, the deep - into evil.
其实人跟树是一样的,越是向往高处的阳光,它的根就越要伸向黑暗的地底。----尼采