Training Caffe on Your Own Dataset
This assumes Caffe has already been built, including pycaffe.
1 Data Preparation
First, prepare the training and validation datasets. Here there are two classes of data, placed in folders 0 and 1 (naming the class folders 0 and 1 is convenient for labeling: the folder name itself serves as the label). So the training set lives in /data/train/0 and /data/train/1, and the validation set in /data/val/0 and /data/val/1.
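With hypothetical image file names, the resulting layout looks like this:

data/
├── train/
│   ├── 0/        # class 0 images, e.g. 000001.jpg, 000002.jpg, ...
│   └── 1/        # class 1 images, e.g. 100001.jpg, 100002.jpg, ...
└── val/
    ├── 0/
    └── 1/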
Once the data is in place, create txt files that record each data file and its corresponding label.
(1) Create train.txt for the training set
import os

# Walk data/train/<class_folder>/ and write one line per image:
# /<class_folder>/<file> <label>  (the folder name doubles as the label)
f = open(r'train.txt', "w")
path = os.getcwd() + '/data/train/'
for filename in os.listdir(path):
    count = 0
    for file in os.listdir(path + filename):
        count = count + 1
        ff = '/' + filename + "/" + file + " " + filename + "\n"
        f.write(ff)
    print('{} class: {}'.format(filename, count))
f.close()
(2) Create val.txt for the validation set
import os

# Same as above, but for data/val/, writing val.txt.
f = open(r'val.txt', "w")
path = os.getcwd() + '/data/val/'
for filename in os.listdir(path):
    count = 0
    for file in os.listdir(path + filename):
        count = count + 1
        ff = '/' + filename + "/" + file + " " + filename + "\n"
        f.write(ff)
    print('{} class: {}'.format(filename, count))
f.close()
Note the format of each line in the txt files: /class_folder_name/filename, then a single space (it must not be a tab), then the label.
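For instance, with the hypothetical file names above, train.txt would contain lines like:

/0/000001.jpg 0
/0/000002.jpg 0
/1/100001.jpg 1
/1/100002.jpg 1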
2 Create the LMDB Data Files
Create createlmdb.sh, which uses convert_imageset (shipped with Caffe under build/tools) to build the LMDB files. The main things to get right are the locations of the image data and of the txt files generated in the previous step, plus the RESIZE settings, which must match what training and testing expect later; everything else is just a matter of paths.
#!/usr/bin/env sh

CAFFE_ROOT=/home/caf/object/caffe
TOOLS=$CAFFE_ROOT/build/tools
TRAIN_DATA_ROOT=/home/caf/wk/learn/data/train
VAL_DATA_ROOT=/home/caf/wk/learn/data/val
DATA=/home/caf/wk/learn/data
EXAMPLE=/home/caf/wk/learn/data/lmdb
# Set RESIZE=true to resize the images to 227 x 227. Leave as false if images
# have already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=227
  RESIZE_WIDTH=227
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in createlmdb.sh to the path" \
       "where the training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in createlmdb.sh to the path" \
       "where the validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/face_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/face_val_lmdb

echo "Done."
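Optionally, a dataset-specific mean can be computed from the training LMDB with Caffe's compute_image_mean tool. This step is not required here (the test script in section 5 uses the ImageNet mean that ships with Caffe), and the output name face_mean.binaryproto is just a placeholder chosen for this sketch:

#!/usr/bin/env sh
# Sketch: compute the per-pixel mean of the training LMDB.
CAFFE_ROOT=/home/caf/object/caffe
EXAMPLE=/home/caf/wk/learn/data/lmdb
$CAFFE_ROOT/build/tools/compute_image_mean \
    $EXAMPLE/face_train_lmdb \
    $EXAMPLE/face_mean.binaryproto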
3 Define the Network
Caffe takes its network model as a prototxt file, and the network-definition syntax is explained in detail in the Caffe documentation. This experiment uses AlexNet, saved as train_val.prototxt.
1 name: "AlexNet" 2 layer { 3 name: "data" 4 type: "Data" 5 top: "data" 6 top: "label" 7 include { 8 phase: TRAIN 9 } 10 data_param { 11 source: "/home/caf/wk/learn/data/lmdb/face_train_lmdb" 12 batch_size: 256 13 backend: LMDB 14 } 15 } 16 layer { 17 name: "data" 18 type: "Data" 19 top: "data" 20 top: "label" 21 include { 22 phase: TEST 23 } 24 data_param { 25 source: "/home/caf/wk/learn/data/lmdb/face_val_lmdb" 26 batch_size: 50 27 backend: LMDB 28 } 29 } 30 layer { 31 name: "conv1" 32 type: "Convolution" 33 bottom: "data" 34 top: "conv1" 35 param { 36 lr_mult: 1 37 decay_mult: 1 38 } 39 param { 40 lr_mult: 2 41 decay_mult: 0 42 } 43 convolution_param { 44 num_output: 96 45 kernel_size: 11 46 stride: 4 47 weight_filler { 48 type: "gaussian" 49 std: 0.01 50 } 51 bias_filler { 52 type: "constant" 53 value: 0 54 } 55 } 56 } 57 layer { 58 name: "relu1" 59 type: "ReLU" 60 bottom: "conv1" 61 top: "conv1" 62 } 63 layer { 64 name: "norm1" 65 type: "LRN" 66 bottom: "conv1" 67 top: "norm1" 68 lrn_param { 69 local_size: 5 70 alpha: 0.0001 71 beta: 0.75 72 } 73 } 74 layer { 75 name: "pool1" 76 type: "Pooling" 77 bottom: "norm1" 78 top: "pool1" 79 pooling_param { 80 pool: MAX 81 kernel_size: 3 82 stride: 2 83 } 84 } 85 layer { 86 name: "conv2" 87 type: "Convolution" 88 bottom: "pool1" 89 top: "conv2" 90 param { 91 lr_mult: 1 92 decay_mult: 1 93 } 94 param { 95 lr_mult: 2 96 decay_mult: 0 97 } 98 convolution_param { 99 num_output: 256 100 pad: 2 101 kernel_size: 5 102 group: 2 103 weight_filler { 104 type: "gaussian" 105 std: 0.01 106 } 107 bias_filler { 108 type: "constant" 109 value: 0.1 110 } 111 } 112 } 113 layer { 114 name: "relu2" 115 type: "ReLU" 116 bottom: "conv2" 117 top: "conv2" 118 } 119 layer { 120 name: "norm2" 121 type: "LRN" 122 bottom: "conv2" 123 top: "norm2" 124 lrn_param { 125 local_size: 5 126 alpha: 0.0001 127 beta: 0.75 128 } 129 } 130 layer { 131 name: "pool2" 132 type: "Pooling" 133 bottom: "norm2" 134 top: "pool2" 135 pooling_param { 136 pool: MAX 137 kernel_size: 3 138 stride: 2 139 } 140 } 141 layer { 142 name: "conv3" 143 type: "Convolution" 144 bottom: "pool2" 145 top: "conv3" 146 param { 147 lr_mult: 1 148 decay_mult: 1 149 } 150 param { 151 lr_mult: 2 152 decay_mult: 0 153 } 154 convolution_param { 155 num_output: 384 156 pad: 1 157 kernel_size: 3 158 weight_filler { 159 type: "gaussian" 160 std: 0.01 161 } 162 bias_filler { 163 type: "constant" 164 value: 0 165 } 166 } 167 } 168 layer { 169 name: "relu3" 170 type: "ReLU" 171 bottom: "conv3" 172 top: "conv3" 173 } 174 layer { 175 name: "conv4" 176 type: "Convolution" 177 bottom: "conv3" 178 top: "conv4" 179 param { 180 lr_mult: 1 181 decay_mult: 1 182 } 183 param { 184 lr_mult: 2 185 decay_mult: 0 186 } 187 convolution_param { 188 num_output: 384 189 pad: 1 190 kernel_size: 3 191 group: 2 192 weight_filler { 193 type: "gaussian" 194 std: 0.01 195 } 196 bias_filler { 197 type: "constant" 198 value: 0.1 199 } 200 } 201 } 202 layer { 203 name: "relu4" 204 type: "ReLU" 205 bottom: "conv4" 206 top: "conv4" 207 } 208 layer { 209 name: "conv5" 210 type: "Convolution" 211 bottom: "conv4" 212 top: "conv5" 213 param { 214 lr_mult: 1 215 decay_mult: 1 216 } 217 param { 218 lr_mult: 2 219 decay_mult: 0 220 } 221 convolution_param { 222 num_output: 256 223 pad: 1 224 kernel_size: 3 225 group: 2 226 weight_filler { 227 type: "gaussian" 228 std: 0.01 229 } 230 bias_filler { 231 type: "constant" 232 value: 0.1 233 } 234 } 235 } 236 layer { 237 name: "relu5" 238 type: "ReLU" 239 bottom: "conv5" 240 top: "conv5" 241 } 242 layer { 243 
name: "pool5" 244 type: "Pooling" 245 bottom: "conv5" 246 top: "pool5" 247 pooling_param { 248 pool: MAX 249 kernel_size: 3 250 stride: 2 251 } 252 } 253 layer { 254 name: "fc6" 255 type: "InnerProduct" 256 bottom: "pool5" 257 top: "fc6" 258 param { 259 lr_mult: 1 260 decay_mult: 1 261 } 262 param { 263 lr_mult: 2 264 decay_mult: 0 265 } 266 inner_product_param { 267 num_output: 4096 268 weight_filler { 269 type: "gaussian" 270 std: 0.005 271 } 272 bias_filler { 273 type: "constant" 274 value: 0.1 275 } 276 } 277 } 278 layer { 279 name: "relu6" 280 type: "ReLU" 281 bottom: "fc6" 282 top: "fc6" 283 } 284 layer { 285 name: "drop6" 286 type: "Dropout" 287 bottom: "fc6" 288 top: "fc6" 289 dropout_param { 290 dropout_ratio: 0.5 291 } 292 } 293 layer { 294 name: "fc7" 295 type: "InnerProduct" 296 bottom: "fc6" 297 top: "fc7" 298 param { 299 lr_mult: 1 300 decay_mult: 1 301 } 302 param { 303 lr_mult: 2 304 decay_mult: 0 305 } 306 inner_product_param { 307 num_output: 4096 308 weight_filler { 309 type: "gaussian" 310 std: 0.005 311 } 312 bias_filler { 313 type: "constant" 314 value: 0.1 315 } 316 } 317 } 318 layer { 319 name: "relu7" 320 type: "ReLU" 321 bottom: "fc7" 322 top: "fc7" 323 } 324 layer { 325 name: "drop7" 326 type: "Dropout" 327 bottom: "fc7" 328 top: "fc7" 329 dropout_param { 330 dropout_ratio: 0.5 331 } 332 } 333 layer { 334 name: "fc8" 335 type: "InnerProduct" 336 bottom: "fc7" 337 top: "fc8" 338 param { 339 lr_mult: 1 340 decay_mult: 1 341 } 342 param { 343 lr_mult: 2 344 decay_mult: 0 345 } 346 inner_product_param { 347 num_output: 2 348 weight_filler { 349 type: "gaussian" 350 std: 0.01 351 } 352 bias_filler { 353 type: "constant" 354 value: 0 355 } 356 } 357 } 358 layer { 359 name: "accuracy" 360 type: "Accuracy" 361 bottom: "fc8" 362 bottom: "label" 363 top: "accuracy" 364 include { 365 phase: TEST 366 } 367 } 368 layer { 369 name: "loss" 370 type: "SoftmaxWithLoss" 371 bottom: "fc8" 372 bottom: "label" 373 top: "loss" 374 } 375 layer { 376 name: "prob" 377 type: "Softmax" 378 bottom: "fc8" 379 top: "prob" 380 }
Create the solver file solver.prototxt, which defines the training hyperparameters: the number of iterations, how often to snapshot the model, the learning rate, and so on. net points to the network just defined; here training and testing use the same network file.
1 net: "train_val.prototxt" 2 test_iter: 2 3 test_interval: 10 4 base_lr: 0.001 5 lr_policy: "step" 6 gamma: 0.1 7 stepsize: 100 8 display: 20 9 max_iter: 100 10 momentum: 0.9 11 weight_decay: 0.005 12 solver_mode: GPU 13 snapshot: 20 14 snapshot_prefix: "model/"
4 Train the Model
Create train.sh and train on the GPU, otherwise training is far too slow!
#!/usr/bin/env sh
CAFFE_ROOT=/home/caf/object/caffe
SOLVER_ROOT=/home/caf/wk/learn
$CAFFE_ROOT/build/tools/caffe train --solver=$SOLVER_ROOT/solver.prototxt --gpu=0
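Alternatively, training can be driven from Python through pycaffe. A minimal sketch, assuming the same paths as above:

import sys
caffe_root = "/home/caf/object/caffe/"
sys.path.insert(0, caffe_root + 'python')
import caffe

# Same effect as train.sh: run the solver on GPU 0.
caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver('solver.prototxt')
solver.solve()  # snapshots are written to model/ as configured in the solver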
caffemodel files are written to the model folder (e.g. _iter_100.caffemodel); these are the files used later for image classification and other tasks.
5 Testing
Create deploy.prototxt for testing. It is essentially the same network as the training one, but a network used purely for classification no longer needs the training-only settings, so a separate model file is defined; test images are run through this network.
deploy.prototxt differs from train_val.prototxt in the following ways:
(1) The input no longer comes from LMDB and is no longer split into training and test sets. The input type is Input, and its declared dimensions must match those of the training data (227×227), otherwise an error is raised;
(2) weight_filler and bias_filler are removed; those parameters already exist in the caffemodel, which provides the initialization;
(3) The final Accuracy and loss layers are removed and replaced with a Softmax layer, which outputs the probability of each class.
1 name: "AlexNet" 2 layer { 3 name: "data" 4 type: "Input" 5 top: "data" 6 input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } } 7 } 8 layer { 9 name: "conv1" 10 type: "Convolution" 11 bottom: "data" 12 top: "conv1" 13 param { 14 lr_mult: 1 15 decay_mult: 1 16 } 17 param { 18 lr_mult: 2 19 decay_mult: 0 20 } 21 convolution_param { 22 num_output: 96 23 kernel_size: 11 24 stride: 4 25 } 26 } 27 layer { 28 name: "relu1" 29 type: "ReLU" 30 bottom: "conv1" 31 top: "conv1" 32 } 33 layer { 34 name: "norm1" 35 type: "LRN" 36 bottom: "conv1" 37 top: "norm1" 38 lrn_param { 39 local_size: 5 40 alpha: 0.0001 41 beta: 0.75 42 } 43 } 44 layer { 45 name: "pool1" 46 type: "Pooling" 47 bottom: "norm1" 48 top: "pool1" 49 pooling_param { 50 pool: MAX 51 kernel_size: 3 52 stride: 2 53 } 54 } 55 layer { 56 name: "conv2" 57 type: "Convolution" 58 bottom: "pool1" 59 top: "conv2" 60 param { 61 lr_mult: 1 62 decay_mult: 1 63 } 64 param { 65 lr_mult: 2 66 decay_mult: 0 67 } 68 convolution_param { 69 num_output: 256 70 pad: 2 71 kernel_size: 5 72 group: 2 73 } 74 } 75 layer { 76 name: "relu2" 77 type: "ReLU" 78 bottom: "conv2" 79 top: "conv2" 80 } 81 layer { 82 name: "norm2" 83 type: "LRN" 84 bottom: "conv2" 85 top: "norm2" 86 lrn_param { 87 local_size: 5 88 alpha: 0.0001 89 beta: 0.75 90 } 91 } 92 layer { 93 name: "pool2" 94 type: "Pooling" 95 bottom: "norm2" 96 top: "pool2" 97 pooling_param { 98 pool: MAX 99 kernel_size: 3 100 stride: 2 101 } 102 } 103 layer { 104 name: "conv3" 105 type: "Convolution" 106 bottom: "pool2" 107 top: "conv3" 108 param { 109 lr_mult: 1 110 decay_mult: 1 111 } 112 param { 113 lr_mult: 2 114 decay_mult: 0 115 } 116 convolution_param { 117 num_output: 384 118 pad: 1 119 kernel_size: 3 120 } 121 } 122 layer { 123 name: "relu3" 124 type: "ReLU" 125 bottom: "conv3" 126 top: "conv3" 127 } 128 layer { 129 name: "conv4" 130 type: "Convolution" 131 bottom: "conv3" 132 top: "conv4" 133 param { 134 lr_mult: 1 135 decay_mult: 1 136 } 137 param { 138 lr_mult: 2 139 decay_mult: 0 140 } 141 convolution_param { 142 num_output: 384 143 pad: 1 144 kernel_size: 3 145 group: 2 146 } 147 } 148 layer { 149 name: "relu4" 150 type: "ReLU" 151 bottom: "conv4" 152 top: "conv4" 153 } 154 layer { 155 name: "conv5" 156 type: "Convolution" 157 bottom: "conv4" 158 top: "conv5" 159 param { 160 lr_mult: 1 161 decay_mult: 1 162 } 163 param { 164 lr_mult: 2 165 decay_mult: 0 166 } 167 convolution_param { 168 num_output: 256 169 pad: 1 170 kernel_size: 3 171 group: 2 172 } 173 } 174 layer { 175 name: "relu5" 176 type: "ReLU" 177 bottom: "conv5" 178 top: "conv5" 179 } 180 layer { 181 name: "pool5" 182 type: "Pooling" 183 bottom: "conv5" 184 top: "pool5" 185 pooling_param { 186 pool: MAX 187 kernel_size: 3 188 stride: 2 189 } 190 } 191 layer { 192 name: "fc6" 193 type: "InnerProduct" 194 bottom: "pool5" 195 top: "fc6" 196 param { 197 lr_mult: 1 198 decay_mult: 1 199 } 200 param { 201 lr_mult: 2 202 decay_mult: 0 203 } 204 inner_product_param { 205 num_output: 4096 206 } 207 } 208 layer { 209 name: "relu6" 210 type: "ReLU" 211 bottom: "fc6" 212 top: "fc6" 213 } 214 layer { 215 name: "drop6" 216 type: "Dropout" 217 bottom: "fc6" 218 top: "fc6" 219 dropout_param { 220 dropout_ratio: 0.5 221 } 222 } 223 layer { 224 name: "fc7" 225 type: "InnerProduct" 226 bottom: "fc6" 227 top: "fc7" 228 param { 229 lr_mult: 1 230 decay_mult: 1 231 } 232 param { 233 lr_mult: 2 234 decay_mult: 0 235 } 236 inner_product_param { 237 num_output: 4096 238 } 239 } 240 layer { 241 name: "relu7" 242 type: "ReLU" 243 bottom: "fc7" 244 top: "fc7" 
245 } 246 layer { 247 name: "drop7" 248 type: "Dropout" 249 bottom: "fc7" 250 top: "fc7" 251 dropout_param { 252 dropout_ratio: 0.5 253 } 254 } 255 layer { 256 name: "fc8" 257 type: "InnerProduct" 258 bottom: "fc7" 259 top: "fc8" 260 param { 261 lr_mult: 1 262 decay_mult: 1 263 } 264 param { 265 lr_mult: 2 266 decay_mult: 0 267 } 268 inner_product_param { 269 num_output: 2 270 } 271 } 272 layer { 273 name: "prob" 274 type: "Softmax" 275 bottom: "fc8" 276 top: "prob" 277 }
Python code for testing, using Caffe's Python interface. The main things to set are the paths to your trained weights file, the model definition, and the mean file.
import numpy as np
import matplotlib.pyplot as plt

import sys
caffe_root = "/home/caf/object/caffe/"
sys.path.insert(0, caffe_root + 'python')
import caffe

caffe.set_device(0)
caffe.set_mode_gpu()

model_def = 'deploy.prototxt'
model_weights = 'model/_iter_100.caffemodel'
net = caffe.Net(model_def,      # network definition
                model_weights,  # trained weights
                caffe.TEST)     # test mode (no dropout)

# Per-channel (BGR) mean of the ImageNet training set that ships with Caffe.
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)
#print('mean-subtracted values:', zip('BGR', mu))

# Preprocessing: HxWxC [0,1] RGB -> CxHxW [0,255] BGR, mean-subtracted.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', mu)
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))

# Batch of one 3-channel 227x227 image.
net.blobs['data'].reshape(1, 3, 227, 227)

image = caffe.io.load_image('test.jpg')
transformed_image = transformer.preprocess('data', image)
#plt.imshow(image)
#plt.show()

net.blobs['data'].data[...] = transformed_image
output = net.forward()
output_prob = output['prob']
print(output_prob)
print('predicted class is:', output_prob.argmax())
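If you generated a dataset-specific mean with compute_image_mean (the optional sketch in section 2, where face_mean.binaryproto was a placeholder name), it can be converted into the per-channel mu expected by set_mean like this:

import sys
import numpy as np
caffe_root = "/home/caf/object/caffe/"
sys.path.insert(0, caffe_root + 'python')
import caffe
from caffe.proto import caffe_pb2

# Parse the binaryproto written by compute_image_mean.
blob = caffe_pb2.BlobProto()
with open('data/lmdb/face_mean.binaryproto', 'rb') as fp:
    blob.ParseFromString(fp.read())

# Convert to an array of shape (1, C, H, W), drop the batch axis,
# then average over height and width to get per-channel (BGR) means.
mean = caffe.io.blobproto_to_array(blob)[0]
mu = mean.mean(1).mean(1)
print(mu)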
Problems Encountered
(1) The label file must use spaces, not tabs, as separators; otherwise the data files will not be found.
(2) CUDA: an error mentioning cudaSuccess (a check failure, typically out of memory) means the GPU does not have enough free memory. Use nvidia-smi to see which process is occupying the GPU, then terminate it with kill -9 PID.
(3) Layer definitions differ across Caffe versions: there are both layer and layers. With layer, type must be a quoted string; with layers, type is unquoted and written in all capital letters.
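For example, the same layer in the two syntaxes (the all-caps enum names such as CONVOLUTION come from the legacy layer definitions):

# new-style definition: type is a quoted string
layer {
  name: "conv1"
  type: "Convolution"
}
# old-style definition: type is an unquoted, all-caps enum
layers {
  name: "conv1"
  type: CONVOLUTION
}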