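First of all, this post owes a lot to the blog series at https://blog.csdn.net/gybheroin/article/details/72581318. Many thanks to its author.
Installing Caffe is covered in my earlier post, which works on my own setup: https://www.cnblogs.com/MY0213/p/9225310.html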
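1. Data source
The dataset used here is a face dataset, which you can download from Baidu Cloud:
Link: https://pan.baidu.com/s/156DiOuB46wKrM0cEaAgfMw Password: 1ap0
Unzipping train.zip gives the image data; the label files are val.txt and train.txt.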
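Before building the LMDB it can be worth sanity-checking the label files. The sketch below assumes each line of train.txt and val.txt has the form "relative/path label", which is the list format convert_imageset reads; the file names in the example are made up, and the post does not say which integer stands for "face".

```python
def parse_label_lines(lines):
    """Parse 'relative/path label' lines into (path, label) pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so paths containing spaces survive.
        path, label = line.rsplit(None, 1)
        samples.append((path, int(label)))
    return samples


def count_per_label(samples):
    """Count how many images carry each label."""
    counts = {}
    for _, label in samples:
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical lines, only to show the assumed format.
example_lines = [
    "0/img_0001.jpg 0",
    "1/img_0002.jpg 1",
    "",
    "1/img_0003.jpg 1",
]
samples = parse_label_lines(example_lines)
print(count_per_label(samples))
```

To check the real files, pass open("train.txt", encoding="utf-8") instead of example_lines. A heavily unbalanced count is worth knowing about before several days of training.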
2. Converting the images to an LMDB data source
SET GLOG_logtostderr=1
SET RESIZE_HEIGHT=227
SET RESIZE_WIDTH=227
"convert_imageset" --resize_height=227 --resize_width=227 --shuffle "train/" "train.txt" "mtraindb"
"convert_imageset" --resize_height=227 --resize_width=227 --shuffle "val/" "val.txt" "mvaldb"
pause
See face_lmdb.bat. This step resizes every image to the same size (227x227).
3. Computing the image mean
SET GLOG_logtostderr=1
"compute_image_mean" "mtraindb" "train_mean.binaryproto"
pause
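See mean_face.bat.
Subtracting the mean from the input before training may help training.
You can also use fixed mean values here instead (the usual numbers are easy to find with Baidu or Google), or skip this step entirely; according to Tang Yudi (唐宇迪), it makes little difference.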
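To make the "fixed mean" option concrete: compute_image_mean stores a full mean image, while a fixed mean is just one number per channel (in Caffe these go into transform_param as mean_value entries rather than mean_file). Here is a minimal pure-Python sketch of the per-channel variant; the nested-list image layout and the pixel values are my own illustration, not data from the face set.

```python
def channel_means(images):
    """Average each channel over every pixel of every image.

    Each image is a list of rows, each row a list of pixels,
    each pixel a tuple of channel values.
    """
    n_channels = len(images[0][0][0])
    totals = [0.0] * n_channels
    pixel_count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c, value in enumerate(pixel):
                    totals[c] += value
                pixel_count += 1
    return [total / pixel_count for total in totals]


def subtract_mean(image, means):
    """Return a copy of the image with the per-channel mean removed."""
    return [
        [tuple(value - mean for value, mean in zip(pixel, means)) for pixel in row]
        for row in image
    ]


# Two tiny 1x2 "images" with three channels each (illustrative values).
toy_images = [
    [[(10, 20, 30), (20, 30, 40)]],
    [[(30, 40, 50), (40, 50, 60)]],
]
means = channel_means(toy_images)
print(means)
print(subtract_mean(toy_images[0], means))
```

After subtraction the inputs are centred around zero, which is the whole point of the step.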
4. Training
SET GLOG_logtostderr=1
caffe train --solver=solver.prototxt
pause
See train.bat.
net: "train.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 1000
display: 50
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "model"
# uncomment the following to default to CPU mode solving
# solver_mode: CPU
See solver.prototxt.
For what each setting in solver.prototxt means, see
https://blog.csdn.net/qq_27923041/article/details/55211808
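As a quick illustration of the "step" policy in the solver above: Caffe computes the rate as base_lr * gamma ^ floor(iter / stepsize). The sketch below just evaluates that formula with the values from solver.prototxt; the function name is mine.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=1000):
    """Learning rate under the 'step' policy: base_lr * gamma ** floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


# Print the rate at the start of each 1000-iteration block up to max_iter.
for iteration in range(0, 10000, 1000):
    print(iteration, step_lr(iteration))
```

With stepsize 1000 and max_iter 10000 the rate drops by a factor of ten every 1000 iterations, so by the last block it is around 1e-12 and the weights barely move any more. If training stalls early, a larger stepsize is probably the first thing to try.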
############################# DATA Layer #############################
name: "face_train_val"
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mtraindb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    mirror: true
  }
  include: { phase: TRAIN }
}
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mvaldb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    mirror: true
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-expr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-expr"
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param {
    num_output: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8-expr"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-expr"
  bottom: "label"
  top: "loss"
}
See train.prototxt. It is simply AlexNet with the num_output of the last layer changed from 1000 to 2 (face / not face).
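To see how the 227x227 input shrinks on its way to fc6, the layer parameters from train.prototxt can be traced by hand. The sketch below uses the usual Caffe output-size rules (convolution rounds down, pooling rounds up); the helper names and the LAYERS table are mine, and ReLU/LRN layers are left out because they do not change the spatial size.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Convolution output size (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Pooling output size (rounds up, as Caffe's pooling layer does)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


# (name, kind, output channels, kernel, stride, pad), copied from train.prototxt.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]


def trace_shapes(input_size=227, channels=3):
    """Return (name, channels, height, width) after each listed layer."""
    shapes = []
    size = input_size
    for name, kind, out_channels, kernel, stride, pad in LAYERS:
        if kind == "conv":
            size = conv_out(size, kernel, stride, pad)
            channels = out_channels
        else:
            size = pool_out(size, kernel, stride, pad)
        shapes.append((name, channels, size, size))
    return shapes


for shape in trace_shapes():
    print(shape)

_, c, h, w = trace_shapes()[-1]
print("fc6 inputs:", c * h * w)
```

pool5 comes out as 256 x 6 x 6, so fc6 sees 9216 inputs per image. That 6x6 figure matters again when the fully connected layers are turned into convolutions for detection.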
Training takes about 5 days (I used a CPU). Alternatively, you can use the existing model alexnet_iter_50000_full_conv.caffemodel directly.
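5. Testing
Use run_face_detect_batch.py to test the face detection results.
6. Summary
This network is very slow at test time because it uses a sliding window approach. Later posts will cover faster methods, Faster R-CNN and FPN.
The sliding window step uses the technique from "Casting a Classifier into a Fully Convolutional Network", which can be applied to other networks as well.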
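The idea behind casting a classifier into a fully convolutional network is that a fully connected layer applied to every k x k window gives exactly the same numbers as a convolution whose kernel is the same weights reshaped to k x k. Here is a toy sketch with one channel and one output; the weights, bias, and image are made up, and the real network does the same thing with fc6 acting on 6x6 pool5 windows.

```python
def fc_score(weights, bias, patch):
    """Fully connected layer with a single output on a flattened k x k patch."""
    flat = [value for row in patch for value in row]
    return sum(w * x for w, x in zip(weights, flat)) + bias


def sliding_window_scores(weights, bias, image, k):
    """Crop every k x k window and run the FC layer on each crop separately."""
    height, width = len(image), len(image[0])
    scores = []
    for top in range(height - k + 1):
        row = []
        for left in range(width - k + 1):
            patch = [r[left:left + k] for r in image[top:top + k]]
            row.append(fc_score(weights, bias, patch))
        scores.append(row)
    return scores


def conv_scores(kernel, bias, image):
    """Apply the same weights as a k x k convolution kernel over the whole image."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[top + i][left + j]
                for i in range(k) for j in range(k)) + bias
            for left in range(width - k + 1)
        ]
        for top in range(height - k + 1)
    ]


k = 2
fc_weights = [1, -1, 2, 0]  # made-up FC weights for a flattened 2x2 patch
bias = 1
# "Net surgery": reshape the flat FC weights into a k x k kernel.
kernel = [fc_weights[i * k:(i + 1) * k] for i in range(k)]
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

window_map = sliding_window_scores(fc_weights, bias, image, k)
conv_map = conv_scores(kernel, bias, image)
print(window_map)
print(conv_map)
```

Both calls produce the same score map. In a real network the gain is that the convolutional layers run once over the whole image instead of once per crop, which is why the converted model can score many windows in a single forward pass.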
For the evolution of R-CNN, see https://www.cnblogs.com/MY0213/p/9460562.html
Comments and corrections are welcome.