Download the Caltech101 dataset
https://www.vision.caltech.edu/Image
Generate the train.txt and val.txt files
Run the following Python script under the root/caffe/data directory:
```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import os

root = os.getcwd()             # current working directory
data = '101_ObjectCategories'  # folder name of the Caltech101 dataset
path = os.listdir(root + '/' + data)  # list all category folders under that path
path.sort()
vp = 0.1                       # first 10% of each category goes to the validation set
ftr = open('train.txt', 'w')
fva = open('val.txt', 'w')
i = 0
for line in path:
    subdir = root + '/' + data + '/' + line
    childpath = os.listdir(subdir)
    mid = int(vp * len(childpath))
    for child in childpath[:mid]:
        subpath = data + '/' + line + '/' + child
        fva.write(subpath + ' %s' % i + '\n')
    for child in childpath[mid:]:
        subpath = data + '/' + line + '/' + child
        ftr.write(subpath + ' %s' % i + '\n')
    i = i + 1
ftr.close()  # close the file streams
fva.close()
```
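As a quick sanity check, the split rule above can be exercised on a toy directory tree. The category and file names below are made up for illustration (real Caltech101 folders do look like `accordion/image_0001.jpg`):

```python
import os
import tempfile

# Build a toy layout that mimics 101_ObjectCategories:
# two categories with 10 images each (file names are made up).
root = tempfile.mkdtemp()
data = os.path.join(root, '101_ObjectCategories')
for cat in ('accordion', 'airplanes'):
    os.makedirs(os.path.join(data, cat))
    for k in range(10):
        open(os.path.join(data, cat, 'image_%04d.jpg' % k), 'w').close()

# Same split rule as the script: first 10% of each category -> val, rest -> train.
vp = 0.1
train, val = [], []
for i, cat in enumerate(sorted(os.listdir(data))):
    files = sorted(os.listdir(os.path.join(data, cat)))
    mid = int(vp * len(files))
    val += ['101_ObjectCategories/%s/%s %d' % (cat, f, i) for f in files[:mid]]
    train += ['101_ObjectCategories/%s/%s %d' % (cat, f, i) for f in files[mid:]]

print(len(train), len(val))   # 18 train lines, 2 val lines
print(val[0])                 # "101_ObjectCategories/accordion/image_0000.jpg 0"
```

Each output line is `relative/path.jpg label`, which is exactly the format `convert_imageset` expects later.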
Convert the images to LMDB format
Create a folder calt101net under caffe/examples, adapt create_imagenet.sh, and run ./examples/calt101net/create_imagenet.sh from the caffe directory:
```sh
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs

set -e

EXAMPLE=/home/yangjiacheng/caffe/examples/MyNet
DATA=/home/yangjiacheng/caffe/data/
TOOLS=/home/yangjiacheng/caffe/build/tools

TRAIN_DATA_ROOT=/home/yangjiacheng/caffe/data/
VAL_DATA_ROOT=/home/yangjiacheng/caffe/data/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/MyNet_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/MyNet_val_lmdb

echo "Done."
```
The main edits are the various paths; absolute paths are fine. Also make sure to change the RESIZE flag on line 16 to true, otherwise conversion fails with an "Incorrect data field size" error.
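Before running the conversion it can also save time to verify that every entry in train.txt actually resolves to a file under TRAIN_DATA_ROOT, since convert_imageset aborts on unreadable paths. A minimal checker sketch (the function name and usage paths are illustrative, not part of Caffe):

```python
import os

def check_listfile(listfile, data_root):
    """Return the list-file lines whose image path does not exist under data_root."""
    missing = []
    with open(listfile) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # each line is "<relative/path.jpg> <label>"
            relpath = line.rsplit(' ', 1)[0]
            if not os.path.isfile(os.path.join(data_root, relpath)):
                missing.append(line)
    return missing

# Usage (paths are illustrative):
# bad = check_listfile('train.txt', '/home/yangjiacheng/caffe/data/')
# print('%d missing entries' % len(bad))
```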
Adapt make_imagenet_mean.sh and run it to generate the mean file:
```sh
#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=/home/yangjiacheng/caffe/examples/MyNet
DATA=/home/yangjiacheng/caffe/examples/MyNet
TOOLS=/home/yangjiacheng/caffe/build/tools

$TOOLS/compute_image_mean $EXAMPLE/MyNet_train_lmdb \
  $DATA/mean.binaryproto

echo "Done."
```
The generated mean file is in binaryproto format; to use it from Python it must next be converted to npy format, with the following code:
```python
#!/usr/bin/env python
# coding:utf-8
# Convert mean.binaryproto into a mean.npy file that Python can use
import numpy as np
import caffe

root = '/home/yangjiacheng/caffe/examples/MyNet/'    # root directory
mean_proto_path = root + 'mean.binaryproto'          # path to mean.binaryproto
mean_npy_path = root + 'mean.npy'                    # path to mean.npy

blob = caffe.proto.caffe_pb2.BlobProto()             # create a protobuf blob
data = open(mean_proto_path, 'rb').read()            # read the contents of mean.binaryproto
blob.ParseFromString(data)                           # parse the contents into the blob
array = np.array(caffe.io.blobproto_to_array(blob))  # convert the blob's mean to numpy; shape is (mean_number, channel, height, width)
mean_npy = array[0]                                  # an array can hold several means, so pick one by index
np.save(mean_npy_path, mean_npy)                     # save
```
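Once mean.npy exists, preprocessing in Python is just a per-pixel subtraction of the saved (channel, height, width) array from the input image. A numpy-only sketch (the mean and pixel values below are made up; in practice the array comes from `np.load(mean_npy_path)`):

```python
import numpy as np

# Stand-in for np.load(mean_npy_path): shape (3, 256, 256),
# constant value made up for illustration.
mean = np.full((3, 256, 256), 120.0, dtype=np.float32)

# An input image in Caffe's (channel, height, width) layout.
img = np.full((3, 256, 256), 128.0, dtype=np.float32)

centered = img - mean   # the per-pixel mean subtraction done before the net sees the image
print(centered.mean())  # 8.0
```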
If `import caffe` fails with "no module named caffe", fix it like this:
vim ~/.bashrc
Append this line at the end:
export PYTHONPATH=~/caffe/python:$PYTHONPATH
Save, then run:
source ~/.bashrc
Fine-tuning on the dataset
I first trained directly with the original solver.prototxt, but the loss stayed fixed at 87.3356. Some searching showed this is an overflow: the softmax computation produces inf/nan values. The usual remedies are:
- Shrink the initial weights so that the softmax input features fall into a smaller range.
- Lower the learning rate, which narrows the range over which the weights fluctuate.
- If the network contains BN (batch normalization) layers, avoid freezing the BN parameters during fine-tuning; otherwise a mismatched data distribution easily drives the outputs very large (make sure `use_global_stats` in `batch_norm_param` is set to `false`).
- Check whether abnormal samples or labels in the data are corrupting data loading.
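The stuck value is itself diagnostic: it is essentially -log(FLT_MIN) ≈ 87.3365 for float32, the cross-entropy loss produced when a softmax probability underflows to the smallest positive normal float. This can be checked with numpy:

```python
import numpy as np

# Smallest positive normal float32. When a softmax output underflows to this
# value, the cross-entropy loss saturates at -log(tiny) and stops moving.
tiny = np.finfo(np.float32).tiny
print(-np.log(tiny))   # ~87.3365, the value the training loss gets stuck at
```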
Since I am fine-tuning from an existing SqueezeNet model, I cannot change the initial weights, so I tried lowering the initial learning rate instead, changing base_lr in solver.prototxt from 0.04 to 0.01. Training converged after roughly 200 iterations, with accuracy around 80% and top-5 accuracy around 95%.
I also tried switching lr_policy to "inv" with gamma set to 1 and power set to 1, i.e. decaying the learning rate along a convex reciprocal curve. The learning rate then dropped too fast and convergence became too slow, so I went back to the original "poly" decay.
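To see why "inv" with gamma = power = 1 decays so much faster than "poly", the two schedules can be compared directly. The formulas are the ones Caffe documents for its lr_policy options; base_lr matches the value above, while max_iter here is just an assumed illustration:

```python
# Caffe learning-rate policies:
#   inv : base_lr * (1 + gamma * iter) ** (-power)
#   poly: base_lr * (1 - iter / max_iter) ** power
base_lr, gamma, power, max_iter = 0.01, 1.0, 1.0, 10000  # max_iter is assumed

def lr_inv(it):
    return base_lr * (1.0 + gamma * it) ** (-power)

def lr_poly(it):
    return base_lr * (1.0 - float(it) / max_iter) ** power

for it in (0, 100, 1000):
    print(it, lr_inv(it), lr_poly(it))
# By iteration 100, "inv" has already divided the rate by 101,
# while "poly" has only reduced it by 1%.
```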
Testing accuracy
```sh
./build/tools/caffe.bin test -model=/home/yangjiacheng/caffe/examples/MyNet/SqueezeNet/SqueezeNet_v1.1/train_val.prototxt -weights=/home/yangjiacheng/caffe/examples/MyNet/SqueezeNet/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel -gpu=0
```
I measured accuracy at several checkpoints (200 iterations, 400 iterations, and so on); so far the 200-iteration snapshot gives the best results. Numbers to be posted later.
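For reference, the top-1 / top-5 numbers that caffe.bin test reports can be reproduced from raw class scores. A numpy sketch with made-up scores (the helper function is mine, not a Caffe API):

```python
import numpy as np

def topk_accuracy(scores, labels, k):
    """Fraction of rows whose true label is among the k highest scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]   # indices of the k largest scores per row
    hits = [label in row for row, label in zip(topk, labels)]
    return float(np.mean(hits))

# Made-up scores for 4 samples over 5 classes.
scores = np.array([[0.1, 0.5, 0.2, 0.1, 0.1],
                   [0.6, 0.1, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.1, 0.6, 0.1],
                   [0.3, 0.3, 0.2, 0.1, 0.1]])
labels = np.array([1, 0, 2, 4])

print(topk_accuracy(scores, labels, 1))   # 0.5 (only rows 0 and 1 are top-1 correct)
print(topk_accuracy(scores, labels, 5))   # 1.0 (k equals the number of classes)
```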