Editing model parameters
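This section follows the notebook at http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb. An existing model can be modified to suit our needs and then fine-tuned on the data at hand. Here the fully-connected layers of the CaffeNet reference model (bvlc_reference_caffenet) are reimplemented as convolution layers. Convolution is translation-invariant, and neighbouring windows share the computation on their overlapping regions, so this small change is faster than running the classifier separately on every window. With the fully-connected layers expressed as convolutions, a 451*451 input yields an 8*8 classification map, i.e. 64 locations, each with a score for every one of the 1000 classes. This is equivalent to sliding a 227*227 window over the input with stride 32: output = (input - kernel_size) / stride + 1 = (451 - 227) / 32 + 1 = 8.
This change amounts to extracting dense features from the original image. Dense features are closely tied to the class, so features extracted from images of different classes should be well separable. Reusing a model trained for classification as a feature extractor is a really neat trick!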
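As a quick check of the output-size formula, the arithmetic can be sketched in a few lines of Python. The helper name conv_output_size is hypothetical (it is not a Caffe function) and it assumes no padding, as the formula does.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Number of window positions along one axis, assuming no padding."""
    return (input_size - kernel_size) // stride + 1


# A 451-pixel side with a 227-pixel window and stride 32.
side = conv_output_size(451, 227, 32)
locations = side * side
print(side, locations)
```

A 227*227 input gives a single position, which is the ordinary one-prediction-per-image case.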
1. Print the weight shapes of the "fc6", "fc7", and "fc8" layers
# Make sure that caffe is on the python path:
caffe_root = '../'  # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')

import caffe

# Load the original network and extract the fully-connected layers' parameters.
net = caffe.Net('../models/bvlc_reference_caffenet/deploy.prototxt',
                '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
params = ['fc6', 'fc7', 'fc8']
# fc_params = {name: (weights, biases)}
fc_params = {pr: (net.params[pr][0].data, net.params[pr][1].data) for pr in params}

for fc in params:
    print('{} weights are {} dimensional and biases are {} dimensional'.format(fc, fc_params[fc][0].shape, fc_params[fc][1].shape))

fc6 weights are (1, 1, 4096, 9216) dimensional and biases are (1, 1, 1, 4096) dimensional
fc7 weights are (1, 1, 4096, 4096) dimensional and biases are (1, 1, 1, 4096) dimensional
fc8 weights are (1, 1, 1000, 4096) dimensional and biases are (1, 1, 1, 1000) dimensional
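A fully-connected layer produces a single output vector per image rather than a spatial map: the first two dimensions of its weight blob are both 1, so the weights are effectively a 2-D (output, input) matrix.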
2. Print the weight shapes of "fc6-conv", "fc7-conv", and "fc8-conv", i.e. the fully-connected layers replaced by convolutions
net_full_conv = caffe.Net('imagenet/bvlc_caffenet_full_conv.prototxt',
                          '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
params_full_conv = ['fc6-conv', 'fc7-conv', 'fc8-conv']
# conv_params = {name: (weights, biases)}
conv_params = {pr: (net_full_conv.params[pr][0].data, net_full_conv.params[pr][1].data) for pr in params_full_conv}

for conv in params_full_conv:
    print('{} weights are {} dimensional and biases are {} dimensional'.format(conv, conv_params[conv][0].shape, conv_params[conv][1].shape))

fc6-conv weights are (4096, 256, 6, 6) dimensional and biases are (1, 1, 1, 4096) dimensional
fc7-conv weights are (4096, 4096, 1, 1) dimensional and biases are (1, 1, 1, 4096) dimensional
fc8-conv weights are (1000, 4096, 1, 1) dimensional and biases are (1, 1, 1, 1000) dimensional
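The shapes printed for the fully-connected and convolutional layers describe the same number of parameters, which is what makes a plain reshape possible. A small check, with the shapes copied by hand from the two outputs above (the helper num_elements is hypothetical):

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    """Total number of entries in an array of the given shape."""
    return reduce(mul, shape, 1)


# (fc name, fc weight shape, conv name, conv weight shape), copied from the printed output.
pairs = [
    ('fc6', (1, 1, 4096, 9216), 'fc6-conv', (4096, 256, 6, 6)),
    ('fc7', (1, 1, 4096, 4096), 'fc7-conv', (4096, 4096, 1, 1)),
    ('fc8', (1, 1, 1000, 4096), 'fc8-conv', (1000, 4096, 1, 1)),
]

counts = {}
for fc_name, fc_shape, conv_name, conv_shape in pairs:
    counts[fc_name] = (num_elements(fc_shape), num_elements(conv_shape))
    print('{} -> {}: {} vs {} weights'.format(fc_name, conv_name, *counts[fc_name]))
```

For fc6 the match comes from 9216 = 256 * 6 * 6: the flattened input of the fully-connected layer is the 256-channel, 6*6 feature map of the layer below.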
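3. Mapping the fully-connected parameters to convolution parameters
Convolution weights are laid out as [output, input, height, width], whereas the fully-connected weights are stored as an [output, input] matrix (the two leading dimensions are 1) whose input axis is the flattened channel * height * width of the layer below; for fc6, 9216 = 256 * 6 * 6. The fully-connected weights therefore have to be reshaped into the convolution layout. With this mapping, a 451*451 input produces an 8*8 output map with 1000 channels (one per class), which is equivalent to sliding a 227*227 window over the image with stride 32.
The biases are the same for both layer types, so they are copied over directly: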
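To make the sliding-window picture concrete, the sketch below maps a cell of the 8*8 output map back to the 227*227 window it corresponds to in the 451*451 input. The function name window_for_cell is hypothetical, the window size and stride are the values quoted in the text, and padding effects inside the network are ignored.

```python
WINDOW = 227  # window side quoted in the text
STRIDE = 32   # stride quoted in the text


def window_for_cell(row, col, window=WINDOW, stride=STRIDE):
    """Return (top, left, bottom, right) of the input window for one output cell."""
    top, left = row * stride, col * stride
    return (top, left, top + window, left + window)


first = window_for_cell(0, 0)
last = window_for_cell(7, 7)
print(first, last)
```

The last cell ends exactly at pixel 451, which is why a 451*451 input gives 8 positions per side.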
for pr, pr_conv in zip(params, params_full_conv):
    conv_params[pr_conv][1][...] = fc_params[pr][1]
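Mapping the weights: each fully-connected weight matrix is reshaped to the (output, input, height, width) shape of the matching convolution layer: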
for pr, pr_conv in zip(params, params_full_conv):
    out, in_, h, w = conv_params[pr_conv][0].shape
    W = fc_params[pr][0].reshape((out, in_, h, w))
    conv_params[pr_conv][0][...] = W
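The reshape works because a fully-connected layer is a convolution whose kernel covers its whole input. The following NumPy sketch checks this on tiny made-up dimensions without Caffe. All names and sizes here are illustrative assumptions, and the convolution is written as a naive cross-correlation (no kernel flip), which is how convolution layers in such networks are normally computed.

```python
import numpy as np

rng = np.random.RandomState(0)
channels, height, width, outputs = 3, 2, 2, 4  # toy sizes, not CaffeNet's

# Fully-connected parameters: one row of weights per output unit.
fc_weights = rng.randn(outputs, channels * height * width)
fc_bias = rng.randn(outputs)

# The same numbers viewed in convolution layout (output, input, height, width).
conv_weights = fc_weights.reshape(outputs, channels, height, width)


def fc_forward(patch):
    """Apply the fully-connected layer to one (channels, height, width) patch."""
    return fc_weights.dot(patch.ravel()) + fc_bias


def conv_forward(image, stride=1):
    """Naive unpadded cross-correlation, returning (outputs, rows, cols)."""
    rows = (image.shape[1] - height) // stride + 1
    cols = (image.shape[2] - width) // stride + 1
    result = np.empty((outputs, rows, cols))
    for r in range(rows):
        for c in range(cols):
            top, left = r * stride, c * stride
            patch = image[:, top:top + height, left:left + width]
            result[:, r, c] = np.tensordot(conv_weights, patch, axes=3) + fc_bias
    return result


# An input larger than the kernel yields a dense map of fully-connected outputs.
image = rng.randn(channels, 4, 4)
dense = conv_forward(image)
print(dense.shape)
```

Every location of dense holds exactly what the fully-connected layer would output for the patch under that window, which is the property the weight copy above relies on.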
4. Save the model
net_full_conv.save('imagenet/bvlc_caffenet_full_conv.caffemodel')
5. Classify a sample with the new model
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

caffe.set_phase_test()

# load input and configure preprocessing
im = caffe.io.load_image('images/cat.jpg')
net_full_conv.set_mean('data', np.load('../python/caffe/imagenet/ilsvrc_2012_mean.npy'))
net_full_conv.set_channel_swap('data', (2,1,0))
net_full_conv.set_raw_scale('data', 255.0)

# make classification map by forward and print prediction indices at each location
out = net_full_conv.forward_all(data=np.asarray([net_full_conv.preprocess('data', im)]))
print(out['prob'][0].argmax(axis=0))

# show net input and confidence map (probability of the top prediction at each location)
plt.subplot(1, 2, 1)
plt.imshow(net_full_conv.deprocess('data', net_full_conv.blobs['data'].data[0]))
plt.subplot(1, 2, 2)
plt.imshow(out['prob'][0].max(axis=0))
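The two reductions used in this step, argmax and max over the class axis, can be tried on a toy array without Caffe. The array below is made-up data with 3 classes on a 2*2 grid, standing in for out['prob'][0]; the names class_map and confidence are illustrative.

```python
import numpy as np

# Made-up probabilities with shape (classes, rows, cols); each location sums to 1.
prob = np.array([
    [[0.7, 0.2], [0.1, 0.3]],
    [[0.2, 0.5], [0.1, 0.3]],
    [[0.1, 0.3], [0.8, 0.4]],
])

class_map = prob.argmax(axis=0)  # index of the top class at each location
confidence = prob.max(axis=0)    # probability of that top class

print(class_map)
print(confidence)
```

In the real network the class axis has 1000 entries and the grid is 8*8, but the reductions are the same.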
6. Take the argmax over the class axis to get the classification map
Using the output out computed in step 5, the argmax over the class axis gives the index of the top-scoring class at each of the 64 (8*8) locations:
print(out['prob'][0].argmax(axis=0))
[[282 282 281 281 281 281 277 282]
 [281 283 281 281 281 281 281 282]
 [283 283 283 283 283 283 281 282]
 [283 283 283 281 283 283 283 259]
 [283 283 283 283 283 283 283 259]
 [283 283 283 283 283 283 259 259]
 [283 283 283 283 259 259 259 277]
 [335 335 283 283 263 263 263 277]]
The predictions include several kinds of cat (282 = tiger cat, 281 = tabby, 283 = persian) as well as foxes and other mammals.
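One simple way to summarise the map is to count how often each class index wins. This is only an illustrative post-processing step and not part of the original notebook; the array is copied by hand from the printed output above.

```python
from collections import Counter

import numpy as np

# Class indices at each of the 8*8 locations, copied from the printed map.
class_map = np.array([
    [282, 282, 281, 281, 281, 281, 277, 282],
    [281, 283, 281, 281, 281, 281, 281, 282],
    [283, 283, 283, 283, 283, 283, 281, 282],
    [283, 283, 283, 281, 283, 283, 283, 259],
    [283, 283, 283, 283, 283, 283, 283, 259],
    [283, 283, 283, 283, 283, 283, 259, 259],
    [283, 283, 283, 283, 259, 259, 259, 277],
    [335, 335, 283, 283, 263, 263, 263, 277],
])

# Count how many locations voted for each class index.
votes = Counter(class_map.ravel().tolist())
top_class, top_count = votes.most_common(1)[0]
print(top_class, top_count)
```

The three cat classes (281, 282, 283) together account for most of the 64 locations, which matches the reading of the map given above.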
In summary, the steps above amount to extracting dense features from the original image, and these features should separate images of different classes well.