Problems encountered when training py-R-FCN + ResNet-101 on a custom dataset

The py-R-FCN source code can be downloaded from:

https://github.com/Orpine/py-R-FCN

Environment setup is omitted here.

1. Test the demo

Download the trained model (I keep a copy on my own cloud drive) and put it into data/rfcn_models.

Run:

./tools/demo_rfcn.py --net ResNet-101

2. Prepare the dataset

For training I chose to add some extra data on top of VOC2007:

2.1 First, renumber the images and annotations to be added: rename.py

import os

i = 16900  # starting index for the new numbering
for files in sorted(os.listdir("./2018/")):        # images
    new_name = "0" + str(i)                        # e.g. 016900
    os.rename(os.path.join("./2018/", files),
              os.path.join("./2018/", new_name + ".jpg"))
    # rename the matching XML annotation to the same stem
    for files1 in os.listdir("./20181/"):
        if files[:-4] == files1[:-4]:
            os.rename(os.path.join("./20181/", files1),
                      os.path.join("./20181/", new_name + ".xml"))
            break
    i = i + 1

Here i is the desired starting index. Note: after renaming, check that the image and XML file names are consecutive and correspond one-to-one, otherwise training will fail.
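A quick sanity check for that one-to-one correspondence can be scripted; this is a sketch, with the directory names taken from the rename script above:

```python
import os

def check_pairs(img_dir="./2018/", xml_dir="./20181/"):
    """Return (image stems missing an XML, XML stems missing an image)."""
    jpg = {f[:-4] for f in os.listdir(img_dir) if f.endswith(".jpg")}
    xml = {f[:-4] for f in os.listdir(xml_dir) if f.endswith(".xml")}
    return sorted(jpg - xml), sorted(xml - jpg)
```

Both returned lists should be empty before you start training.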

2.2 If the XML files mix uppercase and lowercase tags, convert the uppercase tags to lowercase.

Command format: find -name '<file pattern>' | xargs perl -pi -e 's|<old string>|<new string>|g'
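If you prefer to stay in Python, here is a sketch that lowercases every tag name in one annotation file (text content such as class names is left untouched; adapt as needed):

```python
import xml.etree.ElementTree as ET

def lowercase_tags(xml_path):
    """Rewrite xml_path with every element tag name lowercased."""
    tree = ET.parse(xml_path)
    for elem in tree.iter():   # walk all elements, root included
        elem.tag = elem.tag.lower()
    tree.write(xml_path)
```

Run it over os.listdir('Annotations') to process the whole directory.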

2.3 If you only need to detect a few classes, delete the unwanted object classes from the annotations.

import xml.etree.cElementTree as ET
import os

path_root = ['Annotations']

CLASSES = ["aeroplane", "bicycle", "cat", "car", "dog",
           "motorbike", "person", "horse", "train", "bus"]

for anno_path in path_root:
    xml_list = os.listdir(anno_path)
    for axml in xml_list:
        path_xml = os.path.join(anno_path, axml)
        tree = ET.parse(path_xml)
        root = tree.getroot()
        # drop every <object> whose class name is not in CLASSES
        for child in root.findall('object'):
            name = child.find('name').text
            if name not in CLASSES:
                root.remove(child)
        tree.write(path_xml)  # overwrite the annotation in place

The classes listed above are the ones I need. Be very careful here: back up the dataset first (as an archive), and preferably rename the dataset directory before deleting the unwanted classes, because this script overwrites the matching XML files in place and there is no undo. Finally, put all the XML files into Annotations under VOC2007.

2.4 Put all training images into the JPEGImages folder, then regenerate the four txt files in ImageSets/Main: trainval.txt (train + validation combined), train.txt (training set), val.txt (validation set), and test.txt (test set).

import os
import random

trainval_percent = 0.66   # fraction of all images used for train + val
train_percent = 0.5       # fraction of trainval used for train
xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('ImageSets/Main/trainval.txt', 'w')
ftest = open('ImageSets/Main/test.txt', 'w')
ftrain = open('ImageSets/Main/train.txt', 'w')
fval = open('ImageSets/Main/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

3. Modify the configuration files

3.1 Modify class-aware/train_ohem.prototxt

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
    bottom: "conv_new_1"
    top: "rfcn_cls"
    name: "rfcn_cls"
    type: "Convolution"
    convolution_param {
        num_output: 784 #cls_num*(score_maps_size^2)
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "conv_new_1"
    top: "rfcn_bbox"
    name: "rfcn_bbox"
    type: "Convolution"
    convolution_param {
        num_output: 3136 #4*cls_num*(score_maps_size^2)
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "rfcn_cls"
    bottom: "rois"
    top: "psroipooled_cls_rois"
    name: "psroipooled_cls_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 16  #cls_num
        group_size: 7
    }
}
layer {
    bottom: "rfcn_bbox"
    bottom: "rois"
    top: "psroipooled_loc_rois"
    name: "psroipooled_loc_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 64 #4*cls_num
        group_size: 7
    }
}
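All the magic numbers edited above derive from cls_num (object classes plus one background class, 16 in my case) and the 7×7 position-sensitive score map grid. A small sketch to compute them for your own class count:

```python
def rfcn_dims(cls_num, group_size=7):
    """Derive the prototxt constants from the class count.

    cls_num includes the background class, so 15 object classes -> 16.
    The score map grid is group_size x group_size (7x7 here).
    """
    k2 = group_size * group_size          # score_maps_size^2 = 49
    return {
        'rfcn_cls_num_output':  cls_num * k2,      # 784 when cls_num = 16
        'rfcn_bbox_num_output': 4 * cls_num * k2,  # 3136
        'cls_output_dim':       cls_num,           # 16
        'bbox_output_dim':      4 * cls_num,       # 64
    }
```

Every occurrence of 784, 3136, 16, and 64 in sections 3.1 through 3.5 follows this formula.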

3.2 Modify class-aware/test.prototxt

layer {
    bottom: "conv_new_1"
    top: "rfcn_cls"
    name: "rfcn_cls"
    type: "Convolution"
    convolution_param {
        num_output: 784 #cls_num*(score_maps_size^2)
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "conv_new_1"
    top: "rfcn_bbox"
    name: "rfcn_bbox"
    type: "Convolution"
    convolution_param {
        num_output: 3136 #4*cls_num*(score_maps_size^2)
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "rfcn_cls"
    bottom: "rois"
    top: "psroipooled_cls_rois"
    name: "psroipooled_cls_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 16  #cls_num
        group_size: 7
    }
}
layer {
    bottom: "rfcn_bbox"
    bottom: "rois"
    top: "psroipooled_loc_rois"
    name: "psroipooled_loc_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 64  #4*cls_num
        group_size: 7
    }
}
layer {
    name: "cls_prob_reshape"
    type: "Reshape"
    bottom: "cls_prob_pre"
    top: "cls_prob"
    reshape_param {
        shape {
            dim: -1
            dim: 16  #cls_num
        }
    }
}
layer {
    name: "bbox_pred_reshape"
    type: "Reshape"
    bottom: "bbox_pred_pre"
    top: "bbox_pred"
    reshape_param {
        shape {
            dim: -1
            dim: 64  #4*cls_num
        }
    }
}

3.3 Modify train_agnostic.prototxt

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16"  #cls_num
  }
}
layer {
    bottom: "conv_new_1"
    top: "rfcn_cls"
    name: "rfcn_cls"
    type: "Convolution"
    convolution_param {
        num_output: 784 #cls_num*(score_maps_size^2)   ###
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "rfcn_cls"
    bottom: "rois"
    top: "psroipooled_cls_rois"
    name: "psroipooled_cls_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 16 #cls_num   ###
        group_size: 7
    }
}

3.4 Modify train_agnostic_ohem.prototxt

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num ###
  }
}
layer {
    bottom: "conv_new_1"
    top: "rfcn_cls"
    name: "rfcn_cls"
    type: "Convolution"
    convolution_param {
        num_output: 784 #cls_num*(score_maps_size^2)   ###
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "rfcn_cls"
    bottom: "rois"
    top: "psroipooled_cls_rois"
    name: "psroipooled_cls_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 16 #cls_num   ###
        group_size: 7
    }
}

3.5 Modify test_agnostic.prototxt

layer {
    bottom: "conv_new_1"
    top: "rfcn_cls"
    name: "rfcn_cls"
    type: "Convolution"
    convolution_param {
        num_output: 784 #cls_num*(score_maps_size^2) ###
        kernel_size: 1
        pad: 0
        weight_filler {
            type: "gaussian"
            std: 0.01
        }
        bias_filler {
            type: "constant"
            value: 0
        }
    }
    param {
        lr_mult: 1.0
    }
    param {
        lr_mult: 2.0
    }
}
layer {
    bottom: "rfcn_cls"
    bottom: "rois"
    top: "psroipooled_cls_rois"
    name: "psroipooled_cls_rois"
    type: "PSROIPooling"
    psroi_pooling_param {
        spatial_scale: 0.0625
        output_dim: 16 #cls_num   ###
        group_size: 7
    }
}

3.6 Modify $RFCN/lib/datasets/pascal_voc.py

class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4'
                         )
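The length of self._classes (background included) must equal the cls_num / num_classes value used in every prototxt edit above, 16 in my case. A quick sanity check, with placeholder label names:

```python
# Placeholder labels; replace with your own. 15 object classes plus
# '__background__' gives the cls_num = 16 used throughout section 3.
CLASSES = ('__background__',) + tuple('label_%d' % i for i in range(1, 16))
NUM_CLASSES = len(CLASSES)
assert NUM_CLASSES == 16  # must match num_classes / output_dim in the prototxts
```

If this count and the prototxt constants disagree, training fails with shape-mismatch errors.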

4. Start training

./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-101 pascal_voc

A few uncommon pitfalls I hit during training:

1. After training finished, an error occurred during my own testing phase:

File "/home/nextcar/Py-rfcn/py-R-FCN/tools/../lib/datasets/voc_eval.py", line 20, in parse_rec
obj_struct['truncated'] = int(obj.find('truncated').text)
AttributeError: 'NoneType' object has no attribute 'text'

My fix was to comment out this line in that file, because some of the later-added images have no truncated tag.

2. int(bbox.find('ymin').text) raises: ValueError: invalid literal for int() with base 10: '45.70000076293945'

Fix: change the corresponding lines in /lib/datasets/voc_eval.py to:

obj_struct['bbox'] = [int(float(bbox.find('xmin').text)),
                      int(float(bbox.find('ymin').text)),
                      int(float(bbox.find('xmax').text)),
                      int(float(bbox.find('ymax').text))]

 

This is needed because part of the added data uses camera-calibrated bounding boxes whose coordinates are floats.

posted @ 2018-07-25 15:43 信阳毛毛虫