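Object Detection in Foggy Conditions
This semester I took a course called Computational Intelligence, and one assignment is object detection in foggy conditions. A Baidu search turned up hardly any relevant blog posts, so I am writing down how I did the assignment.
I don't have a usable GPU of my own, and Google Colab already ships with many of the frameworks deep learning needs, such as PyTorch and TensorFlow, so I ran the model directly on Colab. There are plenty of tutorials on how to use Colab, so I won't repeat them. This post focuses on getting the model to run on Colab with mmdetection.
The dataset is RTTS, which is in VOC format. mmdetection can use it directly after a few code changes. Below is what I did on Colab. Code is Python by default; lines starting with ! are Linux shell commands and lines starting with % are notebook magics such as %cd. First, let's see which GPU we got for free.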
!nvidia-smi
Output:
Fri Dec 20 00:57:04 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... Off | 00000000:00:04.0 Off | 0 |
| N/A 34C P0 26W / 250W | 0MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
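The memory figure in this table matters later, when choosing the batch size. As a minimal sketch, the used and total memory can be pulled out of the nvidia-smi text with a regular expression. The function name parse_gpu_memory is made up for this example, and the sample line is copied from the output above.

```python
import re


def parse_gpu_memory(smi_text):
    """Return a list of (used_mib, total_mib) pairs found in nvidia-smi text."""
    # Matches fields such as "0MiB / 16280MiB"; the power field "26W / 250W"
    # does not match because it lacks the MiB suffix.
    pattern = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")
    return [(int(used), int(total)) for used, total in pattern.findall(smi_text)]


# Sample line taken from the nvidia-smi output shown above.
sample = "| N/A 34C P0 26W / 250W | 0MiB / 16280MiB | 0% Default |"
print(parse_gpu_memory(sample))
```

On Colab the text could come from running nvidia-smi through the subprocess module instead of a pasted string.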
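I keep the dataset in Google Drive, so Drive has to be mounted first. Colab also supports uploading files directly, so you can do that instead if you don't use Google Drive.
# Mount Google Drive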
from google.colab import drive
drive.mount('/content/drive')
Clone mmdetection from GitHub:
!git clone https://github.com/open-mmlab/mmdetection.git
Next, install mmdetection:
%cd /content/mmdetection/
!pip install mmcv
!python setup.py develop
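Once the install finishes, mmdetection is ready to use. I have to say Colab is really nice: setting up the environment on my own machine takes a lot of time, while on Colab one install is enough. Next, copy the dataset over from Google Drive and unzip it.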
%cd /content/mmdetection/
!mkdir data
%cd data
# Copy the dataset over from Google Drive
!cp '/content/drive/My Drive/VOC2007/RTTS_.zip' RTTS_.zip
# Unzip
!unzip RTTS_.zip
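Before touching the config, it may be worth checking that the unpacked data has the layout the rest of this post relies on. The sketch below assumes the archive unpacks to an RTTS/ folder containing ImageSets/Main, Annotations and JPEGImages (the paths used in the config and in xml_style.py further down) and that the images are .png. The function name check_voc_layout is made up for this example.

```python
import os


def check_voc_layout(root, split="train", img_ext=".png"):
    """Return (ids, missing): all ids listed in ImageSets/Main/<split>.txt,
    and the ids that lack either an image or an annotation file."""
    split_file = os.path.join(root, "ImageSets", "Main", split + ".txt")
    with open(split_file) as f:
        ids = [line.strip() for line in f if line.strip()]
    missing = []
    for img_id in ids:
        img_path = os.path.join(root, "JPEGImages", img_id + img_ext)
        xml_path = os.path.join(root, "Annotations", img_id + ".xml")
        if not (os.path.isfile(img_path) and os.path.isfile(xml_path)):
            missing.append(img_id)
    return ids, missing


# Example call on Colab (path assumed from the steps above):
# ids, missing = check_voc_layout('/content/mmdetection/data/RTTS', 'train')
# print(len(ids), 'ids,', len(missing), 'incomplete')
```

An empty missing list for both train and val suggests the paths and the file extension line up.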
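The dataset is ready. Next, some code has to be changed so that mmdetection can train on it. The model is Faster R-CNN with a ResNet-101 backbone, so the config file to edit is ./configs/faster_rcnn_r101_fpn_1x.py. mmdetection's default dataset is COCO, so first change the dataset type and path: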
dataset_type = 'VOCDataset'
data_root = '/content/mmdetection/data/'
Then change the paths of the training and validation sets:
data = dict(
    imgs_per_gpu=5,
    workers_per_gpu=5,
    train=dict(
        type=dataset_type,
        # training set
        ann_file=data_root + 'RTTS/ImageSets/Main/train.txt',
        img_prefix=data_root + 'RTTS/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        # validation set
        ann_file=data_root + 'RTTS/ImageSets/Main/val.txt',
        img_prefix=data_root + 'RTTS/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        # test set (unused here, so it simply points at val.txt)
        ann_file=data_root + 'RTTS/ImageSets/Main/val.txt',
        img_prefix=data_root + 'RTTS/',
        pipeline=test_pipeline))
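Because training is run with the --validate flag, the validation set is evaluated during training and effectively serves as the test set, so the test entry is never used and it does not matter what it points to. The GPU Colab gave me has 16 GB of memory, which makes running out of memory unlikely, so I set imgs_per_gpu and workers_per_gpu to 5 (imgs_per_gpu is the batch size per GPU and is what drives memory use; workers_per_gpu is the number of data-loading processes). This depends on which GPU Colab assigns you each time; if it has little memory, I don't recommend raising this.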
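A side note on the learning rate, which this post leaves at the config default of 0.02: mmdetection's documentation says that default is meant for 8 GPUs with 2 images each (batch size 16) and suggests scaling it in proportion to the batch size. The sketch below only illustrates that linear scaling rule. The function name scaled_lr is made up, and the run logged later in this post did not use a scaled value.

```python
def scaled_lr(imgs_per_gpu, num_gpus=1, base_lr=0.02, base_batch=16):
    """Linear scaling rule: learning rate proportional to total batch size.

    base_lr and base_batch are the defaults described in mmdetection's docs
    (0.02 for 8 GPUs x 2 images per GPU).
    """
    total_batch = imgs_per_gpu * num_gpus
    return base_lr * total_batch / base_batch


# One GPU with imgs_per_gpu=5, as configured above.
print(scaled_lr(5))
```

If the loss diverges with the default value, a scaled learning rate like this is one thing that could be tried in optimizer = dict(..., lr=...).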
Then change the logging interval to 100; printing every 50 iterations is too frequent.
log_config = dict(
    interval=100,
    hooks=[
        dict(type='TextLoggerHook'),
    ])
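Finally, change the number of epochs, the number of classes and the work directory. In the mmdetection version used here, num_classes counts the background, so the 5 object classes give 6: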
num_classes=6
total_epochs = 20
work_dir = './work_dirs/faster_rcnn_r101_fpn_1x/hzdtc'
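As a small worked example of where the 6 comes from, the value can be derived from the class tuple instead of being typed by hand. This assumes the mmdetection 1.x convention in which num_classes includes one extra slot for background; the class names are the five RTTS classes used in voc.py later in this post.

```python
# The five RTTS object classes (same tuple as in voc.py below).
CLASSES = ('bicycle', 'bus', 'car', 'motorbike', 'person')

# mmdetection 1.x convention (assumed here): one extra class for background.
num_classes = len(CLASSES) + 1
print(num_classes)
```

If the class list ever changes, recomputing the value this way keeps the head size and the dataset in step.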
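That completes the training config. Our dataset still differs from standard VOC2007 in a few ways, though, so some source files need changes too.
- Edit /mmdetection/mmdet/datasets/voc.py: change CLASSES and hard-code year. Leaving year alone raises an error, because the original code infers the year from whether img_prefix contains 'VOC2007' or 'VOC2012', and the prefix used here contains neither. (I rearranged the files inside the dataset, so check how your own copy is laid out.)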
class VOCDataset(XMLDataset):

    CLASSES = ('bicycle', 'bus', 'car', 'motorbike', 'person')

    def __init__(self, **kwargs):
        super(VOCDataset, self).__init__(**kwargs)
        self.year = 2007
        # if 'VOC2007' in self.img_prefix:
        #     self.year = 2007
        # elif 'VOC2012' in self.img_prefix:
        #     self.year = 2012
        # else:
        #     raise ValueError('Cannot infer dataset year from img_prefix')
- Edit /mmdetection/mmdet/core/evaluation/class_names.py so that the class names used in evaluation match:
def voc_classes():
    return [
        'bicycle', 'bus', 'car', 'motorbike', 'person'
    ]
- Edit /mmdetection/mmdet/datasets/xml_style.py: the images in this dataset are .png, while standard VOC uses .jpg. Without this change the images cannot be read.
    def load_annotations(self, ann_file):
        img_infos = []
        img_ids = mmcv.list_from_file(ann_file)
        for img_id in img_ids:
            # Changed from .jpg to .png here
            filename = 'JPEGImages/{}.png'.format(img_id)
            xml_path = osp.join(self.img_prefix, 'Annotations',
                                '{}.xml'.format(img_id))
            tree = ET.parse(xml_path)
            root = tree.getroot()
            size = root.find('size')
            width = int(size.find('width').text)
            height = int(size.find('height').text)
            img_infos.append(
                dict(id=img_id, filename=filename, width=width, height=height))
        return img_infos
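Since CLASSES was edited by hand, it can help to confirm that the annotation files really contain only those five names. The sketch below counts the name field of every object element in a folder of VOC-style XML files. The function name count_object_names is made up for this example, and the folder path in the usage comment is assumed from the config above.

```python
import glob
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_object_names(ann_dir):
    """Count <object><name> values across all XML files in ann_dir."""
    counts = Counter()
    for xml_path in glob.glob(os.path.join(ann_dir, "*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.findall("object"):
            name = obj.find("name")
            if name is not None and name.text:
                counts[name.text.strip()] += 1
    return counts


# Example call on Colab (path assumed from the config above):
# counts = count_object_names('/content/mmdetection/data/RTTS/Annotations')
# classes = ('bicycle', 'bus', 'car', 'motorbike', 'person')
# print(counts)
# print('names not in CLASSES:', sorted(set(counts) - set(classes)))
```

Any name reported as not in CLASSES is worth resolving before training starts.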
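With these changes in place, training can start. Even with a single GPU I recommend the distributed launcher, because in the mmdetection version used here only distributed training supports the --validate flag, which reports the model's mAP after every epoch.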
%cd /content/mmdetection
!CUDA_VISIBLE_DEVICES=0 ./tools/dist_train.sh configs/faster_rcnn_r101_fpn_1x.py 1 --validate
When training starts, it first prints the configuration from faster_rcnn_r101_fpn_1x.py, and then prints the mAP after each epoch. It looks like this:
2019-12-19 05:01:42,257 - INFO - load model from: torchvision://resnet101
2019-12-19 05:01:42,782 - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
2019-12-19 05:01:50,571 - INFO - Start running, host: root@ad882785deec, work_dir: /content/mmdetection/work_dirs/faster_rcnn_r101_fpn_1x/hzdtc
2019-12-19 05:01:50,571 - INFO - workflow: [('train', 1)], max: 20 epochs
2019-12-19 05:04:06,393 - INFO - Epoch [1][100/779] lr: 0.00931, eta: 5:50:24, time: 1.358, data_time: 0.035, memory: 13119, loss_rpn_cls: 0.1735, loss_rpn_bbox: 0.0350, loss_cls: 0.3752, acc: 90.8566, loss_bbox: 0.1861, loss: 0.7698
2019-12-19 05:06:19,575 - INFO - Epoch [1][200/779] lr: 0.01197, eta: 5:44:45, time: 1.332, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0913, loss_rpn_bbox: 0.0369, loss_cls: 0.3376, acc: 89.8496, loss_bbox: 0.2308, loss: 0.6967
2019-12-19 05:08:32,287 - INFO - Epoch [1][300/779] lr: 0.01464, eta: 5:41:00, time: 1.327, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0504, loss_rpn_bbox: 0.0313, loss_cls: 0.3012, acc: 89.9437, loss_bbox: 0.2278, loss: 0.6106
2019-12-19 05:10:44,460 - INFO - Epoch [1][400/779] lr: 0.01731, eta: 5:37:40, time: 1.322, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0474, loss_rpn_bbox: 0.0313, loss_cls: 0.2860, acc: 90.3688, loss_bbox: 0.2042, loss: 0.5689
2019-12-19 05:12:56,712 - INFO - Epoch [1][500/779] lr: 0.01997, eta: 5:34:50, time: 1.323, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0533, loss_rpn_bbox: 0.0311, loss_cls: 0.2851, acc: 90.4473, loss_bbox: 0.1882, loss: 0.5577
2019-12-19 05:15:09,968 - INFO - Epoch [1][600/779] lr: 0.02000, eta: 5:32:38, time: 1.333, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0441, loss_rpn_bbox: 0.0287, loss_cls: 0.2779, acc: 90.5734, loss_bbox: 0.1895, loss: 0.5403
2019-12-19 05:17:22,536 - INFO - Epoch [1][700/779] lr: 0.02000, eta: 5:30:10, time: 1.326, data_time: 0.014, memory: 13119, loss_rpn_cls: 0.0372, loss_rpn_bbox: 0.0275, loss_cls: 0.2568, acc: 91.1914, loss_bbox: 0.1738, loss: 0.4953
terminal width is too small (0), please consider widen the terminal for better progressbar visualization
[>>>>>>>>>>] 433/433, 6.8 task/s, elapsed: 64s, ETA: 0s
+-----------+------+-------+--------+-----------+-------+
| class | gts | dets | recall | precision | ap |
+-----------+------+-------+--------+-----------+-------+
| bicycle | 52 | 1109 | 0.673 | 0.032 | 0.222 |
| bus | 175 | 3312 | 0.731 | 0.039 | 0.249 |
| car | 1820 | 12465 | 0.902 | 0.136 | 0.755 |
| motorbike | 101 | 2383 | 0.901 | 0.039 | 0.463 |
| person | 853 | 10286 | 0.884 | 0.075 | 0.617 |
+-----------+------+-------+--------+-----------+-------+
| mAP | | | | | 0.461 |
+-----------+------+-------+--------+-----------+-------+
2019-12-19 05:20:13,279 - INFO - Epoch [1][779/779] lr: 0.02000, mAP: 0.4612
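The mAP on the last line is the plain average of the per-class AP column in the table. A quick check with the numbers copied from the table above:

```python
# Per-class AP after epoch 1, copied from the evaluation table above.
ap_per_class = {
    "bicycle": 0.222,
    "bus": 0.249,
    "car": 0.755,
    "motorbike": 0.463,
    "person": 0.617,
}

# mAP is the unweighted mean over classes, regardless of how many
# ground-truth boxes each class has.
mean_ap = sum(ap_per_class.values()) / len(ap_per_class)
print(round(mean_ap, 4))
```

The result agrees with the logged mAP of 0.4612 to within the rounding of the table. Because the mean is unweighted, the weak bicycle and bus scores (52 and 175 ground-truth boxes) pull the mAP down as much as the car score (1820 boxes) pulls it up.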
Once training finishes, the built-in log analysis tool can visualize the training process. For this experiment I only look at how mAP and loss change:
%cd /content/mmdetection
!python tools/analyze_logs.py plot_curve ./work_dirs/faster_rcnn_r101_fpn_1x/hzdtc/20191219_050150.log.json --keys mAP --legend mAP --out mAP.jpg
!python tools/analyze_logs.py plot_curve ./work_dirs/faster_rcnn_r101_fpn_1x/hzdtc/20191219_050150.log.json --keys loss --legend loss --out loss.jpg
Output:
/content/mmdetection
plot curve of ./work_dirs/faster_rcnn_r101_fpn_1x/hzdtc/20191219_050150.log.json, metric is mAP
save curve to: mAP.jpg
plot curve of ./work_dirs/faster_rcnn_r101_fpn_1x/hzdtc/20191219_050150.log.json, metric is loss
save curve to: loss.jpg
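The same numbers can also be read straight from the .log.json file without plotting. The sketch below assumes the file holds one JSON object per line and that training and validation records carry an epoch field, which is what tools/analyze_logs.py appears to rely on. The key names mAP and loss are the ones passed to --keys above; the function name read_metric is made up for this example.

```python
import json


def read_metric(log_json_path, key="mAP"):
    """Return [(epoch, value)] for every JSON line that contains `key`."""
    points = []
    with open(log_json_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Skip records without an epoch (for example a header line).
            if "epoch" in record and key in record:
                points.append((record["epoch"], record[key]))
    return points


# Example call on Colab (log path taken from the commands above):
# log = './work_dirs/faster_rcnn_r101_fpn_1x/hzdtc/20191219_050150.log.json'
# for epoch, value in read_metric(log, 'mAP'):
#     print(epoch, value)
```

For loss there is one record per logging interval, so each epoch appears several times in the result.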
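Colab has no graphical interface, so the plots cannot pop up in a window. My workaround is to save each plot as a .jpg file and then show it with PIL. There may be a better way, but I don't know it.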
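from IPython.display import display
from PIL import Image

mAP = Image.open('mAP.jpg')
loss = Image.open('loss.jpg')
# A bare expression only renders when it is the last line of a cell;
# display() shows both plots from a single cell.
display(mAP)
display(loss)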
I won't include the plots here; you will see them if you run this yourself. That's all. I hope it helps.