[Deep Learning Series] License Plate Recognition in Practice (Part 2)

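  In the previous post we covered the first part: how to generate simple license plate images. In this post we use PaddlePaddle to recognize the generated plates.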


 Reading the Data

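  When generating the plates in the previous post, we can produce the training data and the test data separately, as follows (the full code is here):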

import os
import cv2

# Write the generated plate images to a folder and the matching labels to label.txt
def genBatch(self, batchSize, pos, charRange, outputPath, size):
    if not os.path.exists(outputPath):
        os.mkdir(outputPath)
    # G is assumed to be the plate generator object from the previous post
    with open('label.txt', 'w') as outfile:
        for i in range(batchSize):
            plateStr, plate = G.genPlateString(-1, -1)
            print("%s %s" % (plateStr, plate))
            img = G.generate(plateStr)
            img = cv2.resize(img, size)
            # File names are the zero-padded sample index: 00.jpg, 01.jpg, ...
            cv2.imwrite(outputPath + "/" + str(i).zfill(2) + ".jpg", img)
            outfile.write(str(plate) + "\n")

  With the data generated, we write a reader to load it (reador.py):

# Return a reader that yields one (image vector, integer label) pair at a time
def reader_creator(data, label):
    def reader():
        for i in range(len(data)):
            yield data[i, :], int(label[i])
    return reader

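  To feed the model, we wrap the reader with paddle.reader.shuffle and paddle.batch, so that the data is shuffled and then passed to the model in batches: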

# Read the training data
train_reader = paddle.batch(paddle.reader.shuffle(
                reador.reader_creator(X_train, Y_train), buf_size=200),
                batch_size=16)

# Read the validation data
val_reader = paddle.batch(paddle.reader.shuffle(
                reador.reader_creator(X_val, Y_val), buf_size=200),
                batch_size=16)

trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
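To make the shuffle-then-batch pipeline concrete, here is a minimal pure-Python sketch of what the two wrappers do. It is an illustration only, not PaddlePaddle's implementation: the `shuffle` and `batch` functions below are stand-ins written for this example, the reader uses plain lists instead of arrays, and the toy data is made up.

```python
import random

def reader_creator(data, label):
    # Same idea as reador.reader_creator, using plain lists instead of arrays
    def reader():
        for i in range(len(data)):
            yield data[i], int(label[i])
    return reader

def shuffle(reader, buf_size, seed=None):
    # Stand-in for paddle.reader.shuffle: shuffle within a buffer of buf_size samples
    rng = random.Random(seed)
    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for s in buf:
                    yield s
                buf = []
        rng.shuffle(buf)
        for s in buf:
            yield s
    return shuffled_reader

def batch(reader, batch_size):
    # Stand-in for paddle.batch: group consecutive samples into lists
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # last, possibly smaller, batch
            yield current
    return batch_reader

# Toy data: 40 "images" of 4 pixels each, with labels 0..39
X_toy = [[float(i)] * 4 for i in range(40)]
Y_toy = list(range(40))

toy_reader = batch(shuffle(reader_creator(X_toy, Y_toy), buf_size=200, seed=0),
                   batch_size=16)
batch_sizes = [len(b) for b in toy_reader()]
seen_labels = sorted(lbl for b in toy_reader() for _, lbl in b)
print(batch_sizes)
```

In this sketch, a buf_size larger than the dataset (200 versus 40 samples) means the whole dataset is shuffled on every pass, and every sample still appears exactly once per pass.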

 


 Building the Network Model

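  Because we are training an end-to-end plate recognizer, the network starts with two convolution-pooling layers. These are followed by 7 fully connected layers trained in parallel, one for each of the 7 characters on the plate. Finally, the 7 outputs are concatenated and passed through Softmax together with the original label to compute the classification cost and produce the prediction.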

def get_network_cnn(self):
    # Declare the data and label inputs
    x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(self.data))
    y = paddle.layer.data(name='y', type=paddle.data_type.integer_value(self.label))
    # Convolution-pooling layer 1
    conv_pool_1 = paddle.networks.simple_img_conv_pool(
            input=x,
            filter_size=12,
            num_filters=50,
            num_channel=1,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
    drop_1 = paddle.layer.dropout(input=conv_pool_1, dropout_rate=0.5)
    # Convolution-pooling layer 2
    # num_channel must match num_filters of the previous convolution layer
    conv_pool_2 = paddle.networks.simple_img_conv_pool(
            input=drop_1,
            filter_size=5,
            num_filters=50,
            num_channel=50,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
    drop_2 = paddle.layer.dropout(input=conv_pool_2, dropout_rate=0.5)

    # Shared fully connected layer
    fc = paddle.layer.fc(input=drop_2, size=120)

    # Seven parallel fully connected heads, one per plate character (65 classes each)
    fc1_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc1 = paddle.layer.fc(input=fc1_drop, size=65, act=paddle.activation.Linear())

    fc2_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc2 = paddle.layer.fc(input=fc2_drop, size=65, act=paddle.activation.Linear())

    fc3_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc3 = paddle.layer.fc(input=fc3_drop, size=65, act=paddle.activation.Linear())

    fc4_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc4 = paddle.layer.fc(input=fc4_drop, size=65, act=paddle.activation.Linear())

    fc5_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc5 = paddle.layer.fc(input=fc5_drop, size=65, act=paddle.activation.Linear())

    fc6_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc6 = paddle.layer.fc(input=fc6_drop, size=65, act=paddle.activation.Linear())

    fc7_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc7 = paddle.layer.fc(input=fc7_drop, size=65, act=paddle.activation.Linear())

    # Concatenate the fully connected outputs for the 7 characters
    fc_concat = paddle.layer.concat(input=[fc1, fc2, fc3, fc4, fc5, fc6, fc7], axis=0)
    predict = paddle.layer.classification_cost(input=fc_concat, label=y, act=paddle.activation.Softmax())
    return predict
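The concatenated output has 7 × 65 = 455 values, one block of 65 scores per plate position. As a rough illustration of how such a flat vector could be turned back into a plate string, the sketch below takes the argmax within each block. It uses a tiny made-up alphabet and only 3 positions to stay readable; `CHARS` and `decode_plate` are names invented for this example and are not part of the original code.

```python
def decode_plate(scores, num_positions, num_classes, chars):
    """Split a flat score vector into per-position blocks and take the argmax of each."""
    assert len(scores) == num_positions * num_classes
    plate = []
    for pos in range(num_positions):
        block = scores[pos * num_classes:(pos + 1) * num_classes]
        best = max(range(num_classes), key=lambda k: block[k])
        plate.append(chars[best])
    return plate

# Hypothetical 4-character alphabet and a 3-position "plate" to keep the example small
CHARS = ["津", "K", "4", "2"]
scores = [
    0.9, 0.0, 0.1, 0.0,   # position 0: highest score at index 0
    0.1, 0.7, 0.1, 0.1,   # position 1: highest score at index 1
    0.0, 0.2, 0.3, 0.5,   # position 2: highest score at index 3
]
decoded = decode_plate(scores, num_positions=3, num_classes=4, chars=CHARS)
print(" ".join(decoded))
```

For the real network the same function would be called with num_positions=7 and num_classes=65, together with the full character table used when the plates were generated.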

 

 


 

 

Training the Model

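   Once the network is built, the remaining steps are the usual ones: initialization, defining the optimization method, defining the training parameters, defining the trainer, and so on. Plug in the data reading code from the first step and the model is ready to run.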

import paddle.v2 as paddle
import reador

with_gpu = False  # assumed flag, not defined in the original listing; set to True to train on GPU


class NeuralNetwork(object):
    def __init__(self, X_train, Y_train, X_val, Y_val):
        paddle.init(use_gpu=with_gpu, trainer_count=1)

        self.X_train = X_train
        self.Y_train = Y_train
        self.X_val = X_val
        self.Y_val = Y_val
        # Note: get_network_cnn also reads self.data (input dimension) and
        # self.label (number of classes); they are not set in this listing.

    def get_network_cnn(self):
        x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(self.data))
        y = paddle.layer.data(name='y', type=paddle.data_type.integer_value(self.label))
        conv_pool_1 = paddle.networks.simple_img_conv_pool(
            input=x,
            filter_size=12,
            num_filters=50,
            num_channel=1,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
        drop_1 = paddle.layer.dropout(input=conv_pool_1, dropout_rate=0.5)
        # num_channel must match num_filters of the previous convolution layer
        conv_pool_2 = paddle.networks.simple_img_conv_pool(
            input=drop_1,
            filter_size=5,
            num_filters=50,
            num_channel=50,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
        drop_2 = paddle.layer.dropout(input=conv_pool_2, dropout_rate=0.5)

        fc = paddle.layer.fc(input=drop_2, size=120)

        # Seven parallel heads, one per plate character
        fc1_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc1 = paddle.layer.fc(input=fc1_drop, size=65, act=paddle.activation.Linear())

        fc2_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc2 = paddle.layer.fc(input=fc2_drop, size=65, act=paddle.activation.Linear())

        fc3_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc3 = paddle.layer.fc(input=fc3_drop, size=65, act=paddle.activation.Linear())

        fc4_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc4 = paddle.layer.fc(input=fc4_drop, size=65, act=paddle.activation.Linear())

        fc5_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc5 = paddle.layer.fc(input=fc5_drop, size=65, act=paddle.activation.Linear())

        fc6_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc6 = paddle.layer.fc(input=fc6_drop, size=65, act=paddle.activation.Linear())

        fc7_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc7 = paddle.layer.fc(input=fc7_drop, size=65, act=paddle.activation.Linear())

        fc_concat = paddle.layer.concat(input=[fc1, fc2, fc3, fc4, fc5, fc6, fc7], axis=0)
        predict = paddle.layer.classification_cost(input=fc_concat, label=y, act=paddle.activation.Softmax())
        return predict

    # Define the trainer
    def get_trainer(self):
        cost = self.get_network_cnn()

        # Create the parameters; kept on self so the event handler can save them
        self.parameters = paddle.parameters.create(cost)

        optimizer = paddle.optimizer.Momentum(
            momentum=0.9,
            regularization=paddle.optimizer.L2Regularization(rate=0.0002 * 128),
            learning_rate=0.001,
            learning_rate_schedule="pass_manual")

        # Create the trainer
        trainer = paddle.trainer.SGD(
            cost=cost, parameters=self.parameters, update_equation=optimizer)
        return trainer

    # Start training
    def start_trainer(self, X_train, Y_train, X_val, Y_val):
        trainer = self.get_trainer()

        result_lists = []

        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 10 == 0:
                    print("\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics))
            if isinstance(event, paddle.event.EndPass):
                # Save the trained parameters after every pass
                with open('params_pass_%d.tar' % event.pass_id, 'wb') as f:
                    self.parameters.to_tar(f)
                # Evaluate on the validation set
                result = trainer.test(reader=val_reader)
                print("\nTest with Pass %d, %s" % (event.pass_id, result.metrics))

                result_lists.append((event.pass_id, result.cost,
                                     result.metrics['classification_error_evaluator']))

        # Build the readers and start training
        train_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_train, Y_train), buf_size=200),
            batch_size=16)

        val_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_val, Y_val), buf_size=200),
            batch_size=16)

        trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
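For readers unfamiliar with the optimizer configured in the trainer, the following is a minimal sketch of one common formulation of SGD with momentum and L2 regularization, applied to a single scalar weight. It only illustrates the roles of momentum, learning_rate and the regularization rate; PaddlePaddle's internal update may differ in detail, and `momentum_step` is a name made up for this example.

```python
def momentum_step(w, velocity, grad, learning_rate, momentum, l2_rate):
    """One update of SGD with momentum; L2 regularization adds l2_rate * w to the gradient."""
    total_grad = grad + l2_rate * w
    velocity = momentum * velocity - learning_rate * total_grad
    return w + velocity, velocity

# Minimize the toy loss f(w) = (w - 3)^2 using the hyperparameter values from the listing
w, v = 0.0, 0.0
for _ in range(5000):
    grad = 2.0 * (w - 3.0)
    w, v = momentum_step(w, v, grad, learning_rate=0.001, momentum=0.9,
                         l2_rate=0.0002 * 128)
print(round(w, 3))
```

The weight settles slightly below 3 because the L2 term keeps pulling it toward zero; with a rate of 0.0002 * 128 = 0.0256 that pull is small but visible.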

 


Results

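  After the training in the previous step finishes, the trained model is saved. We then write a test.py to run prediction. Note that the network built for prediction must have the same structure as the one used for training.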
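Since the event handler saves one parameter file per pass and records the validation result for each, one simple way to decide which file test.py should load is to pick the pass with the lowest validation error. The sketch below assumes the (pass_id, cost, error) tuples collected during training are available; the numbers are made-up placeholders, not results from this experiment.

```python
# Each entry mirrors what the event handler appends: (pass_id, cost, classification_error)
result_lists = [
    (0, 3.91, 0.62),
    (1, 2.75, 0.41),
    (2, 2.10, 0.33),
    (3, 2.14, 0.35),
]

# Pick the pass with the lowest validation error and build its parameter file name
best_pass, best_cost, best_error = min(result_lists, key=lambda r: r[2])
best_params_file = 'params_pass_%d.tar' % best_pass
print(best_params_file)
```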

 

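# Predict a batch of test images and report the accuracy
python test.py /Users/shelter/test

## Sample output
output:
Predicted plate number: 津 K 4 2 R M Y
Number of input images: 100
Row accuracy on input images: 0.72
Column accuracy on input images: 0.86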

 

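  When predicting a single image, the terminal shows the original image alongside the predicted value. When predicting a batch, it prints the overall accuracy, including both the row accuracy and the column accuracy.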
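The post does not spell out how the two accuracies are computed. One plausible reading, assumed here, is that row accuracy counts a plate as correct only when all of its characters match, while column accuracy is the fraction of individual characters that match. A sketch under that assumption, with made-up plates:

```python
def row_and_column_accuracy(predicted, actual):
    """predicted/actual: lists of plates, each plate a list of characters of equal length."""
    rows_correct = 0
    chars_correct = 0
    chars_total = 0
    for pred, true in zip(predicted, actual):
        matches = [p == t for p, t in zip(pred, true)]
        rows_correct += all(matches)   # whole plate correct
        chars_correct += sum(matches)  # number of matching characters
        chars_total += len(true)
    return rows_correct / float(len(actual)), chars_correct / float(chars_total)

# Two made-up plates; the second prediction gets the last character wrong
actual = [list("AK42RMY"), list("BD01234")]
predicted = [list("AK42RMY"), list("BD01239")]
row_acc, col_acc = row_and_column_accuracy(predicted, actual)
print(row_acc, col_acc)
```

Under this reading the column accuracy is never lower than the row accuracy, which is consistent with the 0.86 versus 0.72 in the sample output.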

 


 

Summary

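   There are many approaches to license plate recognition, and commercial deployments are already mature. Traditional approaches need a good deal of preprocessing, such as converting the image to grayscale and segmenting the characters. An end-to-end approach can take the raw image directly as training input and output the predicted plate characters. The method in this post builds a two-layer convolution-pooling network and then trains 7 fully connected layers in parallel to recognize the plate characters, which achieves end-to-end recognition.

   There are still some problems in actual training, however. For example, the fully connected layers for the first few characters reach higher accuracy than those for the last one or two. You can print the training accuracy of each fully connected layer and compare them. This may be because training has not yet converged, or there may be other causes. If you run into problems along the way, or have a better approach, feel free to leave a comment.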
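To check the observation that later character positions are recognized less accurately, the accuracy can be computed separately for each position. A minimal sketch with made-up predictions; in practice the two lists would come from running the model on the validation set, and `per_position_accuracy` is a helper name invented for this example.

```python
def per_position_accuracy(predicted, actual, num_positions=7):
    """Fraction of plates whose character at each position is predicted correctly."""
    correct = [0] * num_positions
    for pred, true in zip(predicted, actual):
        for pos in range(num_positions):
            if pred[pos] == true[pos]:
                correct[pos] += 1
    return [c / float(len(actual)) for c in correct]

# Made-up plates: errors are placed only in the last two positions
actual = ["AK42RMY", "BD01234", "CE56789", "DF0A1B2"]
predicted = ["AK42RMX", "BD01294", "CE56789", "DF0A1B3"]
position_acc = per_position_accuracy(predicted, actual)
for pos, acc in enumerate(position_acc):
    print("position %d: %.2f" % (pos + 1, acc))
```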

 

 

References:

1. My GitHub: https://github.com/huxiaoman7/mxnet-cnn-plate-recognition

 

posted @ 2018-03-25 21:56  Charlotte77