Deep Learning - YOLOv2 - Object Detection
The code builds on the assignments from Andrew Ng's deep learning course.
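Problem 1: the downloaded yolo.h5 does not work
My first thought was that if the copy I had downloaded was broken, I would just download it from somewhere else. That led me off into git lfs, and I found that this route did not really work either...
In the end I solved it by following the post "yolo.h5文件问题的解决 - 吴恩达深度学习:目标检测之YOLO算法" (Fixing the yolo.h5 file problem - Andrew Ng's Deep Learning: Object Detection with the YOLO Algorithm).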
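Problem 2: how to do single-class detection
yolo_model.summary() shows that the final output has shape (None, 19, 19, 425). The 425 is the sum of pc, x, y, w, h and the number of classes, multiplied by the 5 anchor boxes: (5 + 80) * 5 = 425.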
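To make that arithmetic concrete, here is a minimal sketch of how the 425 channels at one grid cell can be read as 5 anchor boxes of 85 values each. It assumes the per-anchor ordering used by YAD2K's yolo_head (x, y, w, h, confidence, then the class scores), which may differ in other implementations. decode_channel is a hypothetical helper written for illustration, not part of the assignment code.

```python
NUM_ANCHORS = 5
NUM_CLASSES = 80
VALUES_PER_ANCHOR = 5 + NUM_CLASSES  # x, y, w, h, confidence + class scores

# Assumed ordering of the first five values of each anchor box.
FIELDS = ["x", "y", "w", "h", "confidence"]


def decode_channel(channel):
    """Map a channel index in [0, 425) to (anchor index, field name)."""
    if not 0 <= channel < NUM_ANCHORS * VALUES_PER_ANCHOR:
        raise ValueError("channel out of range")
    anchor, offset = divmod(channel, VALUES_PER_ANCHOR)
    if offset < len(FIELDS):
        return anchor, FIELDS[offset]
    return anchor, "class_{}".format(offset - len(FIELDS))


print(NUM_ANCHORS * VALUES_PER_ANCHOR)  # 425
print(decode_channel(4))    # (0, 'confidence')
print(decode_channel(85))   # (1, 'x')
print(decode_channel(424))  # (4, 'class_79')
```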
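So I wondered whether I could turn the 425 into 30, that is (5 + 1) * 5, which should be enough for single-class detection. The next step was to search the code for anything that looked like the place to make this change. In keras_yolo.py I found: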
voc_classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse","motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
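I guessed that this was the list of recognizable classes. I grabbed a few pictures of pigs (🐖) and, sure enough, nothing was detected, so I cut the list down to a single element.
Then I changed filters=425 to filters=30 in yolo.cfg, and finally regenerated the h5 file from the new data.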
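The filters value follows directly from the class count, so it can be computed rather than guessed. The sketch below is only a restatement of the formula above; yolo_v2_filters is a hypothetical helper name, and the default of 5 anchor boxes is taken from the model described in this post.

```python
def yolo_v2_filters(num_classes, num_anchors=5):
    """Channels of the last conv layer: anchors * (pc, x, y, w, h + classes)."""
    if num_classes < 1 or num_anchors < 1:
        raise ValueError("num_classes and num_anchors must be positive")
    return num_anchors * (5 + num_classes)


print(yolo_v2_filters(80))  # 425, the original model
print(yolo_v2_filters(1))   # 30, a single-class model
```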
One more thing: for single-class detection, remember to set threshold higher.
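One possible explanation for needing a higher threshold, assuming the class probabilities come from a softmax over the class scores (as in YAD2K's yolo_head): with a single class the softmax always returns 1.0, so the box score is just the objectness confidence and the class term no longer lowers it. The sketch below illustrates this with made-up numbers; softmax and box_score are illustrative helpers, not functions from the assignment.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def box_score(confidence, class_logits):
    """Score of the best class: confidence * highest class probability."""
    return confidence * max(softmax(class_logits))


# Several classes: the best class probability is below 1, so the score drops.
print(box_score(0.7, [2.0, 1.0, 0.5]))
# One class: the softmax output is exactly 1.0, so the score equals the confidence.
print(box_score(0.7, [2.0]))  # 0.7
```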
Finally, here is the code in brief:
1. Pick the most likely object for each anchor box
import tensorflow as tf


def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold):
    """Filters YOLO boxes by thresholding on object and class confidence.

    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1)
    boxes -- tensor of shape (19, 19, 5, 4)
    box_class_probs -- tensor of shape (19, 19, 5, 80)
    threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box

    Returns:
    scores -- tensor of shape (None,), containing the class probability score for selected boxes
    boxes -- tensor of shape (None, 4), containing the coordinates of selected boxes.
        The assignment's docstring says (b_x, b_y, b_h, b_w); that seems to be wrong,
        it should be (y1, x1, y2, x2).
    classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes
    """
    # Step 1: Compute box scores (broadcasts the confidence over the class axis)
    box_scores = box_confidence * box_class_probs

    # Step 2: Find the best class of each box and keep track of the corresponding score
    max_box_scores = tf.reduce_max(box_scores, axis=-1)
    max_box_indices = tf.argmax(box_scores, axis=-1)

    # Step 3: Create a filtering mask with the same shape as max_box_scores,
    # True for the boxes to keep (score >= threshold)
    mask = max_box_scores >= threshold

    # Step 4: Apply the mask to scores, boxes and classes
    scores = tf.boolean_mask(max_box_scores, mask)
    boxes = tf.boolean_mask(boxes, mask)
    classes = tf.boolean_mask(max_box_indices, mask)

    return scores, boxes, classes
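The same four steps can be traced by hand on a few boxes. The sketch below is a plain-Python counterpart that works on a flat list of boxes instead of a (19, 19, 5, ...) tensor; filter_boxes and the sample numbers are made up for illustration and are not part of the assignment.

```python
def filter_boxes(box_confidence, boxes, box_class_probs, threshold):
    """Keep boxes whose best class score (confidence * probability) >= threshold."""
    scores, kept_boxes, classes = [], [], []
    for conf, box, probs in zip(box_confidence, boxes, box_class_probs):
        class_scores = [conf * p for p in probs]           # Step 1
        best_class = max(range(len(class_scores)),
                         key=class_scores.__getitem__)     # Step 2
        best_score = class_scores[best_class]
        if best_score >= threshold:                        # Steps 3 and 4
            scores.append(best_score)
            kept_boxes.append(box)
            classes.append(best_class)
    return scores, kept_boxes, classes


confidences = [0.9, 0.5, 0.8]
boxes = [(0.1, 0.1, 0.4, 0.4), (0.2, 0.2, 0.6, 0.6), (0.5, 0.5, 0.9, 0.9)]
class_probs = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]

# Best scores are roughly 0.72, 0.30 and 0.48, so only the first box passes 0.6.
scores, kept, classes = filter_boxes(confidences, boxes, class_probs, threshold=0.6)
print(kept)     # [(0.1, 0.1, 0.4, 0.4)]
print(classes)  # [1]
```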
2. Non-max suppression
def yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold):
    """
    Applies Non-max suppression (NMS) to set of boxes

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    selected_indices = tf.image.non_max_suppression(boxes, scores, max_boxes, iou_threshold)

    # Use tf.gather() to select only the kept indices from scores, boxes and classes
    scores = tf.gather(scores, selected_indices)
    boxes = tf.gather(boxes, selected_indices)
    classes = tf.gather(classes, selected_indices)

    # The pasted snippet stopped before this line; yolo_eval unpacks three return values
    return scores, boxes, classes
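For readers who want to see what the suppression step does, here is a minimal pure-Python sketch of greedy NMS. It is an illustration of the general technique, not TensorFlow's implementation; iou and greedy_nms are hypothetical helpers, and boxes are assumed to be in (y1, x1, y2, x2) order.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_boxes, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_boxes:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

# Boxes 0 and 1 overlap with IoU = 81 / 119, about 0.68, so box 1 is suppressed.
print(greedy_nms(boxes, scores, max_boxes=10, iou_threshold=0.5))  # [0, 2]
```

Lowering iou_threshold makes the suppression more aggressive, since smaller overlaps already count as duplicates.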
3. Processing the YOLO output
def yolo_eval(yolo_outputs, image_shape, max_boxes, score_threshold, iou_threshold):
    """
    Converts the output of YOLO encoding (a lot of boxes) to your predicted boxes along with their scores, box coordinates and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors,
                    listed here in the order they are unpacked below:
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the input shape, in this notebook we use (608., 608.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    # Retrieve outputs of the YOLO model
    box_xy, box_wh, box_confidence, box_class_probs = yolo_outputs

    # Convert boxes from center/size form to corners, ready for the filtering functions
    boxes = yolo_boxes_to_corners(box_xy, box_wh)

    # Score-filtering with a threshold of score_threshold
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, score_threshold)

    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)

    # Non-max suppression with a threshold of iou_threshold
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)

    return scores, boxes, classes
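yolo_boxes_to_corners and scale_boxes come from the assignment's helper code and are not shown in this post. As a rough sketch of what those two steps do for a single box, assuming boxes are expressed relative to the image (values between 0 and 1) and corners are ordered (y1, x1, y2, x2): box_to_corners and scale_box below are hypothetical stand-ins, not the helpers' actual code.

```python
def box_to_corners(x, y, w, h):
    """Center/size box -> (y1, x1, y2, x2), all relative to the image."""
    return (y - h / 2, x - w / 2, y + h / 2, x + w / 2)


def scale_box(corners, image_height, image_width):
    """Relative corners -> pixel coordinates of the original image."""
    y1, x1, y2, x2 = corners
    return (y1 * image_height, x1 * image_width,
            y2 * image_height, x2 * image_width)


# A box centered in the image, half as wide and a quarter as tall as the image.
corners = box_to_corners(x=0.5, y=0.5, w=0.5, h=0.25)
print(corners)                        # (0.375, 0.25, 0.625, 0.75)
print(scale_box(corners, 720, 1280))  # (270.0, 320.0, 450.0, 960.0)
```

Note that the y coordinates are scaled by the height and the x coordinates by the width, so the order in which the image shape is passed matters for non-square images.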
4. Load the model
from tensorflow.keras.models import load_model

# read_classes and read_anchors are helper functions from the assignment's utility code.

# Multi-class detection
yolo_model = load_model("model_data/yolo.h5", compile=False)
class_names = read_classes("model_data/classes.txt")
anchors = read_anchors("model_data/anchors.txt")
# Single-class detection
# yolo_model = load_model("model_data/yolo_horse.h5", compile=False)
# class_names = ['horse']
# anchors = read_anchors("model_data/anchors.txt")
5. Predict and display the results
import os

import imageio
from matplotlib.pyplot import imshow


def predict(image_file, max_boxes=10, score_threshold=0.6, iou_threshold=0.5):
    """
    Runs yolo_model to predict boxes for "image_file". Prints and plots the predictions.

    Arguments:
    image_file -- name of an image stored in the "images" folder.
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, boxes whose best score is below it are discarded
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    out_scores -- tensor of shape (None,), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None,), class index of the predicted boxes
    """

    # Preprocess your image
    image, image_data = preprocess_image("images/" + image_file, model_image_size=(608, 608))

    # Run the model directly (no session or feed_dict needed) and decode its raw output
    model_output = tf.convert_to_tensor(yolo_model.predict(image_data))
    yolo_outputs = yolo_head(model_output, anchors, len(class_names))

    # PIL's image.size is (width, height). The boxes are (y1, x1, y2, x2), so this passes
    # (height, width), assuming scale_boxes expects that order as in the course utilities.
    image_shape = (float(image.size[1]), float(image.size[0]))
    out_scores, out_boxes, out_classes = yolo_eval(yolo_outputs, image_shape, max_boxes,
                                                   score_threshold, iou_threshold)

    # Print predictions info
    print('Found {} boxes for {}'.format(out_boxes.shape[0], image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    output_image = imageio.v2.imread(os.path.join("out", image_file))
    imshow(output_image)

    return out_scores, out_boxes, out_classes


out_scores, out_boxes, out_classes = predict("horse.jpg", max_boxes=15, score_threshold=0.85, iou_threshold=0.1)
The complete files are available for anyone who needs them; click here.