Deep Learning: YOLOv2 Object Detection

The code builds on the programming assignment from Andrew Ng's deep learning course.

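Problem 1: the downloaded yolo.h5 does not work

My first thought was that if the file I had downloaded was no good, I would simply download it from somewhere else. That sent me down the git lfs route, and it turned out that this path did not really work either...

In the end I solved the problem by following the post "yolo.h5文件问题的解决 - 吴恩达深度学习:目标检测之YOLO算法" (roughly: "Fixing the yolo.h5 file problem - Andrew Ng Deep Learning: Object Detection with the YOLO Algorithm").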

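Problem 2: how to do single-class detection

yolo_model.summary() shows that the final output has shape (None, 19, 19, 425). The 425 is the sum of pc, x, y, w, h and the number of classes, multiplied by the 5 anchor boxes: (5 + 80) × 5 = 425.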
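The arithmetic above can be checked with a minimal sketch. The function name `yolo_output_channels` is made up for illustration and is not part of the assignment code.

```python
def yolo_output_channels(num_classes, num_anchors=5):
    """Channels in the last layer: every anchor predicts pc, x, y, w, h plus one score per class."""
    return num_anchors * (5 + num_classes)


print(yolo_output_channels(80))  # 425, the 80-class model
print(yolo_output_channels(1))   # 30, a single-class model
```

The same formula is what links the class list to the size of the last convolutional layer, so the two have to be changed together.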

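So I wondered whether I could turn the 425 into 30, that is (5 + 1) × 5 for one class, which should be enough for single-class detection. The next step was to search the code for anything related to this that I could modify. In keras_yolo.py I found: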

voc_classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse","motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

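I guessed that this was the list of detectable classes. I grabbed a few pig (🐖) pictures and, sure enough, nothing was detected, so I cut the list down to a single element.

Then, in yolo.cfg, change filters=425 to filters=30, and finally regenerate the h5 file with the new data.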
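The cfg edit can also be scripted. The sketch below is only an illustration under assumptions: it assumes the file follows the usual Darknet layout, where the last `filters=` line sits right before a `[region]` section that has its own `classes=` entry. The post itself only mentions changing `filters`; updating `classes=` as well is my assumption, and `patch_yolo_cfg` is a hypothetical helper name.

```python
import re


def patch_yolo_cfg(cfg_text, num_classes, num_anchors=5):
    """Return cfg_text with the final conv filters and the region classes updated."""
    new_filters = num_anchors * (5 + num_classes)

    # Split at the last [region] header; the conv layer feeding it comes just before.
    head, sep, tail = cfg_text.rpartition("[region]")
    if not sep:
        raise ValueError("no [region] section found")

    # Rewrite only the last filters= line that appears before [region].
    matches = list(re.finditer(r"^filters\s*=\s*\d+", head, flags=re.MULTILINE))
    if not matches:
        raise ValueError("no filters= line found before [region]")
    last = matches[-1]
    head = head[:last.start()] + f"filters={new_filters}" + head[last.end():]

    # Rewrite the first classes= line inside [region] (assumed to exist).
    tail = re.sub(r"^classes\s*=\s*\d+", f"classes={num_classes}", tail,
                  count=1, flags=re.MULTILINE)
    return head + sep + tail


# A made-up, heavily shortened cfg used only to exercise the function.
sample = """[convolutional]
filters=1024
size=3

[convolutional]
filters=425
size=1

[region]
classes=80
num=5
"""
patched = patch_yolo_cfg(sample, num_classes=1)
print(patched)
```

Earlier `filters=` lines are left alone on purpose, since only the layer that feeds the detection head depends on the number of classes.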

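One more thing: when doing single-class detection, remember to set threshold higher (the final call in this post uses score_threshold=0.85).

Finally, here is the code in brief: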

1. Pick the most likely class for each anchor box

import tensorflow as tf


def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold):
    """Filters YOLO boxes by thresholding on object and class confidence.

    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1)
    boxes -- tensor of shape (19, 19, 5, 4)
    box_class_probs -- tensor of shape (19, 19, 5, 80)
    threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box

    Returns:
    scores -- tensor of shape (None,), containing the class probability score for selected boxes
    boxes -- tensor of shape (None, 4), containing the coordinates of selected boxes
             (the assignment docstring says (b_x, b_y, b_h, b_w); that looks wrong, it should be (y1, x1, y2, x2))
    classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes
    """
    # Step 1: Compute box scores (object confidence times class probability)
    box_scores = box_confidence * box_class_probs

    # Step 2: For each box, find the best class and keep track of its score
    max_box_scores = tf.reduce_max(box_scores, axis=-1)
    max_box_indices = tf.argmax(box_scores, axis=-1)

    # Step 3: Create a filtering mask based on "max_box_scores" by using "threshold". The mask has the
    # same shape as max_box_scores and is True for the boxes to keep (score >= threshold)
    mask = max_box_scores >= threshold

    # Step 4: Apply the mask to scores, boxes and classes
    scores = tf.boolean_mask(max_box_scores, mask)
    boxes = tf.boolean_mask(boxes, mask)
    classes = tf.boolean_mask(max_box_indices, mask)

    return scores, boxes, classes
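To make the four steps easier to follow, here is a minimal pure-Python sketch of the same filtering on two hand-written boxes. It is not the TensorFlow code above; `filter_boxes_plain` and the toy numbers are made up for illustration.

```python
def filter_boxes_plain(box_confidence, boxes, box_class_probs, threshold):
    """Keep (score, box, class_index) for boxes whose best class score reaches threshold."""
    kept = []
    for conf, box, probs in zip(box_confidence, boxes, box_class_probs):
        # Step 1: score of every class = object confidence * class probability
        class_scores = [conf * p for p in probs]
        # Step 2: best class and its score
        best_class = max(range(len(class_scores)), key=class_scores.__getitem__)
        best_score = class_scores[best_class]
        # Steps 3 and 4: keep the box only if its best score passes the threshold
        if best_score >= threshold:
            kept.append((best_score, box, best_class))
    return kept


confidences = [0.9, 0.3]
corner_boxes = [(0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.6, 0.6)]
class_probs = [[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]

kept = filter_boxes_plain(confidences, corner_boxes, class_probs, threshold=0.6)
for score, box, cls in kept:
    print(cls, round(score, 2))  # 1 0.72
```

The second box is dropped because its best score is only 0.3 × 0.5 = 0.15, which shows why a confident class probability alone is not enough when the object confidence is low.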

2. Non-max suppression

def yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold):
    """
    Applies Non-max suppression (NMS) to set of boxes

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    selected_indices = tf.image.non_max_suppression(boxes, scores, max_boxes, iou_threshold)

    # Use tf.gather() to select only the kept indices from scores, boxes and classes
    scores = tf.gather(scores, selected_indices)
    boxes = tf.gather(boxes, selected_indices)
    classes = tf.gather(classes, selected_indices)

    return scores, boxes, classes
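For readers who want to see what non-max suppression does conceptually, below is a small greedy sketch in plain Python. It is a reference illustration, not the implementation inside tf.image.non_max_suppression; `iou` and `greedy_nms` are hypothetical names, and boxes are assumed to be (y1, x1, y2, x2) corners.

```python
def iou(a, b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_boxes, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_boxes:
            break
        # Keep a box only if it does not overlap too much with any box already kept
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


toy_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
toy_scores = [0.9, 0.8, 0.7]
print(greedy_nms(toy_boxes, toy_scores, max_boxes=10, iou_threshold=0.5))  # [0, 2]
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed; a lower iou_threshold, such as the 0.1 used in the final call of this post, suppresses even lightly overlapping boxes.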

3. Post-processing the model output

def yolo_eval(yolo_outputs, image_shape, max_boxes, score_threshold, iou_threshold):
    """
    Converts the output of YOLO encoding (a lot of boxes) to your predicted boxes along with their scores, box coordinates and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors,
                    in the order they are unpacked below:
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the input shape, in this notebook we use (608., 608.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None, ), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    # Retrieve outputs of the YOLO model
    box_xy, box_wh, box_confidence, box_class_probs = yolo_outputs

    # Convert boxes from (center, size) to corner coordinates, ready for the filtering functions
    boxes = yolo_boxes_to_corners(box_xy, box_wh)

    # Score-filtering with a threshold of score_threshold
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, score_threshold)

    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)

    # Non-max suppression with a threshold of iou_threshold
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)

    return scores, boxes, classes
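yolo_boxes_to_corners and scale_boxes are helpers that are not shown in this post. The sketch below illustrates what they are assumed to do, based on the assignment's versions: convert a (center, size) box to (y1, x1, y2, x2) corners, then multiply by an image shape given as (height, width). The names `box_to_corners` and `scale_box` and the numbers are made up for illustration.

```python
def box_to_corners(box_xy, box_wh):
    """Convert center (x, y) and size (w, h) to (y1, x1, y2, x2) corners."""
    x, y = box_xy
    w, h = box_wh
    return (y - h / 2, x - w / 2, y + h / 2, x + w / 2)


def scale_box(box, image_shape):
    """Scale a normalized (y1, x1, y2, x2) box to pixels; image_shape is (height, width)."""
    height, width = image_shape
    y1, x1, y2, x2 = box
    return (y1 * height, x1 * width, y2 * height, x2 * width)


corners = box_to_corners(box_xy=(0.5, 0.5), box_wh=(0.2, 0.4))
scaled = scale_box(corners, image_shape=(720.0, 1280.0))
print([round(v, 1) for v in scaled])  # [216.0, 512.0, 504.0, 768.0]
```

The y coordinates are scaled by the height and the x coordinates by the width, so the order of image_shape matters for any image that is not square.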

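4. Load the model

# Assumes the Keras bundled with TensorFlow; read_classes and read_anchors come from the assignment's helper modules
from tensorflow.keras.models import load_model

# Multi-class detection
yolo_model = load_model("model_data/yolo.h5", compile=False)
class_names = read_classes("model_data/classes.txt")
anchors = read_anchors("model_data/anchors.txt")
# Single-class detection
# yolo_model = load_model("model_data/yolo_horse.h5", compile=False)
# class_names = ['horse']
# anchors = read_anchors("model_data/anchors.txt")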

5. Predict and display the results

import os

import imageio
import tensorflow as tf
from matplotlib.pyplot import imshow

# preprocess_image, yolo_head, generate_colors and draw_boxes come from the assignment's helper modules


def predict(image_file, max_boxes=10, score_threshold=0.6, iou_threshold=0.5):
    """
    Runs yolo_model on "image_file" to predict boxes. Prints and plots the predictions.

    Arguments:
    image_file -- name of an image stored in the "images" folder.
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, boxes whose best class score is below it are dropped
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    out_scores -- tensor of shape (None, ), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None, ), class index of the predicted boxes

    """

    # Preprocess your image
    image, image_data = preprocess_image("images/" + image_file, model_image_size=(608, 608))

    # Run the model directly (no session or feed_dict needed) and decode its raw output
    re = yolo_model.predict(image_data)
    re = tf.convert_to_tensor(re)
    yolo_outputs = yolo_head(re, anchors, len(class_names))

    # PIL's image.size is (width, height); this assumes scale_boxes is the assignment's version,
    # which expects (height, width), so the two values are swapped here
    image_shape = (float(image.size[1]), float(image.size[0]))
    out_scores, out_boxes, out_classes = yolo_eval(yolo_outputs, image_shape, max_boxes,
                                                   score_threshold, iou_threshold)

    # Print predictions info
    print('Found {} boxes for {}'.format(out_boxes.shape[0], image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    output_image = imageio.v2.imread(os.path.join("out", image_file))
    imshow(output_image)

    return out_scores, out_boxes, out_classes


out_scores, out_boxes, out_classes = predict("horse.jpg", max_boxes=15, score_threshold=0.85, iou_threshold=0.1)

The complete files are available for anyone who needs them: click here

 

posted @ 2022-11-01 00:15  Sly_Yang