[Mobilar] Labelme For Real-time Segmentation

Edge AI: Semantic Segmentation on Nvidia Jetson

1. Choosing Label Colors

Sample code for generating a colormap (cmap):

 import numpy as np


 def uint82bin(n, count=8):
     """Return the binary representation of integer n as a string of `count` bits."""
     return ''.join(str((n >> y) & 1) for y in range(count - 1, -1, -1))


 def labelcolormap(N):
     """Build an (N, 3) uint8 RGB colormap from the bits of each label index."""
     cmap = np.zeros((N, 3), dtype=np.uint8)
     for i in range(N):
         r, g, b = 0, 0, 0
         label = i  # renamed from `id` to avoid shadowing the built-in
         for j in range(7):
             bits = uint82bin(label)
             # The three lowest bits of the label set bit (7 - j) of R, G and B.
             r ^= int(bits[-1]) << (7 - j)
             g ^= int(bits[-2]) << (7 - j)
             b ^= int(bits[-3]) << (7 - j)
             label >>= 3
         cmap[i] = (r, g, b)
     return cmap


 cmap = labelcolormap(2)
 print("R G B:\n{}".format(cmap))

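That said, it is safer to use a standard colormap, for example the one provided by the labelme API.

Installation reference: 语义分割-在ubuntu16.04下安装labelme与使用 (Semantic segmentation: installing and using labelme on Ubuntu 16.04)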

Indexing color image.
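
A label image is often stored as an indexed-color image: each pixel holds a class index, and a palette maps that index to an RGB color. The sketch below is a minimal NumPy illustration of the mapping in both directions. The three-entry `palette` and the helper names `index_to_color` and `color_to_index` are assumptions made for this example, not part of the labelme API.

```python
import numpy as np

# Hypothetical palette for illustration: background plus two classes.
palette = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)


def index_to_color(index_map, palette):
    """Turn an (H, W) map of class indices into an (H, W, 3) RGB image."""
    return palette[index_map]


def color_to_index(color_img, palette, unknown=255):
    """Turn an (H, W, 3) RGB label image back into an (H, W) index map."""
    img = color_img.astype(np.int32)
    # Pack each RGB triple into one integer so colors can be compared at once.
    packed = (img[..., 0] << 16) | (img[..., 1] << 8) | img[..., 2]
    index_map = np.full(packed.shape, unknown, dtype=np.uint8)
    for idx, (r, g, b) in enumerate(palette):
        key = (int(r) << 16) | (int(g) << 8) | int(b)
        index_map[packed == key] = idx
    return index_map


index_map = np.array([[0, 1], [2, 1]], dtype=np.uint8)
color_img = index_to_color(index_map, palette)
restored = color_to_index(color_img, palette)
```

Pixels whose color is not in the palette keep the `unknown` value, which makes labeling mistakes easy to spot.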

2. Use an Existing Config File for Your Model

Ref: https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API

You can use one of the configs that has been tested in OpenCV. This choice depends on your model and TensorFlow version:

Model	Version	Weights	Config
MobileNet-SSD v1	2017_11_17	weights	config
MobileNet-SSD v1 PPN	2018_07_03	weights	config
MobileNet-SSD v2	2018_03_29	weights	config
Inception-SSD v2	2017_11_17	weights	config
MobileNet-SSD v3 (see #16760)	2020_01_14	weights	config
Faster-RCNN Inception v2	2018_01_28	weights	config
Faster-RCNN ResNet-50	2018_01_28	weights	config
Mask-RCNN Inception v2	2018_01_28	weights	config
EfficientDet-D0 (see #17384)		weights	config
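
Once a weights/config pair is chosen, the model can be loaded with `cv2.dnn.readNetFromTensorflow`. The sketch below assumes placeholder file names (`frozen_inference_graph.pb`, `graph.pbtxt`), a 300x300 input as commonly used for the SSD models, and a `(1, 1, N, 7)` output whose rows are `[batch_id, class_id, score, left, top, right, bottom]` with normalized coordinates. `parse_detections` and `detect` are hypothetical helper names.

```python
import numpy as np


def parse_detections(out, width, height, conf_threshold=0.3):
    """Convert a (1, 1, N, 7) detection array into pixel-space boxes."""
    results = []
    for det in out[0, 0]:
        score = float(det[2])
        if score < conf_threshold:
            continue  # skip low-confidence detections
        class_id = int(det[1])
        left, top = int(det[3] * width), int(det[4] * height)
        right, bottom = int(det[5] * width), int(det[6] * height)
        results.append((class_id, score, left, top, right, bottom))
    return results


def detect(image_path, weights="frozen_inference_graph.pb", config="graph.pbtxt"):
    """Run one image through the network (file names are placeholders)."""
    import cv2  # imported here so parse_detections works without OpenCV

    net = cv2.dnn.readNetFromTensorflow(weights, config)
    img = cv2.imread(image_path)
    blob = cv2.dnn.blobFromImage(img, size=(300, 300), swapRB=True, crop=False)
    net.setInput(blob)
    out = net.forward()
    return parse_detections(out, img.shape[1], img.shape[0])
```

Keeping the parsing step separate from the OpenCV calls makes it possible to check the box arithmetic on a synthetic array before any model is downloaded.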

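Ref: 目标检测(YOLO,SSD,Efficientdet,RCNN系列) (Object detection: YOLO, SSD, EfficientDet, and the R-CNN family)

Ref: 深度学习目标检测网络汇总对比 (A summary and comparison of deep learning object detection networks)

Most accurate models

- The most accurate single model uses Faster R-CNN with Inception ResNet and 300 proposals. Detecting one image takes about 1 second.
- The most accurate model overall is an ensemble that uses multi-crop inference. It uses the vector of average precisions to select the 5 most dissimilar models.

Fastest models

- SSD with MobileNet offers the best trade-off between speed and accuracy among the fastest models.
- SSD performs very well overall, but is weaker on small objects.
- For large objects, SSD can match the accuracy of Faster R-CNN and R-FCN while using a smaller, lighter feature extractor.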

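3. Semantic Segmentation Models

ENet inference time on CPU

Material from "一流科技"

Ref: https://www.jianshu.com/p/00215e4ceef7

A single forward pass took about 0.5 s on my (low-end) laptop CPU (i5-6200); it would be faster on a GPU. Paszke et al. trained their model on the Cityscapes Dataset, but you can train on whichever dataset suits your needs. The Cityscapes dataset also comes with example images for urban scene understanding.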
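
A figure such as 0.5 s per forward pass is easiest to compare across machines when it is measured the same way each time. The following is a minimal timing sketch; `measure_latency` and `latency_to_fps` are hypothetical helper names, and `forward` stands for any zero-argument callable that runs one inference.

```python
import time


def measure_latency(forward, n_warmup=1, n_runs=5):
    """Return the mean wall-clock time in seconds of one call to forward()."""
    for _ in range(n_warmup):
        forward()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(n_runs):
        forward()
    return (time.perf_counter() - start) / n_runs


def latency_to_fps(latency_s):
    """Convert seconds per frame into frames per second."""
    return 1.0 / latency_s
```

At 0.5 s per frame, `latency_to_fps(0.5)` gives 2 FPS, which is well short of real time.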

For short videos, consider splitting the video into separate frames and processing them independently with Lambda.
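
Because each frame is processed without any state from its neighbors, the work parallelizes easily. The sketch below shows the idea locally, using a thread pool as a stand-in for per-frame Lambda invocations. The thread pool, `process_frame`, and its threshold placeholder are assumptions for illustration; a real deployment would call the segmentation model instead.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def process_frame(frame):
    """Placeholder for per-frame segmentation: here it only thresholds."""
    return (frame > 127).astype(np.uint8)


def process_frames(frames, max_workers=4):
    """Process frames independently; the results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_frame, frames))


frames = [np.full((2, 2), v, dtype=np.uint8) for v in (0, 200, 50)]
masks = process_frames(frames)
```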

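Reference 1: Keras-Sematic-Segmentation

A near-real-time implementation

Total execution time also drops substantially, mainly because some unnecessary loops for parsing the output data were removed. 10+ FPS on CPU should be no problem, so real time is within reach!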
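
One common way to remove per-pixel loops when parsing segmentation output is to let NumPy do the work. The sketch below assumes the network output has shape `(1, num_classes, H, W)` and that `palette` holds one RGB color per class; `colorize` is a hypothetical helper name, and this is not necessarily the exact change made in the referenced article.

```python
import numpy as np


def colorize(scores, palette):
    """Map (1, num_classes, H, W) scores to an (H, W, 3) color mask."""
    class_map = np.argmax(scores[0], axis=0)  # (H, W) best class per pixel
    return palette[class_map]                 # palette lookup, no Python loop


# Tiny synthetic example: 3 classes on a 2x2 image.
palette = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)
scores = np.zeros((1, 3, 2, 2), dtype=np.float32)
scores[0, 1, 0, 0] = 1.0  # pixel (0, 0) belongs to class 1
scores[0, 2, 1, 1] = 1.0  # pixel (1, 1) belongs to class 2
mask = colorize(scores, palette)
```

Both the `argmax` and the palette lookup run over the whole image at once, so the cost no longer grows with a Python-level loop over pixels.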

Ref: 详解ENet | CPU可以实时的道路分割网络 (ENet explained: a road segmentation network that runs in real time on CPU)

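4. OpenCV DNN

Ref: OpenCV134 OpenCV DNN ENet实现图像分割 (Image segmentation with ENet in OpenCV DNN)

Hands-on guides:

Ref: Semantic segmentation with OpenCV and deep learning [Python]

Ref: 基于OpenCV的dnn模块使用ENet进行语义分割 (Semantic segmentation with ENet using OpenCV's dnn module) [C++]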
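
The overall flow in these guides can be sketched roughly as follows. The model path, the 1024x512 input size, the 1/255 scaling, and the `(1, num_classes, H, W)` output shape are assumptions that depend on the specific ENet model; `segment_image` and `blend_overlay` are hypothetical helper names.

```python
import numpy as np


def blend_overlay(image, mask, alpha=0.4):
    """Blend an image with a color mask of the same shape (HxWx3 uint8)."""
    return (alpha * image + (1.0 - alpha) * mask).astype(np.uint8)


def segment_image(image_path, model_path, input_size=(1024, 512)):
    """Return the image and an (H, W) class map at the image's resolution."""
    import cv2  # imported here so blend_overlay works without OpenCV

    net = cv2.dnn.readNet(model_path)
    image = cv2.imread(image_path)
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, input_size, 0,
                                 swapRB=True, crop=False)
    net.setInput(blob)
    scores = net.forward()  # assumed shape: (1, num_classes, H, W)
    class_map = np.argmax(scores[0], axis=0).astype(np.uint8)
    # Nearest-neighbor resize keeps class indices intact (no interpolation).
    class_map = cv2.resize(class_map, (image.shape[1], image.shape[0]),
                           interpolation=cv2.INTER_NEAREST)
    return image, class_map
```

A color mask built from `class_map` and a palette can then be passed to `blend_overlay` together with the original image to visualize the result.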

The training process seems somewhat unusual:

TensorFlow-ENet, https://github.com/kwotsin/TensorFlow-ENet/blob/master/train.sh [TF]

Tutorial on how to train and test ENet on Cityscapes dataset [PyTorch]

To be continued ...

posted @ 2021-05-26 16:02 郝壹贰叁
