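YOLOv5 Training and Deployment in Practice (TorchScript & TensorRT)

1. Preface

YOLOv5 is a family of object detection architectures and models pretrained on the COCO dataset, and an extension of the YOLO series. Its network consists of four modules: input, backbone, neck, and head. YOLOv5 modifies each of these four parts relative to YOLOv4 and achieves a substantial improvement: the input stage uses Mosaic data augmentation, adaptive anchor box calculation, and adaptive image scaling; the backbone uses the Focus and CSP structures; the neck adds an FPN+PAN structure; and the head improves the training loss by using GIOU_Loss, with DIOU_nms for filtering predicted boxes. This article walks through some hands-on practice based on the official code.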

2. Environment

Ubuntu 20.04 + Python 3.7.11 + torch 1.10.0 + CUDA 11.3 + opencv-python 4.7

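3. Overview of YOLOv5

YOLOv5 is a single-stage object detection algorithm. It builds on YOLOv4 with several new improvements that raise both speed and accuracy. The main improvements are:

input: Mosaic data augmentation, adaptive anchor box calculation, adaptive image scaling;
backbone: Focus structure and CSP structure;
neck: an added FPN+PAN structure;
head: an improved training loss using GIOU_Loss, and DIOU_nms for filtering predicted boxes.

Network structure

The YOLOv5 network structure is shown in the figure above. YOLO-series algorithms can broadly be divided into four modules: input, backbone, neck, and head. Each of the four is examined in turn below.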

input

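The input stage mainly preprocesses the incoming image. The input image size of the network is

Mosaic data augmentation: Mosaic takes 4 images and stitches them into one through random scaling, random cropping, and random arrangement. Combining several images into one enriches the dataset and considerably speeds up training, and it also lowers the model's memory requirements.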
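
The placement logic behind a 4-image mosaic can be sketched in pure Python. This is an illustration only: a random center is chosen on a canvas twice the target size, and each image is pasted into one quadrant, cropped where it would overflow. The function name `mosaic_regions`, the center range, and the quadrant order are assumptions made for this sketch, not the official implementation, and the remapping of labels is left out.

```python
import random

def mosaic_regions(s, sizes, rng=random):
    """Return paste regions for a 4-image mosaic on a (2s x 2s) canvas.

    sizes: four (w, h) tuples, one per image.
    Each result is (canvas_box, image_box), both as (x1, y1, x2, y2).
    """
    # Random mosaic center somewhere in the middle of the canvas (assumed range)
    xc = int(rng.uniform(s * 0.5, s * 1.5))
    yc = int(rng.uniform(s * 0.5, s * 1.5))
    regions = []
    for i, (w, h) in enumerate(sizes):
        if i == 0:    # top-left quadrant: keep the image's bottom-right part
            ca = (max(xc - w, 0), max(yc - h, 0), xc, yc)
            ib = (w - (ca[2] - ca[0]), h - (ca[3] - ca[1]), w, h)
        elif i == 1:  # top-right quadrant: keep the image's bottom-left part
            ca = (xc, max(yc - h, 0), min(xc + w, 2 * s), yc)
            ib = (0, h - (ca[3] - ca[1]), ca[2] - ca[0], h)
        elif i == 2:  # bottom-left quadrant: keep the image's top-right part
            ca = (max(xc - w, 0), yc, xc, min(yc + h, 2 * s))
            ib = (w - (ca[2] - ca[0]), 0, w, ca[3] - ca[1])
        else:         # bottom-right quadrant: keep the image's top-left part
            ca = (xc, yc, min(xc + w, 2 * s), min(yc + h, 2 * s))
            ib = (0, 0, ca[2] - ca[0], ca[3] - ca[1])
        regions.append((ca, ib))
    return regions
```

Each canvas box has the same width and height as its image box, so the crop can be copied directly into the canvas.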

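Adaptive anchor box calculation: in the YOLO series, anchors with specific widths and heights have to be set for each dataset. During training, the model outputs predicted boxes relative to the initial anchor boxes, computes the difference from the ground-truth boxes, and backpropagates to update the parameters of the whole network, so choosing the initial anchor boxes is an important step. At the start of each training run, YOLOv5 adaptively computes the best anchor boxes from the labeled boxes in the dataset.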
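
One way to judge whether a set of anchors suits a dataset is to measure how many labeled boxes are "reachable" from at least one anchor. The sketch below is a simplified, pure-Python illustration of that idea: a box counts as covered when its width and height are both within a factor `thr` of some anchor. The function name `anchor_coverage` and the threshold value 4.0 are assumptions for this example. If coverage is low, new anchors would typically be re-estimated from the label sizes (for example with k-means), which is not shown here.

```python
def anchor_coverage(boxes_wh, anchors_wh, thr=4.0):
    """Fraction of labeled boxes whose (w, h) is within a factor `thr` of an anchor."""
    covered = 0
    for bw, bh in boxes_wh:
        best = 0.0
        for aw, ah in anchors_wh:
            rw, rh = bw / aw, bh / ah
            # Worst-case side ratio, folded so that 1.0 means a perfect match
            fit = min(rw, 1.0 / rw, rh, 1.0 / rh)
            best = max(best, fit)
        if best > 1.0 / thr:
            covered += 1
    return covered / len(boxes_wh)

# Hypothetical label sizes and anchors, in pixels
boxes = [(10, 13), (100, 100), (600, 20)]
anchors = [(10, 13), (116, 90)]
print(anchor_coverage(boxes, anchors))
```

In this example the very wide 600x20 box is not covered by either anchor, which is the kind of signal that would prompt recomputing the anchors.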

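Adaptive image scaling: conventional scaling resizes the image while preserving its aspect ratio and pads it with black to the target size. Because real images come in many different aspect ratios, the black borders differ in size after scaling and padding, and excessive padding introduces a lot of redundant information that slows down inference. To speed up inference further, YOLOv5 adds only the minimum amount of black border to the scaled image. The steps are:

Compute the scaling ratio from the original image size and the network input size;
Compute the scaled image size from the original image size and the scaling ratio;
Compute the black-border padding. The padding does not have to bring the image up to the full target size; it only has to make each side a multiple of the model's overall downsampling stride, which is determined by its convolution and pooling layers.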
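
The three steps above can be sketched as plain arithmetic. This is a minimal illustration, not the official code: the function name `letterbox_shape`, the target size of 640, and the stride of 32 are assumptions chosen for the example.

```python
def letterbox_shape(h, w, target=640, stride=32, minimal=True):
    """Return ((scaled_h, scaled_w), (padded_h, padded_w)) for an h x w image."""
    # Step 1: scale ratio that fits the image inside target x target
    r = min(target / h, target / w)
    # Step 2: scaled image size (aspect ratio preserved)
    new_h, new_w = round(h * r), round(w * r)
    # Step 3: padding needed to reach the full target size ...
    pad_h, pad_w = target - new_h, target - new_w
    if minimal:
        # ... reduced to the remainder, so each side is only padded
        # up to the next multiple of the stride
        pad_h, pad_w = pad_h % stride, pad_w % stride
    return (new_h, new_w), (new_h + pad_h, new_w + pad_w)

print(letterbox_shape(1080, 1920))                 # minimal padding
print(letterbox_shape(1080, 1920, minimal=False))  # pad to the full square
```

For a 1080x1920 frame, minimal padding gives a 384x640 input instead of 640x640, so the network processes noticeably fewer pixels.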

backbone

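The backbone mainly introduces the Focus structure and the CSP structure.

Focus structure

The key operation in Focus is slicing: as shown in the figure below, a 4x4x3 image becomes a 2x2x12 feature map after slicing.

In the YOLOv5 network, an original 608x608x3 image fed into the Focus structure is first sliced into a 304x304x12 feature map and then passed through a convolution with 32 kernels, yielding a 304x304x32 feature map.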
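
The slicing step can be illustrated without any deep learning framework. The sketch below works on nested Python lists in channel-first layout `[C][H][W]`; the function name `focus_slice` and the order in which the four sub-images are stacked are assumptions for this example. A real model would do the same thing with strided tensor slicing followed by a channel-wise concatenation and a convolution.

```python
def focus_slice(img):
    """Space-to-depth slicing: [C][H][W] -> [4*C][H/2][W/2]."""
    out = []
    # Four pixel phases: (row offset, column offset)
    for dy, dx in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for channel in img:
            # Take every second row starting at dy, every second column starting at dx
            out.append([row[dx::2] for row in channel[dy::2]])
    return out

# A 3-channel 4x4 image whose values encode their own position
img = [[[c * 16 + y * 4 + x for x in range(4)] for y in range(4)] for c in range(3)]
sliced = focus_slice(img)
print(len(sliced), len(sliced[0]), len(sliced[0][0]))  # channels, height, width
```

No pixel is discarded: the spatial resolution is halved while the channel count is multiplied by four.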

CSP structure

YOLOv4 borrowed the design of CSPNet but used the CSP structure only in the backbone. YOLOv5 designs two CSP structures: CSP1_X is used in the backbone, and CSP2_X is used in the neck.

neck

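The YOLOv5 neck still uses the FPN+PAN structure, with some improvements. The YOLOv4 neck uses only ordinary convolutions, whereas the YOLOv5 neck adopts the CSP2 structure, modeled on CSPNet, to strengthen feature fusion.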

head

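In the head, YOLOv5 improves the loss function, using GIoU_Loss as the bounding box loss and adding DIOU_nms for filtering predicted boxes. Neither of these originated with YOLOv5; see the relevant papers for details, which are not repeated here.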
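
For readers who want a concrete picture of GIoU, here is a minimal pure-Python sketch for two axis-aligned boxes given as `(x1, y1, x2, y2)`. The function name `giou` is chosen for this example, and the boxes are assumed to have positive area. The corresponding loss is usually taken as `1 - giou`.

```python
def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2); the result lies in (-1, 1]."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero when the boxes do not overlap)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Smallest box enclosing both inputs
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    # Penalize the part of the enclosing box not covered by the union
    return iou - (c_area - union) / c_area

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))  # no overlap: IoU is 0, GIoU is negative
```

Unlike plain IoU, GIoU still changes with the distance between two boxes when they do not overlap, which is why it provides a useful training signal in that case.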

4. Deploying the YOLOv5 Code

4.1 Installation

Install requirements.txt in a Python>=3.7.0 environment with PyTorch>=1.7:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

4.2 Inference with detect.py

detect.py runs inference on a variety of sources, automatically downloading models from the latest YOLOv5 release and saving the results to runs/detect:

python detect.py --weights yolov5s.pt --source 0                               # webcam
                                               img.jpg                         # image
                                               vid.mp4                         # video
                                               screen                          # screenshot
                                               path/                           # directory
                                               list.txt                        # list of images
                                               list.streams                    # list of streams
                                               'path/*.jpg'                    # glob
                                               'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                                               'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream

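4.3 Training

4.3.1 Training on the COCO dataset

The commands below reproduce the YOLOv5 results on the COCO dataset. The latest models and datasets are downloaded automatically from the YOLOv5 release. Training YOLOv5n/s/m/l/x on a V100 GPU takes 1/2/4/6/8 days (multi-GPU training is faster). Use the largest --batch-size possible, or pass --batch-size -1 for YOLOv5 automatic batch sizing. The batch sizes shown below are for a V100-16GB.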

python train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov5n.yaml  --batch-size 128
                                                                 yolov5s                    64
                                                                 yolov5m                    40
                                                                 yolov5l                    24
                                                                 yolov5x                    16

4.3.2 Training on a custom dataset

Once the data has been annotated with labelme, the script below, convertLabelmeToYolov5.py, converts it to the format YOLOv5 expects:

import os
import numpy as np
import json
from glob import glob
import cv2
from sklearn.model_selection import train_test_split
from shutil import copyfile
import argparse

obj_classes = []

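# Convert Labelme corner coordinates to YOLOv5 normalized (x_center, y_center, w, h)
def convert(size, box):
    # size = (image_width, image_height); box = (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    # Labelme pixel coordinates are 0-based, so the center needs no "- 1" offset
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)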

# Convert one list of samples into a YOLOv5 split directory
def convertToYolo5(fileList, output_dir, labelme_path):
    # Create the parent directory for this split
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # Create the images and labels subdirectories for this split
    yolo_images_dir = '{}/images/'.format(output_dir)
    yolo_labels_dir = '{}/labels/'.format(output_dir)

    if not os.path.exists(yolo_images_dir):
        os.makedirs(yolo_images_dir)
    if not os.path.exists(yolo_labels_dir):
        os.makedirs(yolo_labels_dir)

    # Convert the samples one image at a time
    for json_file_ in fileList:
        # 1. Produce the YOLO sample image
        # Full path of the source image (assumed to be a .png next to the json file)
        imagePath = labelme_path +'/'+ json_file_ + ".png"
        # Full path of the YOLO image file
        yolo_image_file_path = yolo_images_dir + json_file_ + ".png"
        # Copy the sample image
        copyfile (imagePath, yolo_image_file_path)

        # 2. Produce the YOLO sample label
        # Full path of the json label file
        json_filename = labelme_path +'/'+ json_file_ + ".json"
        # Full path of the YOLO label file
        yolo_label_file_path = yolo_labels_dir + json_file_ + ".txt"
        # Create the new YOLO label file
        yolo_label_file = open(yolo_label_file_path, 'w')

        # Load the json label file of the current image
        with open(json_filename, "r", encoding="utf-8") as f:
            json_obj = json.load(f)

        # Height and width of the current image
        height = json_obj['imageHeight']
        width  = json_obj['imageWidth']

        # Read every shape (annotated object) in the json file
        for shape in json_obj["shapes"]:
            # Class label of the shape
            label = shape["label"]
            if (label not in obj_classes):
                obj_classes.append(label)

            # Reduce the shape (whatever its shape_type) to an axis-aligned box,
            # clamped to the image bounds
            points = np.array(shape["points"])
            xmin = float(np.clip(points[:, 0].min(), 0, width))
            xmax = float(np.clip(points[:, 0].max(), 0, width))
            ymin = float(np.clip(points[:, 1].min(), 0, height))
            ymax = float(np.clip(points[:, 1].max(), 0, height))

            # Skip degenerate boxes
            if xmax <= xmin or ymax <= ymin:
                continue

            # Convert Labelme coordinates to YOLOv5 coordinates
            bbox_labelme_float   = (xmin, xmax, ymin, ymax)
            bbox_yolo_normalized = convert((width, height), bbox_labelme_float)

            # Map the class label to its class id
            class_id = obj_classes.index(label)

            # Write one line per object: class_id x_center y_center w h
            yolo_label_file.write(str(class_id) + " " + " ".join([str(a) for a in bbox_yolo_normalized]) + '\n')
        yolo_label_file.close()
    
def check_output_directory(output = ""):
    # Create the output directory; refuse to reuse an existing one
    save_path = output + '/'
    is_exists = os.path.exists(save_path)
    
    if is_exists:
        print('Warning: path %s already exists, please remove it manually first' % save_path)
        #shutil.rmtree(save_path)  # left disabled to avoid deleting existing files by accident
        return ""
    
    #print('create output path %s' % save_path)
    os.makedirs(save_path)
    
    return save_path


def create_yolo_dataset_cfg(output_dir='', label_class=None):
    # Avoid a mutable default argument
    label_class = label_class or []

    # Create the dataset config file
    data_cfg_file = open(output_dir + '/data.yaml', 'w')

    # Write the file contents
    data_cfg_file.write('train:  ../train/images\n')
    data_cfg_file.write("val:    ../valid/images\n")
    data_cfg_file.write("test:   ../test/images\n")
    data_cfg_file.write("\n")
    data_cfg_file.write("# Classes\n")
    data_cfg_file.write("nc: %s\n" %len(label_class))
    # Write the class names as a list, wrapping the line every 10 names
    data_cfg_file.write('names: [')
    for i, label in enumerate(label_class):
        if i > 0:
            data_cfg_file.write(", ")
            if (i % 10 == 0):
                data_cfg_file.write("\n        ")
        data_cfg_file.write("'" + label + "'")
    data_cfg_file.write(']  # class names')
    # Close the file
    data_cfg_file.close()

def labelme2yolo(input = '', output = ''):

    outputdir_root = check_output_directory(output)
    if outputdir_root == "":
        print("No valid output directory, Do Nothing!")
        return -1
    
    labelme_path = input
    
    # 1. Collect the full paths of all json label files in the input directory
    files = glob(labelme_path + "/*.json")
    
    # 2. Reduce each path to its base name without the .json extension
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    
    # 3. Random split: 70% of the samples go to the training set
    train_files, valid_test_files = train_test_split(files, test_size=0.3, random_state=55)
    
    # 4. Split the remaining 30% again: 70% of it for validation, 30% of it for test
    valid_files, test_files     = train_test_split(valid_test_files, test_size=0.3, random_state=55)

    # 5. Build the YOLO dataset directory paths
    train_path = outputdir_root+'/train'
    valid_path = outputdir_root+'/valid'
    test_path  = outputdir_root+'/test'
    
    # 6. Generate the YOLO training, validation, and test sets: images + labels
    convertToYolo5(train_files, train_path, labelme_path)
    convertToYolo5(valid_files, valid_path, labelme_path)
    convertToYolo5(test_files,  test_path,  labelme_path)
    
    # 7. Write the YOLO dataset config file
    create_yolo_dataset_cfg(output, obj_classes)
    
    print("Classes:", obj_classes)
    print('Finished, output path =', outputdir_root)
    
    return 0
    
def parse_opt():
    # define argparse object
    parser = argparse.ArgumentParser()
    
    # add argument for command line
    parser.add_argument('--input',      type=str, help='The input Labelme directory')
    parser.add_argument('--output',     type=str, help='The output YOLO V5 directory')
    
    # parse arges from command line
    opt = parser.parse_args()
    print("input  =", opt.input)
    print("output =", opt.output)
    
    # return opt
    return opt

def main(opt):
    labelme2yolo(**vars(opt))

if __name__ == '__main__':
    opt = parse_opt()
    main(opt)

Example invocation:

python convertLabelmeToYolov5.py --input all_imgs/ --output output/

Then place the data.yaml file in the data folder of the YOLOv5 code. From there, training proceeds just as with the COCO dataset.
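
Before training, it can be worth spot-checking a few generated label files. The sketch below is a small helper, written for this article rather than taken from the YOLOv5 code, that turns one label line (`class_id x_center y_center w h`, all normalized) back into pixel corner coordinates so it can be compared with the original labelme annotation. The function name `yolo_to_pixels` is an assumption.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert 'class_id xc yc w h' (normalized) to (class_id, (x1, y1, x2, y2)) in pixels."""
    cls, xc, yc, w, h = line.split()
    # Undo the normalization by image width and height
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Center/size -> corner coordinates
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)

# Hypothetical label line for a 640x480 image
print(yolo_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
```

All four normalized values should lie between 0 and 1; a value outside that range usually points to an annotation that extends past the image border.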

4.4 Deployment

4.4.1 Exporting to a TorchScript model

python export.py --weights yolov5s.pt --include torchscript --device 0

Key export code:

def export_torchscript(model, im, file, optimize, prefix=colorstr('TorchScript:')):
    # YOLOv5 TorchScript model export
    LOGGER.info(f'\n{prefix} starting export with torch {torch.__version__}...')
    f = file.with_suffix('.torchscript')

    ts = torch.jit.trace(model, im, strict=False)
    d = {"shape": im.shape, "stride": int(max(model.stride)), "names": model.names}
    extra_files = {'config.txt': json.dumps(d)}  # torch._C.ExtraFilesMap()
    if optimize:  # https://pytorch.org/tutorials/recipes/mobile_interpreter.html
        optimize_for_mobile(ts)._save_for_lite_interpreter(str(f), _extra_files=extra_files)
    else:
        ts.save(str(f), _extra_files=extra_files)
    return f, None
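
The export code above stores the input shape, stride, and class names in an extra file named config.txt inside the TorchScript archive. The sketch below shows one way that metadata might be read back at load time. The helper names `parse_export_config` and `load_torchscript` and the fallback stride of 32 are assumptions for this example; `torch.jit.load` with its `_extra_files` argument is the PyTorch call that fills in the requested extra files.

```python
import json

def parse_export_config(raw):
    """Parse the JSON text stored as 'config.txt' by the export step."""
    d = json.loads(raw) if raw else {}
    # Fall back to assumed defaults when the metadata is missing
    return int(d.get("stride", 32)), d.get("names", {}), d.get("shape")

def load_torchscript(path, device="cpu"):
    """Load an exported .torchscript file together with its metadata."""
    import torch  # imported lazily so parse_export_config works without PyTorch

    extra_files = {"config.txt": ""}  # value is filled in by torch.jit.load
    model = torch.jit.load(path, map_location=device, _extra_files=extra_files)
    model.eval()
    stride, names, shape = parse_export_config(extra_files["config.txt"])
    return model, stride, names, shape

# The metadata parser can be tried on its own:
print(parse_export_config('{"shape": [1, 3, 640, 640], "stride": 32, "names": {"0": "person"}}'))
```

Because the names pass through JSON, integer class ids come back as string keys and may need converting before use.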

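4.4.2 A quick try with ChatGPT

Out of curiosity, I put the following fairly open-ended question to ChatGPT, which is very popular at the moment:

4.4.3 Exporting to a TensorRT model

Installing TensorRT

First, download the TensorRT version that matches your CUDA version (https://developer.nvidia.com/nvidia-tensorrt-8x-download):

I chose TensorRT 8.0 for my setup. Extract it with the following command: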

tar -zxvf TensorRT-8.0.0.3.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz

Add the environment variable:

vim ~/.bashrc
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/software/TensorRT-8.0.0.3/lib
source ~/.bashrc

Install the Python TensorRT wheel:

cd TensorRT-8.0.0.3/python
pip install tensorrt-8.0.0.3-cp37-none-linux_x86_64.whl

Install the Python UFF wheel:

cd ../uff/
pip install uff-0.6.9-py2.py3-none-any.whl

Install the Python graphsurgeon wheel:

cd ../graphsurgeon/
pip install graphsurgeon-0.4.5-py2.py3-none-any.whl

Exporting the model

python export.py --weights yolov5s.pt --device 0 --include engine

Key export code:

def export_engine(model, im, file, half, dynamic, simplify, workspace=4, verbose=False, prefix=colorstr('TensorRT:')):
    # YOLOv5 TensorRT export https://developer.nvidia.com/tensorrt
    assert im.device.type != 'cpu', 'export running on CPU but must be on GPU, i.e. `python export.py --device 0`'
    try:
        import tensorrt as trt
    except Exception:
        if platform.system() == 'Linux':
            check_requirements('nvidia-tensorrt', cmds='-U --index-url https://pypi.ngc.nvidia.com')
        import tensorrt as trt

    if trt.__version__[0] == '7':  # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
        grid = model.model[-1].anchor_grid
        model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
        export_onnx(model, im, file, 12, dynamic, simplify)  # opset 12
        model.model[-1].anchor_grid = grid
    else:  # TensorRT >= 8
        check_version(trt.__version__, '8.0.0', hard=True)  # require tensorrt>=8.0.0
        export_onnx(model, im, file, 12, dynamic, simplify)  # opset 12
    onnx = file.with_suffix('.onnx')

    LOGGER.info(f'\n{prefix} starting export with TensorRT {trt.__version__}...')
    assert onnx.exists(), f'failed to export ONNX file: {onnx}'
    f = file.with_suffix('.engine')  # TensorRT engine file
    logger = trt.Logger(trt.Logger.INFO)
    if verbose:
        logger.min_severity = trt.Logger.Severity.VERBOSE

    builder = trt.Builder(logger)
    config = builder.create_builder_config()
    config.max_workspace_size = workspace * 1 << 30
    # config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace << 30)  # fix TRT 8.4 deprecation notice

    flag = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    network = builder.create_network(flag)
    parser = trt.OnnxParser(network, logger)
    if not parser.parse_from_file(str(onnx)):
        raise RuntimeError(f'failed to load ONNX file: {onnx}')

    inputs = [network.get_input(i) for i in range(network.num_inputs)]
    outputs = [network.get_output(i) for i in range(network.num_outputs)]
    for inp in inputs:
        LOGGER.info(f'{prefix} input "{inp.name}" with shape{inp.shape} {inp.dtype}')
    for out in outputs:
        LOGGER.info(f'{prefix} output "{out.name}" with shape{out.shape} {out.dtype}')

    if dynamic:
        if im.shape[0] <= 1:
            LOGGER.warning(f"{prefix} WARNING ⚠️ --dynamic model requires maximum --batch-size argument")
        profile = builder.create_optimization_profile()
        for inp in inputs:
            profile.set_shape(inp.name, (1, *im.shape[1:]), (max(1, im.shape[0] // 2), *im.shape[1:]), im.shape)
        config.add_optimization_profile(profile)

    LOGGER.info(f'{prefix} building FP{16 if builder.platform_has_fast_fp16 and half else 32} engine as {f}')
    if builder.platform_has_fast_fp16 and half:
        config.set_flag(trt.BuilderFlag.FP16)
    with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
        t.write(engine.serialize())
    return f, None
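
Two details of the export code above are easy to miss: how the workspace limit is converted to bytes, and which shapes go into the optimization profile when --dynamic is used. The sketch below reproduces just that arithmetic in plain Python, with no TensorRT dependency; the helper names `workspace_bytes` and `profile_shapes` are made up for this illustration.

```python
def workspace_bytes(workspace_gib):
    """Workspace limit in bytes: shifting left by 30 multiplies by 2**30 (1 GiB)."""
    return workspace_gib << 30

def profile_shapes(im_shape):
    """(min, opt, max) input shapes as passed to profile.set_shape in the export code."""
    batch, rest = im_shape[0], tuple(im_shape[1:])
    min_shape = (1, *rest)                   # smallest batch the engine accepts
    opt_shape = (max(1, batch // 2), *rest)  # batch size the engine is tuned for
    max_shape = (batch, *rest)               # largest batch, taken from --batch-size
    return min_shape, opt_shape, max_shape

print(workspace_bytes(4))
print(profile_shapes((8, 3, 640, 640)))
```

Only the batch dimension varies across the three shapes, which is why the export code warns when --dynamic is combined with a batch size of 1: the profile would then collapse to a single fixed shape.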

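5. Troubleshooting in Practice

Problem 1: error while loading shared libraries: libopencv_imgproc.so.405: cannot open shared object file

The CMake file specifies the OpenCV libraries, and the error above appeared when running the executable. The message "error while loading shared libraries" indicates a problem with a shared library.

Check the dynamic (shared) library configuration file: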

mulan@mulan-PowerEdge-R7525:~/MulanAlgo/yolov5_test$ cat /etc/ld.so.conf
include /etc/ld.so.conf.d/*.conf

Update the shared library cache:

sudo ldconfig

List the shared libraries the executable depends on:

mulan@mulan-PowerEdge-R7525:~/MulanAlgo/yolov5_test$ ldd find_defect
        linux-vdso.so.1 (0x00007ffd8df83000)
        libtorchvision.so => /home/mulan/MulanAlgo/deploy_env/torchvision/lib/libtorchvision.so (0x00007f5b680df000)
        libc10.so => /home/mulan/MulanAlgo/libtorch/lib/libc10.so (0x00007f5b67e67000)
        libtorch_cuda.so => /home/mulan/MulanAlgo/libtorch/lib/libtorch_cuda.so (0x00007f5b67c65000)
        libtorch_cuda_cpp.so => /home/mulan/MulanAlgo/libtorch/lib/libtorch_cuda_cpp.so (0x00007f5af6920000)
        libtorch_cpu.so => /home/mulan/MulanAlgo/libtorch/lib/libtorch_cpu.so (0x00007f5adf95b000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5adf924000)
        libtorch_cuda_cu.so => /home/mulan/MulanAlgo/libtorch/lib/libtorch_cuda_cu.so (0x00007f5a9bff2000)
        libtorch.so => /home/mulan/MulanAlgo/libtorch/lib/libtorch.so (0x00007f5a9bdf0000)
        libopencv_imgcodecs.so.405 => /home/mulan/MulanAlgo/deploy_env/opencv/lib/libopencv_imgcodecs.so.405 (0x00007f5a9bab9000)
        libopencv_core.so.405 => /home/mulan/MulanAlgo/deploy_env/opencv/lib/libopencv_core.so.405 (0x00007f5a9aa82000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5a9a8a0000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5a9a883000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5a9a691000)
        libcudart.so.11.0 => /usr/local/cuda-11.3/lib64/libcudart.so.11.0 (0x00007f5a9a3f8000)
        libc10_cuda.so => /home/mulan/MulanAlgo/libtorch/lib/libc10_cuda.so (0x00007f5a9a192000)
        libnvToolsExt.so.1 => /usr/local/cuda-11.3/lib64/libnvToolsExt.so.1 (0x00007f5a99f89000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5a99e3a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f5b68437000)
        libgomp-52f2fd74.so.1 => /home/mulan/MulanAlgo/libtorch/lib/libgomp-52f2fd74.so.1 (0x00007f5a99c05000)
        libcudart-a7b20f20.so.11.0 => /home/mulan/MulanAlgo/libtorch/lib/libcudart-a7b20f20.so.11.0 (0x00007f5a99968000)
        libnvToolsExt-24de1d56.so.1 => /home/mulan/MulanAlgo/libtorch/lib/libnvToolsExt-24de1d56.so.1 (0x00007f5a9975e000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5a99758000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f5a9974e000)
        libopencv_imgproc.so.405 => not found
        libjpeg.so.8 => /lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f5a996c7000)
        libpng16.so.16 => /lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f5a9968f000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f5a99673000)
        libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x00007f5a99647000)

Sure enough, libopencv_imgproc.so.405 cannot be found. Locate it with the locate command:

# install locate
$ sudo apt install mlocate
# find the missing library
$ locate libopencv_imgproc.so.405
/home/opencv/lib/libopencv_imgproc.so.405

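The file exists, so its directory just needs to be added to the search path. Go to the dynamic library configuration directory: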

$ cd /etc/ld.so.conf.d

Create a new opencv.conf file and add the relevant path to it (here, /home/opencv/lib):

sudo vim opencv.conf

After saving the file, update the shared library links:

sudo ldconfig

Alternatively, method 2: add the target library directory directly to the shared library configuration file, then update:

$ sudo vim /etc/ld.so.conf
include /etc/ld.so.conf.d/*.conf
/home/opencv/lib
$ sudo ldconfig
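
With a long ldd listing, the one "not found" line is easy to overlook. The following sketch filters it out automatically. It is a convenience script written for this article, with hypothetical helper names `missing_libraries` and `check_binary`; it assumes a Linux system where the ldd command is available and Python 3.7 or later.

```python
import subprocess

def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Typical line: "libfoo.so.1 => /path/libfoo.so.1 (0x...)" or "libfoo.so.1 => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing

def check_binary(path):
    """Run ldd on an executable and list its unresolved shared libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return missing_libraries(result.stdout)

sample = """\
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5a9a691000)
        libopencv_imgproc.so.405 => not found
        /lib64/ld-linux-x86-64.so.2 (0x00007f5b68437000)
"""
print(missing_libraries(sample))
```

Running `check_binary` on the executable again after `sudo ldconfig` should return an empty list once the library directory has been registered.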


Posted 2023-01-10 17:27 by 小金乌会发光－Z&M
