Pytorch_YOLO-V8

Version history

2016: Joseph Redmon published his best-known personal project, "You Only Look Once: Unified, Real-Time Object Detection" (https://pjreddie.com/)
2017: Joseph Redmon and his advisor co-authored "YOLO9000: Better, Faster, Stronger", i.e. YOLOv2; YOLOv2 can detect 9000 different object categories
      All YOLO object-detection models up to YOLOv3 were written in C on the Darknet framework
      Darknet is a lightweight open-source deep learning framework written entirely in C and CUDA; its main selling points are easy installation and no dependencies
2018: Joseph Redmon released YOLOv3, still on the Darknet framework
    Ultralytics published the first PyTorch implementation of YOLO (a reproduction of YOLOv3 in PyTorch)
 February 2020: Joseph Redmon suddenly announced on Twitter that,
     for ethical reasons, he was stopping all computer-vision research
 April 2020: Alexey Bochkovskiy, another long-time maintainer of the YOLO project, submitted YOLOv4 on arXiv
    Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao: the same team as YOLOv4, so it can be regarded as an official YOLO release
    source code - Pytorch (use to reproduce results): https://github.com/WongKinYiu/ScaledYOLOv4
    source code - Darknet: https://github.com/AlexeyAB/darknet
    https://github.com/AlexeyAB/Yolo_mark
 June 9, 2020: Ultralytics released (open-sourced) YOLOv5
      YOLOv5, as released by Ultralytics, is essentially an implementation of YOLOv4 (with performance improvements), only rewritten entirely in PyTorch; it also supports export to ONNX, CoreML, etc.
      The name "YOLOv5" was self-proclaimed by the Ultralytics team
 2022/09
    YOLOv6 / MT-YOLOv6
    Meituan, China.
    “YOLOv6: A Single-Stage Object Detection Framework for Industrial
    Applications”, https://arxiv.org/pdf/2209.02976.pdf
 July 10, 2022: YOLOv7 officially open-sourced, endorsed by Alexey Bochkovskiy, surpassing all prior YOLOs in both accuracy and speed. "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors", 2022/07,
	 https://arxiv.org/pdf/2207.02696.pdf	
     https://github.com/WongKinYiu/yolov7 
   January 2023: Ultralytics released YOLOv8
   https://github.com/ultralytics
       https://github.com/ultralytics/yolov3
	   https://github.com/ultralytics/yolov5
	   https://github.com/ultralytics/ultralytics
	   https://ultralytics.com/yolov8

Chinese YOLO variants

2020: Baidu released PP-YOLO
2021: Megvii released YOLOX
2022: Baidu followed up with PP-YOLOE and PP-YOLOE+
Meituan, OpenMMLab, Alibaba DAMO Academy and others have since released YOLO variants of their own

yolov8

 The ultralytics team wants to design and build this project as an integrated training/inference framework covering classification, detection, segmentation and other vision tasks, not just YOLO
  Five parts in total: model architecture, loss computation, training data augmentation, training strategy, and model inference
  Model architecture
       backbone, neck, head
  Loss design
      Loss computation has 2 parts: the positive/negative sample assignment strategy and the loss calculation itself
  
 Upgrades
   Supports all three tasks (object detection, instance segmentation, image classification) in one framework, plus a user-friendly API (command line + Python). The YOLOv8 developers moved away from the standard YOLO project design, which split functionality across the four separate scripts train.py, detect.py, val.py and export.py

Datasets

 DOTA dataset, full name: Dataset for Object deTection in Aerial images
   https://captain-whu.github.io/DOTA/dataset.html	 

Microsoft COCO: Common Objects in Context
   https://cocodataset.org/#home
   https://cocodataset.org/#download
   
state-of-the-art (SOTA) model: one that applies the most advanced current techniques
YOLO label format and directory layout
  The official YOLO txt format:
    Each line describes one object's class and position: the first column is the object's class,
     and the remaining four columns are its position, x, y, w, h. Each image has its own txt file;
     a YOLO annotation is saved as a .txt file with 5 space-separated values per line.
     One file can contain objects of several classes. x, y, w, h are relative sizes, i.e. fractions of the original image:
        all four values are normalized by dividing by the original image dimensions, so each lies in 0-1
  The label folder must be named labels, not lables (watch the el vs le)
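The normalization just described (divide pixel coordinates by the image dimensions) can be sketched with a small helper; the function name and the example numbers are illustrative only:

```python
def to_yolo_line(cls_id, x_left, y_top, box_w, box_h, img_w, img_h):
    """Convert a pixel-space box (top-left corner + size) into one
    normalized YOLO label line: 'class x_center y_center width height'."""
    x_center = (x_left + box_w / 2) / img_w
    y_center = (y_top + box_h / 2) / img_h
    return f"{cls_id} {x_center:.6f} {y_center:.6f} {box_w / img_w:.6f} {box_h / img_h:.6f}"

# A 400x300 box with top-left corner at (200, 150) in a 1600x1200 image:
print(to_yolo_line(0, 200, 150, 400, 300, 1600, 1200))
# -> 0 0.250000 0.250000 0.250000 0.250000
```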

Dataset conversion

Building a custom dataset
 1.Create Dataset
 2.Create dataset.yaml
 3.Create Labels
 4. Organize Directories
 https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
 How to Train YOLOv8 Object Detection on a Custom Dataset
 https://blog.roboflow.com/how-to-train-yolov8-on-a-custom-dataset/

Converted dataset
 YOLO_dir
 └─ mydata
        ├─ images
        │    ├─ test   # test-set images
        │    ├─ train  # training-set images
        │    └─ val    # validation-set images
        └─ labels
               ├─ test   # test-set labels
               ├─ train  # training-set labels
               └─ val    # validation-set labels
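The tree above can be created with a short script; `make_yolo_dirs` and the `YOLO_dir/mydata` root are just this document's example names:

```python
import os

def make_yolo_dirs(root):
    """Create the images/ and labels/ subtrees for the train/val/test splits."""
    for kind in ("images", "labels"):
        for split in ("train", "val", "test"):
            os.makedirs(os.path.join(root, kind, split), exist_ok=True)

make_yolo_dirs("YOLO_dir/mydata")
```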
The originally annotated dataset, and caveats
  Converting yolo-format data to coco format

   Image names and label-file (.txt) names correspond one-to-one, and each label file stores the class and coordinates of every object in the corresponding image
     The yolo box format is (x_center, y_center, w, h);
   the coco bbox format is (x_left, y_top, w, h). They are not the same
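The difference can be made concrete with a small conversion helper (a sketch; the image size is needed because YOLO boxes are normalized):

```python
def yolo_to_coco(x_center, y_center, w, h, img_w, img_h):
    """YOLO (normalized x_center, y_center, w, h) -> COCO pixel (x_left, y_top, w, h)."""
    box_w = w * img_w
    box_h = h * img_h
    x_left = x_center * img_w - box_w / 2
    y_top = y_center * img_h - box_h / 2
    return [x_left, y_top, box_w, box_h]

print(yolo_to_coco(0.5, 0.5, 0.25, 0.25, 1600, 1200))
# -> [600.0, 450.0, 400.0, 300.0]
```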

01. Goal

 Read the custom dataset's json-format annotations and output labels that yolo can train on
the images directory holds all the images (png and jpg are currently supported)
the labels directory holds all the labels (txt files sharing the images' base names)

02. Custom dataset layout (YOLO format)

Keep training and validation sets separate:
../datasets/
    company/images
            train/im0.jpg
            train/im1.jpg
            val/im2.jpg
    company/labels
            train/im0.txt
            train/im1.txt
            val/im2.txt

Notes on labels:
 One row per object
 Each row is class x_center y_center width height format.
 Box coordinates must be in normalized xywh format (from 0 - 1). 
     If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
 Class numbers are zero-indexed (start from 0).	
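The rules above can be checked mechanically; `parse_label_line` below is a hypothetical helper enforcing exactly those constraints (5 fields, zero-indexed class, normalized 0-1 coordinates):

```python
def parse_label_line(line, num_classes):
    """Parse one 'class x_center y_center width height' label row and
    sanity-check it against the YOLO label rules."""
    parts = line.split()
    assert len(parts) == 5, "each row must have exactly 5 fields"
    cls_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    assert 0 <= cls_id < num_classes, "class numbers are zero-indexed"
    assert all(0.0 <= v <= 1.0 for v in coords), "coordinates must be normalized to 0-1"
    return cls_id, coords

print(parse_label_line("0 0.25 0.25 0.25 0.25", num_classes=3))
# -> (0, [0.25, 0.25, 0.25, 0.25])
```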

03.dataset.yaml

 # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, 
 #  or 3) list: [path/to/imgs1, path/to/imgs2, ..]
 path: ../datasets/company  # dataset root dir
 train: images/train  # train images (relative to 'path')
 val: images/val  # val images (relative to 'path')
 test:  # test images (optional)
 
 # Classes
 names:
   0: person
   1: vehicle
   2: car

04. Code example

# -*- coding: utf-8 -*-
import glob
import json
import os
import shutil
import numpy as np

def get_label_js(json_file_path):
    """Scan a JSON-lines annotation file and collect the label keys it uses."""
    stat_set = set()
    # Metadata keys to exclude; these key names ('ori', 'beidth', 'height')
    # are taken from this particular annotation file as-is
    extra_set = set(['ori', 'beidth', 'height'])
    with open(json_file_path, mode="r", encoding="utf8") as label_r:
        for num, data in enumerate(label_r):
            json_data = json.loads(data)
            for key in json_data.keys():
                stat_set.add(key)
    label_nm = stat_set - extra_set
    return label_nm

def set_label_index(json_file_path):
    """Assign an integer class id to each label found in the annotation file."""
    label_set = get_label_js(json_file_path)
    file_gt_label_index = dict()
    for i, label in enumerate(label_set):
        file_gt_label_index[label] = i
    print(file_gt_label_index)
    return file_gt_label_index

## Class-id mapping: prefer the annotation spec if one exists; fall back to
## set_label_index() only when there is no spec
gt_label_index = {
        'person': 0,
        'vehicle': 1,
        }
## Invert the mapping (safe because there are no duplicate keys)
get_mark_index_label = {value: key for key, value in gt_label_index.items()}


# COCO bbox format: [x1, y1, w, h] (top-left corner plus width/height)
def get_box(points):
    """Axis-aligned bounding box of a list of (x, y) points, in COCO format."""
    min_x = min_y = np.inf
    max_x = max_y = 0
    for x, y in points:
        min_x = min(min_x, x)
        min_y = min(min_y, y)
        max_x = max(max_x, x)
        max_y = max(max_y, y)
    return [min_x, min_y, max_x - min_x, max_y - min_y]

# Convert [min_x, min_y, max_x, max_y] to YOLO format [x_center, y_center, width, height]
def get_yolo_points(points, img_w=1600, img_h=1200):
    min_x, min_y, max_x, max_y = points
    ### normalized coordinates
    x_center = (min_x + max_x) / (2.0 * img_w)
    y_center = (min_y + max_y) / (2.0 * img_h)
    width = (max_x - min_x) / (1.0 * img_w)
    height = (max_y - min_y) / (1.0 * img_h)
    return [x_center, y_center, width, height]

def read_label_js(json_file_path, img_sub):
    """Copy annotated images into images/train and write one YOLO txt label
    file per image into labels/train."""
    img_target_dir = os.path.join(img_sub, "images", "train")
    if not os.path.exists(img_target_dir):
        os.makedirs(img_target_dir)
    label_target_dir = os.path.join(img_sub, "labels", "train")
    if not os.path.exists(label_target_dir):
        os.makedirs(label_target_dir)
    img_file = sorted(glob.glob(img_sub + "/*.jpeg", recursive=True))
    img_file_nm = [os.path.split(img)[-1] for img in img_file]
    with open(json_file_path, mode="r", encoding="utf8") as label_r:
        for num, data in enumerate(label_r):
            json_data = json.loads(data)
            if json_data["my_key"] is not None:
                img_nm = os.path.split(json_data["my_key"])[-1]
                if img_nm in img_file_nm:
                    print(img_nm)
                    shutil.copyfile(os.path.join(img_sub, img_nm), os.path.join(img_target_dir, img_nm))
                    label_nm = os.path.splitext(img_nm)[0] + ".txt"
                    with open(os.path.join(label_target_dir, label_nm), mode="w", encoding="utf8") as label_w:
                        for label_key, label_value in json_data.items():
                            if label_key in gt_label_index:
                                for location in json_data[label_key]:
                                    # Each row is: class x_center y_center width height
                                    # Box coordinates must be in normalized xywh format (from 0 - 1)
                                    label_location_data = get_yolo_points(location["data"])
                                    print(label_key, gt_label_index[label_key], label_location_data)
                                    line = str(gt_label_index[label_key]) + " " + " ".join(map(str, label_location_data))
                                    label_w.write(line + "\n")

if __name__ == '__main__':
    label_json = r"D:\train.json"
    img_dir = r"D:\train"
    read_label_js(label_json, img_dir)
    # get_label_js(label_json)

2. Dataset configuration: describing the training data

The data.yaml file records the training dataset's directory, the number of classes, and the class names
Location: ultralytics/ultralytics/yolo/data/datasets/coco.yaml
 # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
 path: ../datasets/coco  # dataset root dir
 train: train2017.txt  # train images (relative to 'path') 118287 images
 val: val2017.txt  # val images (relative to 'path') 5000 images
 test: test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794
 
 # Classes
 names:

3. Model parameters

  Model parameter file: yolov8s.yaml
Configuration example: ultralytics/ultralytics/models/v8/cls/yolov8l-cls.yaml
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 1000  # number of classes
depth_multiple: 1.00  # scales module repeats
width_multiple: 1.00  # scales convolution channels

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
# YOLOv8.0n head
head:
  - [-1, 1, Classify, [nc]]  

4. Pre-training parameter configuration

 A big change in YOLO_V8: all parameters are consolidated into one file for central configuration (ultralytics/ultralytics/yolo/cfg/default.yaml)
 # Ultralytics YOLO 🚀, GPL-3.0 license
 # Default training settings and hyperparameters for medium-augmentation COCO training
 
 task: detect  # inference task, i.e. detect, segment, classify
 mode: train  # YOLO mode, i.e. train, val, predict, export
 
 # Train settings -------------------------------------------------------------------------------------------------------

5. Start training

  Run YOLOv8\ultralytics\yolo\v8\detect\train.py to start training

6. Validate the model on the test set

 Validation

7. Train/test iteration: export and upload the weights

 The iteration process

8. Prediction

 Inference

9. Structuring the output

 Tidy data
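As an illustration of tidy output, the sketch below flattens detections, assumed here to already be (class_name, confidence, [x_center, y_center, w, h]) tuples, into one-row-per-object records; all names and values are made up:

```python
import json

def detections_to_records(image_name, detections):
    """Flatten raw detection tuples into tidy, one-row-per-object dicts."""
    return [
        {"image": image_name, "class": cls, "confidence": round(conf, 4),
         "x_center": box[0], "y_center": box[1], "width": box[2], "height": box[3]}
        for cls, conf, box in detections
    ]

records = detections_to_records("im0.jpg", [("person", 0.91, [0.5, 0.5, 0.2, 0.4])])
print(json.dumps(records, indent=2))
```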

https://docs.ultralytics.com/	  A Brief History of YOLO
yolo coordinate normalization    https://blog.csdn.net/qq_38428735/article/details/121541641
posted @ 辰令