第三十四节,目标检测之谷歌Object Detection API源码解析
目录
我们在第三十二节,使用谷歌Object Detection API进行目标检测、训练新的模型(使用VOC 2012数据集)那一节我们介绍了如何使用谷歌Object Detection API进行目标检测,以及如何使用谷歌提供的目标检测模型训练自己的数据。在训练自己的数据集时,主要包括以下几步:
- 制作自己的数据集,注意这里数据集在进行标注时,需要按照一定的格式。然后调object_detection\dataset_tools下对应的脚本生成tfrecord文件。如下图,如果我们想调用create_pascal_tf_record.py文件生成tfrecord文件,那么我们的数据集要和voc 2012数据集的标注方式一样。你也可以通过解读create_pascal_tf_record.py文件了解我们的数据集的标注方式。
- 下载我们所要使用的目标检测模型,进行预训练,不然从头开始训练时间成本会很高。
- 在object_detection/samples/configs文件夹下有一些配置文件,选择与我们所要使用的目标检测模型相对应的配置文件,并进行一些修改。
- 使用object_detection/train.py文件进行训练。
- 使用export_inference_graph.py脚本导出训练好的模型,并进行目标检测。
在这里我主要解析一下train.py文件的工作流程。
一 train.py文件解析
先附上源码:

# Copyright 2017 The TensorFlow Authors. All Rights Reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ============================================================================== r"""Training executable for detection models. This executable is used to train DetectionModels. There are two ways of configuring the training job: 1) A single pipeline_pb2.TrainEvalPipelineConfig configuration file can be specified by --pipeline_config_path. Example usage: ./train \ --logtostderr \ --train_dir=path/to/train_dir \ --pipeline_config_path=pipeline_config.pbtxt 2) Three configuration files can be provided: a model_pb2.DetectionModel configuration file to define what type of DetectionModel is being trained, an input_reader_pb2.InputReader file to specify what training data will be used and a train_pb2.TrainConfig file to configure training parameters. Example usage: ./train \ --logtostderr \ --train_dir=path/to/train_dir \ --model_config_path=model_config.pbtxt \ --train_config_path=train_config.pbtxt \ --input_config_path=train_input_config.pbtxt """ import functools import json import os import tensorflow as tf from object_detection import trainer from object_detection.builders import dataset_builder from object_detection.builders import graph_rewriter_builder from object_detection.builders import model_builder from object_detection.utils import config_util from object_detection.utils import dataset_util tf.logging.set_verbosity(tf.logging.INFO) flags = tf.app.flags flags.DEFINE_string('master', '', 'Name of the TensorFlow master to use.') flags.DEFINE_integer('task', 0, 'task id') flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy per worker.') flags.DEFINE_boolean('clone_on_cpu', False, 'Force clones to be deployed on CPU. Note that even if ' 'set to False (allowing ops to run on gpu), some ops may ' 'still be run on the CPU if they have no GPU kernel.') flags.DEFINE_integer('worker_replicas', 1, 'Number of worker+trainer ' 'replicas.') flags.DEFINE_integer('ps_tasks', 0, 'Number of parameter server tasks. If None, does not use ' 'a parameter server.') flags.DEFINE_string('train_dir', '', 'Directory to save the checkpoints and training summaries.') flags.DEFINE_string('pipeline_config_path', '', 'Path to a pipeline_pb2.TrainEvalPipelineConfig config ' 'file. If provided, other configs are ignored') flags.DEFINE_string('train_config_path', '', 'Path to a train_pb2.TrainConfig config file.') flags.DEFINE_string('input_config_path', '', 'Path to an input_reader_pb2.InputReader config file.') flags.DEFINE_string('model_config_path', '', 'Path to a model_pb2.DetectionModel config file.') FLAGS = flags.FLAGS def main(_): assert FLAGS.train_dir, '`train_dir` is missing.' if FLAGS.task == 0: tf.gfile.MakeDirs(FLAGS.train_dir) if FLAGS.pipeline_config_path: configs = config_util.get_configs_from_pipeline_file( FLAGS.pipeline_config_path) if FLAGS.task == 0: tf.gfile.Copy(FLAGS.pipeline_config_path, os.path.join(FLAGS.train_dir, 'pipeline.config'), overwrite=True) else: configs = config_util.get_configs_from_multiple_files( model_config_path=FLAGS.model_config_path, train_config_path=FLAGS.train_config_path, train_input_config_path=FLAGS.input_config_path) if FLAGS.task == 0: for name, config in [('model.config', FLAGS.model_config_path), ('train.config', FLAGS.train_config_path), ('input.config', FLAGS.input_config_path)]: tf.gfile.Copy(config, os.path.join(FLAGS.train_dir, name), overwrite=True) model_config = configs['model'] train_config = configs['train_config'] input_config = configs['train_input_config'] model_fn = functools.partial( model_builder.build, model_config=model_config, is_training=True) def get_next(config): return dataset_util.make_initializable_iterator( dataset_builder.build(config)).get_next() create_input_dict_fn = functools.partial(get_next, input_config) env = json.loads(os.environ.get('TF_CONFIG', '{}')) cluster_data = env.get('cluster', None) cluster = tf.train.ClusterSpec(cluster_data) if cluster_data else None task_data = env.get('task', None) or {'type': 'master', 'index': 0} task_info = type('TaskSpec', (object,), task_data) # Parameters for a single worker. ps_tasks = 0 worker_replicas = 1 worker_job_name = 'lonely_worker' task = 0 is_chief = True master = '' if cluster_data and 'worker' in cluster_data: # Number of total worker replicas include "worker"s and the "master". worker_replicas = len(cluster_data['worker']) + 1 if cluster_data and 'ps' in cluster_data: ps_tasks = len(cluster_data['ps']) if worker_replicas > 1 and ps_tasks < 1: raise ValueError('At least 1 ps task is needed for distributed training.') if worker_replicas >= 1 and ps_tasks > 0: # Set up distributed training. server = tf.train.Server(tf.train.ClusterSpec(cluster), protocol='grpc', job_name=task_info.type, task_index=task_info.index) if task_info.type == 'ps': server.join() return worker_job_name = '%s/task:%d' % (task_info.type, task_info.index) task = task_info.index is_chief = (task_info.type == 'master') master = server.target graph_rewriter_fn = None if 'graph_rewriter_config' in configs: graph_rewriter_fn = graph_rewriter_builder.build( configs['graph_rewriter_config'], is_training=True) trainer.train( create_input_dict_fn, model_fn, train_config, master, task, FLAGS.num_clones, worker_replicas, FLAGS.clone_on_cpu, ps_tasks, worker_job_name, is_chief, FLAGS.train_dir, graph_hook_fn=graph_rewriter_fn) if __name__ == '__main__': tf.app.run()
1、先定义了tf.app.flags,用于支持接受命令行传递参数,相当于接受argv。
flags = tf.app.flags flags.DEFINE_string('master', '', 'Name of the TensorFlow master to use.') flags.DEFINE_integer('task', 0, 'task id') flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy per worker.') flags.DEFINE_boolean('clone_on_cpu', False, 'Force clones to be deployed on CPU. Note that even if ' 'set to False (allowing ops to run on gpu), some ops may ' 'still be run on the CPU if they have no GPU kernel.') flags.DEFINE_integer('worker_replicas', 1, 'Number of worker+trainer ' 'replicas.') flags.DEFINE_integer('ps_tasks', 0, 'Number of parameter server tasks. If None, does not use ' 'a parameter server.') flags.DEFINE_string('train_dir', '', 'Directory to save the checkpoints and training summaries.') flags.DEFINE_string('pipeline_config_path', '', 'Path to a pipeline_pb2.TrainEvalPipelineConfig config ' 'file. If provided, other configs are ignored') flags.DEFINE_string('train_config_path', '', 'Path to a train_pb2.TrainConfig config file.') flags.DEFINE_string('input_config_path', '', 'Path to an input_reader_pb2.InputReader config file.') flags.DEFINE_string('model_config_path', '', 'Path to a model_pb2.DetectionModel config file.') FLAGS = flags.FLAGS
这里面有几个比较重要的参数,train_dir目录用于保存训练的模型和日志文件,pipeline_config_path用于指定pipeline_pb2.TrainEvalPipelineConfig配置文件的全路径(如果不指定指定这个参数,需要指定train_config_path,input_config_path,model_config_path配置文件,其实这三个文件就是把pipeline_pb2.TrainEvalPipelineConfig配置文件分成了三部分)。
2、再来看一下main函数,我们把它分成几部分来解读。
假设我们在控制台下的命令如下:
python train.py --train_dir voc/train_dir/ --pipeline_config_path voc/faster_rcnn_inception_resnet_v2_atrous_voc.config
3、第一部分
assert FLAGS.train_dir, '`train_dir` is missing.' if FLAGS.task == 0: tf.gfile.MakeDirs(FLAGS.train_dir) if FLAGS.pipeline_config_path: configs = config_util.get_configs_from_pipeline_file( FLAGS.pipeline_config_path) if FLAGS.task == 0: tf.gfile.Copy(FLAGS.pipeline_config_path, os.path.join(FLAGS.train_dir, 'pipeline.config'), overwrite=True) else: configs = config_util.get_configs_from_multiple_files( model_config_path=FLAGS.model_config_path, train_config_path=FLAGS.train_config_path, train_input_config_path=FLAGS.input_config_path) if FLAGS.task == 0: for name, config in [('model.config', FLAGS.model_config_path), ('train.config', FLAGS.train_config_path), ('input.config', FLAGS.input_config_path)]: tf.gfile.Copy(config, os.path.join(FLAGS.train_dir, name), overwrite=True)
因为我们传入了train_dir,pipeline_config_path参数,程序执行时会:
- 读取pipeline_config_path配置文件,返回一个dict,保存配置文件中`model`, `train_config`, `train_input_config`, `eval_config`, `eval_input_config`信息。
- 把pipeline_config_path配置文件复制到train_dir目录下,命名为pipeline.config
4、第二部分
model_config = configs['model'] train_config = configs['train_config'] input_config = configs['train_input_config'] model_fn = functools.partial( model_builder.build, model_config=model_config, is_training=True) def get_next(config): return dataset_util.make_initializable_iterator( dataset_builder.build(config)).get_next() create_input_dict_fn = functools.partial(get_next, input_config)
- 变量model_config,train_config,input_config初始化
- model_builder.build函数,指定两个固定参数model_config,is_training并返回一个新的函数model_fn 。这个函数很重要,包括对目标检测模型的实现,后面会详细介绍。
- get_next函数,指定固定参数input_config。这个函数主要实现了tfrecord数据的读取,我们也放在后面介绍。
5、第三部分
env = json.loads(os.environ.get('TF_CONFIG', '{}')) cluster_data = env.get('cluster', None) cluster = tf.train.ClusterSpec(cluster_data) if cluster_data else None task_data = env.get('task', None) or {'type': 'master', 'index': 0} task_info = type('TaskSpec', (object,), task_data) # Parameters for a single worker. ps_tasks = 0 worker_replicas = 1 worker_job_name = 'lonely_worker' task = 0 is_chief = True master = '' if cluster_data and 'worker' in cluster_data: # Number of total worker replicas include "worker"s and the "master". worker_replicas = len(cluster_data['worker']) + 1 if cluster_data and 'ps' in cluster_data: ps_tasks = len(cluster_data['ps']) if worker_replicas > 1 and ps_tasks < 1: raise ValueError('At least 1 ps task is needed for distributed training.') if worker_replicas >= 1 and ps_tasks > 0: # Set up distributed training. server = tf.train.Server(tf.train.ClusterSpec(cluster), protocol='grpc', job_name=task_info.type, task_index=task_info.index) if task_info.type == 'ps': server.join() return worker_job_name = '%s/task:%d' % (task_info.type, task_info.index) task = task_info.index is_chief = (task_info.type == 'master') master = server.target
- 这部分代码主要是用来实现分布式部署训练的。如果想了解的请点击这里第八节,配置分布式TensorFlow。
6、第四部分
graph_rewriter_fn = None if 'graph_rewriter_config' in configs: graph_rewriter_fn = graph_rewriter_builder.build( configs['graph_rewriter_config'], is_training=True) trainer.train( create_input_dict_fn, model_fn, train_config, master, task, FLAGS.num_clones, worker_replicas, FLAGS.clone_on_cpu, ps_tasks, worker_job_name, is_chief, FLAGS.train_dir, graph_hook_fn=graph_rewriter_fn)
- 由于没有定义graph_rewriter_config,所以会直接执行trainer.train,开始读取数据,进行预处理后训练。
二 dataset_builder.build函数
先附上代码:
def build(input_reader_config, transform_input_data_fn=None,
batch_size=None, max_num_boxes=None, num_classes=None,
spatial_image_shape=None):
"""Builds a tf.data.Dataset.
Builds a tf.data.Dataset by applying the `transform_input_data_fn` on all
records. Applies a padded batch to the resulting dataset.
Args:
input_reader_config: A input_reader_pb2.InputReader object.
transform_input_data_fn: Function to apply to all records, or None if
no extra decoding is required.
batch_size: Batch size. If None, batching is not performed.
max_num_boxes: Max number of groundtruth boxes needed to compute shapes for
padding. If None, will use a dynamic shape.
num_classes: Number of classes in the dataset needed to compute shapes for
padding. If None, will use a dynamic shape.
spatial_image_shape: A list of two integers of the form [height, width]
containing expected spatial shape of the image after applying
transform_input_data_fn. If None, will use dynamic shapes.
Returns:
A tf.data.Dataset based on the input_reader_config.
Raises:
ValueError: On invalid input reader proto.
ValueError: If no input paths are specified.
"""
if not isinstance(input_reader_config, input_reader_pb2.InputReader):
raise ValueError('input_reader_config not of type '
'input_reader_pb2.InputReader.')
if input_reader_config.WhichOneof('input_reader') == 'tf_record_input_reader':
config = input_reader_config.tf_record_input_reader
if not config.input_path:
raise ValueError('At least one input path must be specified in '
'`input_reader_config`.')
label_map_proto_file = None
if input_reader_config.HasField('label_map_path'):
label_map_proto_file = input_reader_config.label_map_path
#初始化需要解码的字段,以及解码对应字段的 handler
decoder = tf_example_decoder.TfExampleDecoder(
load_instance_masks=input_reader_config.load_instance_masks,
instance_mask_type=input_reader_config.mask_type,
label_map_proto_file=label_map_proto_file)
def process_fn(value):
processed = decoder.decode(value)
if transform_input_data_fn is not None:
return transform_input_data_fn(processed)
return processed
# 调用 tf.data.TFRecordDataset 从 config.input_path 读数据,调用 process_fn 对读取的数据解码数,预提取 input_reader_config.prefetch_size 条数据
dataset = dataset_util.read_dataset(
functools.partial(tf.data.TFRecordDataset, buffer_size=8 * 1000 * 1000),
process_fn, config.input_path[:], input_reader_config)
if batch_size:
padding_shapes = _get_padding_shapes(dataset, max_num_boxes, num_classes,
spatial_image_shape)
dataset = dataset.apply(
tf.contrib.data.padded_batch_and_drop_remainder(batch_size,
padding_shapes))
return dataset
raise ValueError('Unsupported input_reader_config.')
整个流程
- 获取训练集tfrecord文件路径,label_map_path文件路径,input_reader_config设置参数如下:
train_input_reader: { tf_record_input_reader { input_path: "voc/pascal_train.record" } label_map_path: "voc/pascal_label_map.pbtxt" }
- 初始化需要解码的字段,以及解码对应字段的 handler
- 调用 tf.data.TFRecordDataset 从 config.input_path 读数据,调用 process_fn (定义了数据的解码格式)对读取的数据解码,预提取 input_reader_config.prefetch_size 条数据
- 对数据集应用 tf.contrib.data.padded_batch_and_drop_remainder,如果不够一个 batch_size 就丢弃该部分数据
- 返回一个迭代器
三 model_builder.build函数
代码如下:
亲爱的读者和支持者们,自动博客加入了打赏功能,陆陆续续收到了各位老铁的打赏。在此,我想由衷地感谢每一位对我们博客的支持和打赏。你们的慷慨与支持,是我们前行的动力与源泉。
日期 | 姓名 | 金额 |
---|---|---|
2023-09-06 | *源 | 19 |
2023-09-11 | *朝科 | 88 |
2023-09-21 | *号 | 5 |
2023-09-16 | *真 | 60 |
2023-10-26 | *通 | 9.9 |
2023-11-04 | *慎 | 0.66 |
2023-11-24 | *恩 | 0.01 |
2023-12-30 | I*B | 1 |
2024-01-28 | *兴 | 20 |
2024-02-01 | QYing | 20 |
2024-02-11 | *督 | 6 |
2024-02-18 | 一*x | 1 |
2024-02-20 | c*l | 18.88 |
2024-01-01 | *I | 5 |
2024-04-08 | *程 | 150 |
2024-04-18 | *超 | 20 |
2024-04-26 | .*V | 30 |
2024-05-08 | D*W | 5 |
2024-05-29 | *辉 | 20 |
2024-05-30 | *雄 | 10 |
2024-06-08 | *: | 10 |
2024-06-23 | 小狮子 | 666 |
2024-06-28 | *s | 6.66 |
2024-06-29 | *炼 | 1 |
2024-06-30 | *! | 1 |
2024-07-08 | *方 | 20 |
2024-07-18 | A*1 | 6.66 |
2024-07-31 | *北 | 12 |
2024-08-13 | *基 | 1 |
2024-08-23 | n*s | 2 |
2024-09-02 | *源 | 50 |
2024-09-04 | *J | 2 |
2024-09-06 | *强 | 8.8 |
2024-09-09 | *波 | 1 |
2024-09-10 | *口 | 1 |
2024-09-10 | *波 | 1 |
2024-09-12 | *波 | 10 |
2024-09-18 | *明 | 1.68 |
2024-09-26 | B*h | 10 |
2024-09-30 | 岁 | 10 |
2024-10-02 | M*i | 1 |
2024-10-14 | *朋 | 10 |
2024-10-22 | *海 | 10 |
2024-10-23 | *南 | 10 |
2024-10-26 | *节 | 6.66 |
2024-10-27 | *o | 5 |
2024-10-28 | W*F | 6.66 |
2024-10-29 | R*n | 6.66 |
2024-11-02 | *球 | 6 |
2024-11-021 | *鑫 | 6.66 |
2024-11-25 | *沙 | 5 |
2024-11-29 | C*n | 2.88 |

分类:
deep learning
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 记一次.NET内存居高不下排查解决与启示
· 探究高空视频全景AR技术的实现原理
· 理解Rust引用及其生命周期标识(上)
· 浏览器原生「磁吸」效果!Anchor Positioning 锚点定位神器解析
· 没有源码,如何修改代码逻辑?
· 全程不用写代码,我用AI程序员写了一个飞机大战
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· DeepSeek 开源周回顾「GitHub 热点速览」
· 记一次.NET内存居高不下排查解决与启示
· 白话解读 Dapr 1.15:你的「微服务管家」又秀新绝活了