[GCP] Object Detection API - GCP

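Train a custom model on GCP, and learn how to use GCP's AI services along the way.

These are personal notes, so they are of limited use as a reference for other readers.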

Local steps

1. Choose an experiment name

The script reads a template, fills in its variables, and generates the training config file: experiment/mobilarTest/pipeline.config.

$ python ./scripts/gen-config.py --exp-name mobilarTest

2020-02-22 10:51:50,679-line:48-INFO-check_options(): train_dir            = gs://tfobd_2020_bucket/mobilarTest_train
2020-02-22 10:51:50,679-line:55-INFO-check_options(): data_dir             = gs://tfobd_2020_bucket/mobilarTest_data
2020-02-22 10:51:50,679-line:62-INFO-check_options(): checkpoint_file      = gs://tfobd_2020_bucket/mobilarTest_data/model.ckpt
2020-02-22 10:51:50,679-line:70-INFO-check_options(): train_input_path     = gs://tfobd_2020_bucket/mobilarTest_data/train.record
2020-02-22 10:51:50,679-line:77-INFO-check_options(): train_label_map_path = gs://tfobd_2020_bucket/mobilarTest_data/object-detection.pbtxt
2020-02-22 10:51:50,679-line:84-INFO-check_options(): test_input_path      = gs://tfobd_2020_bucket/mobilarTest_data/test.record
2020-02-22 10:51:50,679-line:90-INFO-check_options(): test_label_map_path  = gs://tfobd_2020_bucket/mobilarTest_data/object-detection.pbtxt
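The paths in the log above follow a simple pattern derived from the experiment name. A minimal sketch of how such a generator could work, assuming a template with $-style placeholders. The function names and the template fragment are hypothetical and are not the actual contents of gen-config.py:

```python
from string import Template

# Assumed bucket name, taken from the log output above.
BUCKET = "gs://tfobd_2020_bucket"


def build_options(exp_name, bucket=BUCKET):
    """Derive the GCS paths for one experiment from its name."""
    train_dir = f"{bucket}/{exp_name}_train"
    data_dir = f"{bucket}/{exp_name}_data"
    label_map = f"{data_dir}/object-detection.pbtxt"
    return {
        "train_dir": train_dir,
        "data_dir": data_dir,
        "checkpoint_file": f"{data_dir}/model.ckpt",
        "train_input_path": f"{data_dir}/train.record",
        "train_label_map_path": label_map,
        "test_input_path": f"{data_dir}/test.record",
        "test_label_map_path": label_map,
    }


def render_config(template_text, options):
    """Fill $-style placeholders; raises KeyError if one is missing."""
    return Template(template_text).substitute(options)


# Hypothetical template fragment, only for illustration.
EXAMPLE_TEMPLATE = 'fine_tune_checkpoint: "$checkpoint_file"\n'

if __name__ == "__main__":
    opts = build_options("mobilarTest")
    print(render_config(EXAMPLE_TEMPLATE, opts))
```

Using substitute rather than safe_substitute means a placeholder with no matching option fails loudly instead of ending up in pipeline.config unfilled.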

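2. Prepare the experiment data

You need the training data (with its label map) and a pre-trained model.

# Pre-trained model
model.ckpt

# Label map
object-detection.pbtxt

# Training data
train.record
test.record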
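Before uploading anything, it can help to confirm that all of these files are actually present. A small sketch, assuming they sit together in one local directory; the helper name is hypothetical. It also accepts prefixed names, since TensorFlow 1.x checkpoints are usually stored as several files such as model.ckpt.index and model.ckpt.meta:

```python
from pathlib import Path

# File names from the list above.
REQUIRED_FILES = [
    "model.ckpt",
    "object-detection.pbtxt",
    "train.record",
    "test.record",
]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the required names that have no match in data_dir.

    A name counts as present if a file with that exact name exists,
    or if at least one file named "<name>.<suffix>" exists.
    """
    root = Path(data_dir)
    missing = []
    for name in required:
        exact = (root / name).is_file()
        prefixed = any(root.glob(name + ".*"))
        if not (exact or prefixed):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "." is a placeholder; point this at the local data directory.
    print(missing_files("."))
```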

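Remote steps

1. Create a project

Log in to GCP, create a new project (for example, proj-tfobd-2020), and enable billing for it.

2. Enable the AI API

This project uses GCP's machine learning engine, so the relevant API has to be enabled first:

(1). Open the APIs & Services menu in GCP and click the "Enable APIs and Services" button.

(2). Under "machine learning" in the left sidebar, find AI Platform Training & Prediction API and make sure it is enabled.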

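3. Configure gcloud

The Google Cloud SDK is a set of tools for managing resources and applications hosted on Google Cloud Platform, including the gcloud, gsutil, and bq command-line tools. The gcloud tool is downloaded together with the Cloud SDK; for a comprehensive guide to the gcloud CLI, see the gcloud command-line tool overview.

For installing the Google Cloud SDK on Ubuntu, see: https://cloud.google.com/sdk/docs/quickstart-debian-ubuntu

4. Use gsutil

Once the training data files have been verified, copy them to the Cloud Storage bucket in preparation for training.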

gsutil -m cp train.record           gs://${YOUR_GCS_BUCKET}/data/
gsutil -m cp test.record            gs://${YOUR_GCS_BUCKET}/data/
gsutil -m cp object-detection.pbtxt gs://${YOUR_GCS_BUCKET}/data/
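The three uploads above can also be generated from a script, which is convenient when the bucket changes between experiments. A sketch that only builds and prints the commands; the helper name and the "my-bucket" value are placeholders, not part of the original setup:

```python
import shlex

# File names from the commands above.
DATA_FILES = ["train.record", "test.record", "object-detection.pbtxt"]


def gsutil_cp_commands(bucket, files=DATA_FILES, prefix="data"):
    """Build one `gsutil -m cp` argument list per file."""
    dest = f"gs://{bucket}/{prefix}/"
    return [["gsutil", "-m", "cp", name, dest] for name in files]


if __name__ == "__main__":
    # "my-bucket" is a placeholder bucket name.
    for cmd in gsutil_cp_commands("my-bucket"):
        # Printed rather than executed; to actually upload, pass cmd to
        # subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell quoting problems if a file name ever contains spaces.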

Model training

1. Upload and train

bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)

Once the TensorFlow source packages have been built, submit the job; gcloud uploads the packages and starts training.

 gcloud ai-platform jobs submit training ${job_id} \
 --job-dir=${gcp_training_dirPath} \
 --packages ${RESEARCH_DIR}/dist/object_detection-0.1.tar.gz,${RESEARCH_DIR}/slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
 --module-name object_detection.model_tpu_main \
 --runtime-version ${TF_VERSION} \
 --scale-tier BASIC_TPU \
 --region us-central1 \
 -- \
 --model_dir=${gcp_training_dirPath} \
 --tpu_zone us-central1 \
 --pipeline_config_path=${gcp_config_path}
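The ${job_id} passed to the command above has to be unique per submission. To my knowledge AI Platform expects job ids that start with a letter and contain only letters, digits, and underscores; treat that rule as an assumption and check the current documentation. A sketch that derives such an id from the experiment name and assembles the same argument list (both function names are hypothetical):

```python
import re
from datetime import datetime


def make_job_id(exp_name, now=None):
    """Build a job id such as mobilarTest_20200222_105150."""
    now = now or datetime.now()
    # Replace anything that is not a letter, digit, or underscore.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", exp_name)
    # Assumed rule: the id must start with a letter.
    if not safe or not safe[0].isalpha():
        safe = "job_" + safe
    return f"{safe}_{now:%Y%m%d_%H%M%S}"


def submit_args(job_id, job_dir, config_path, research_dir, tf_version):
    """Assemble the argument list for the submit command shown above."""
    packages = ",".join([
        f"{research_dir}/dist/object_detection-0.1.tar.gz",
        f"{research_dir}/slim/dist/slim-0.1.tar.gz",
        "/tmp/pycocotools/pycocotools-2.0.tar.gz",
    ])
    return [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_id,
        f"--job-dir={job_dir}",
        "--packages", packages,
        "--module-name", "object_detection.model_tpu_main",
        "--runtime-version", tf_version,
        "--scale-tier", "BASIC_TPU",
        "--region", "us-central1",
        "--",  # everything after this goes to the training module
        f"--model_dir={job_dir}",
        "--tpu_zone", "us-central1",
        f"--pipeline_config_path={config_path}",
    ]


if __name__ == "__main__":
    print(make_job_id("mobilarTest"))
```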

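2. Check job status and logs

gcloud ai-platform jobs describe ${job_id}

describe prints the job's current state and configuration. To follow the log output itself, use:

gcloud ai-platform jobs stream-logs ${job_id}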
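When polling a job from a script, the useful question is usually whether it has stopped. A small sketch, assuming the state names from the AI Platform job state enum (QUEUED, PREPARING, RUNNING, SUCCEEDED, FAILED, CANCELLING, CANCELLED); the helper names are hypothetical:

```python
# States after which a job no longer changes (assumed from the
# AI Platform job state enum).
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def describe_state_command(job_id):
    """Argument list that makes gcloud print only the job state."""
    return [
        "gcloud", "ai-platform", "jobs", "describe", job_id,
        "--format=value(state)",
    ]


def is_finished(state):
    """True if the state string printed by gcloud is a terminal state."""
    return state.strip().upper() in TERMINAL_STATES


if __name__ == "__main__":
    # "my_job" is a placeholder job id.
    print(" ".join(describe_state_command("my_job")))
    print(is_finished("RUNNING"))
```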

3. Cancel training

gcloud ai-platform jobs cancel ${job_id}

End.

posted @ 2018-04-24 16:02 郝壹贰叁
