July 13, 2018 —
Posted by Sara Robinson, Aakanksha Chowdhery, and Jonathan Huang
What if you could train and serve your object detection models even faster? We’ve heard your feedback, and today we’re excited to announce support for training an object detection model on Cloud TPUs, model quantization, and the addition of new models including RetinaNet and a MobileNet adaptation of RetinaNet. You can check out the…
gcloud config set project YOUR_PROJECT_NAME
Then we’ll create a Cloud Storage bucket with the following command. Note that Storage bucket names must be globally unique, so you may get an error if the first name you choose is taken.

gsutil mb gs://YOUR_UNIQUE_BUCKET_NAME
This may prompt you to first run gcloud auth login
, after which you will need to provide a verification code sent to your browser. Next, set your project ID and bucket name as environment variables so we can use them in later commands:

export PROJECT="YOUR_PROJECT_ID"
export YOUR_GCS_BUCKET="YOUR_UNIQUE_BUCKET_NAME"
Next, to give our Cloud TPU access to our project we need to add a TPU-specific service account. First, get the name of your service account with the following command:

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://ml.googleapis.com/v1/projects/${PROJECT}:getConfig
When this command completes, copy the value of tpuServiceAccount
(it will look something like your-service-account-12345@cloud-tpu.iam.gserviceaccount.com
) and then save it as an environment variable:

export TPU_ACCOUNT=your-service-account
Finally, grant the ml.serviceAgent
role to your TPU service account:

gcloud projects add-iam-policy-binding $PROJECT \
--member serviceAccount:$TPU_ACCOUNT --role roles/ml.serviceAgent
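If you'd rather script the service-account lookup above instead of using curl, the same getConfig call can be made from Python with the Google API client library. This is only a sketch: it assumes google-api-python-client is installed and that application-default credentials are configured.

from googleapiclient import discovery

# Look up the Cloud TPU service account for a project programmatically.
# Equivalent to the curl call above; assumes google-api-python-client is
# installed and application-default credentials are available.
project = 'YOUR_PROJECT_ID'
ml = discovery.build('ml', 'v1')
config = ml.projects().getConfig(name='projects/{}'.format(project)).execute()
print(config['tpuServiceAccount'])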
Next, install the TensorFlow Object Detection API on your local machine by following the installation instructions in the tensorflow/models repository. To test your installation, run:

python object_detection/builders/model_builder_test.py
If installation is successful, you should see the following output:

Ran 18 tests in 0.079s
OK
We’ve made the pet_faces_train.record and pet_faces_val.record files publicly accessible here. You can either use the public TFRecord files, or if you’d like to generate them yourself, follow the steps here. Download and extract them locally:

mkdir /tmp/pet_faces_tfrecord/
cd /tmp/pet_faces_tfrecord/
curl "http://download.tensorflow.org/models/object_detection/pet_faces_tfrecord.tar.gz" | tar xzf -
Note that these TFRecord files are sharded, so once you’ve extracted them you’ll have 10 pet_faces_train.record files and 10 pet_faces_val.record files. Copy them into your GCS bucket under a ./data subdirectory:

gsutil -m cp -r /tmp/pet_faces_tfrecord/pet_faces* gs://${YOUR_GCS_BUCKET}/data/
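If you’d like to sanity-check the TFRecords, the short Python sketch below counts the records in one shard and prints the feature keys of the first example. The exact shard filename is an assumption based on the standard sharded naming convention.

import tensorflow as tf

# Count records in one training shard and peek at the first example's feature keys.
# The shard suffix (-00000-of-00010) is assumed from the usual sharding convention.
path = '/tmp/pet_faces_tfrecord/pet_faces_train.record-00000-of-00010'
count = 0
first_example = None
for record in tf.python_io.tf_record_iterator(path):
    if first_example is None:
        first_example = tf.train.Example.FromString(record)
    count += 1
print('records in shard:', count)
print(sorted(first_example.features.feature.keys())[:10])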
With your TFRecord files in GCS, move back to the models/research
directory on your local machine. Next you’ll upload the pet_label_map.pbtxt file to your GCS bucket. This maps each of the 37 pet breeds we’ll be detecting to an integer, so that our model can understand them in a numerical format. From the models/research directory, run the following:

gsutil cp object_detection/data/pet_label_map.pbtxt gs://${YOUR_GCS_BUCKET}/data/pet_label_map.pbtxt
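To see what the label map contains, you can load it with the utilities that ship with the Object Detection API. A minimal sketch, run from the models/research directory:

from object_detection.utils import label_map_util

# Load the pets label map and build an id -> category dictionary.
label_map = label_map_util.load_labelmap('object_detection/data/pet_label_map.pbtxt')
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=37)
category_index = label_map_util.create_category_index(categories)
print(len(category_index))   # 37 breeds
print(category_index[1])     # e.g. {'id': 1, 'name': 'Abyssinian'}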
At this point you should have 21 files in the /data
subdirectory of your GCS bucket: the 20 sharded TFRecord files for training and testing, and the label map file. Next, download a COCO-pretrained SSD MobileNet checkpoint to use as the starting point for training, and copy it into the same /data subdirectory of your bucket:

cd /tmp
curl -O http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
tar xzf ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03.tar.gz
gsutil cp /tmp/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03/model.ckpt.* gs://${YOUR_GCS_BUCKET}/data/
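If you’re curious what’s inside the checkpoint you just downloaded, you can list its variables locally with the checkpoint reader. This optional sketch prints the variable count and a few names:

import tensorflow as tf

# Peek inside the COCO-pretrained SSD MobileNet checkpoint (optional check).
ckpt = '/tmp/ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03/model.ckpt'
reader = tf.train.NewCheckpointReader(ckpt)
shapes = reader.get_variable_to_shape_map()
print('variables:', len(shapes))
for name in sorted(shapes)[:5]:
    print(name, shapes[name])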
When we train our model, it’ll use these checkpoints as its starting point for training. Now you should have 24 files in your GCS bucket. We’re almost ready to run our training job, but we need a way to tell ML Engine where our data and model checkpoints are located. We’ll do this with a config file, which we’ll set up in the next step. Our config file provides hyperparameters for our model, the file paths for our training data, test data, and the initial model checkpoint.

One detail worth noting in this config is the loss function:

loss {
  classification_loss {
    weighted_sigmoid_focal {
      alpha: 0.75,
      gamma: 2.0
    }
  }
}
This loss function computes loss for every example in the dataset and then reweights them, assigning more relative weight to hard, misclassified examples. This logic is better suited to TPUs than the hard example mining operation used in other training jobs. You can read more about focal loss in Lin et al. (2017).

The config also points the training job at the pre-trained checkpoint we copied into our bucket, so training starts from those weights rather than from scratch:

fine_tune_checkpoint: "gs://your-bucket/data/model.ckpt"
fine_tune_checkpoint_type: "detection"
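To make the focal loss reweighting described above concrete, here is a minimal NumPy sketch of the formula from Lin et al. (2017). It is illustrative only, not the Object Detection API’s implementation; note how the loss on a confidently correct prediction is scaled down far more than the loss on a hard, misclassified one.

import numpy as np

def focal_loss(p, y, alpha=0.75, gamma=2.0):
    # Binary focal loss FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).
    # p: predicted probability of the positive class, y: label in {0, 1}.
    # Illustrative sketch only.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

print(focal_loss(0.95, 1))  # easy, well-classified example: tiny loss
print(focal_loss(0.10, 1))  # hard, misclassified example: much larger loss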
We also need to consider how our model will be used after it’s been trained. Let’s say our pet detector becomes a global hit, used by animal lovers and pet stores everywhere. We need a scalable way to handle these inference requests with low latency. The output of a machine learning model is a binary file containing the trained weights of our model. These files are often quite large, but since we’ll be serving this model directly on a mobile device we’ll need to make it as small as possible. This is where model quantization comes in. The following graph_rewriter block in the config enables quantized training:

graph_rewriter {
  quantization {
    delay: 1800
    activation_bits: 8
    weight_bits: 8
  }
}
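As a rough illustration of what 8-bit quantization does to a tensor, here is a NumPy sketch of the quantize/dequantize round trip used during quantization-aware training. This is not the graph rewriter’s actual implementation, just the basic idea:

import numpy as np

def fake_quantize(x, num_bits=8):
    # Map floats to num_bits integers and back, mimicking the quantize/
    # dequantize round trip of quantization-aware training (sketch only).
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(-x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale

weights = np.random.randn(3, 3).astype(np.float32)
print(np.abs(weights - fake_quantize(weights)).max())  # small quantization error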
Typically with quantization, a model will train with full precision for a certain number of steps before switching to quantized training. The delay number above tells ML Engine to begin quantizing our weights and activations after 1800 training steps.

We’ve provided a complete config file with these settings at object_detection/samples/configs/ssd_mobilenet_v1_0.75_depth_quantized_300x300_pets_sync.config. Update all the PATH_TO_BE_CONFIGURED strings with the full path of the data directory in your GCS bucket. For example, the train_input_reader section of the config would look like the following (make sure to replace YOUR_GCS_BUCKET with the name of your bucket):

train_input_reader: {
  tf_record_input_reader {
    input_path: "gs://YOUR_GCS_BUCKET/data/pet_faces_train*"
  }
  label_map_path: "gs://YOUR_GCS_BUCKET/data/pet_label_map.pbtxt"
}
Then copy this quantized config file into your GCS bucket:

gsutil cp object_detection/samples/configs/ssd_mobilenet_v1_0.75_depth_quantized_300x300_pets_sync.config gs://${YOUR_GCS_BUCKET}/data/pipeline.config
Before we kick off our training job on Cloud ML Engine, we need to package the Object Detection API, pycocotools, and TF Slim. We can do that with the following commands (run them from the research/ directory, and note that the parentheses in the last command are part of it):

bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
python setup.py sdist
(cd slim && python setup.py sdist)
We’re ready to train our model! To kick off training, run the following gcloud command:

gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
--job-dir=gs://${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_tpu_main \
--runtime-version 1.8 \
--scale-tier BASIC_TPU \
--region us-central1 \
-- \
--model_dir=gs://${YOUR_GCS_BUCKET}/train \
--tpu_zone us-central1 \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pipeline.config
Note that if you receive an error saying that no Cloud TPUs are available, we recommend simply trying again in a different zone (Cloud TPUs are currently available in us-central1-b, us-central1-c, europe-west4-a, and asia-east1-c).

Once your training job is running, kick off an evaluation job, which runs on a GPU alongside it:

gcloud ml-engine jobs submit training `whoami`_object_detection_eval_validation_`date +%s` \
--job-dir=gs://${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--runtime-version 1.8 \
--scale-tier BASIC_GPU \
--region us-central1 \
-- \
--model_dir=gs://${YOUR_GCS_BUCKET}/train \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/pipeline.config \
--checkpoint_dir=gs://${YOUR_GCS_BUCKET}/train
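While the jobs are running, one quick way to confirm that checkpoints are being written to your bucket is to ask TensorFlow for the latest one. A sketch, assuming your local TensorFlow can read gs:// paths and application-default credentials are set up:

import tensorflow as tf

# Print the newest checkpoint prefix in the GCS model_dir, e.g.
# gs://YOUR_GCS_BUCKET/train/model.ckpt-2000 once training finishes.
print(tf.train.latest_checkpoint('gs://YOUR_GCS_BUCKET/train'))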
Both training and evaluation should complete within about 30 minutes. While they are running, you can use TensorBoard to see the accuracy of your model. To start TensorBoard, run the following:

tensorboard --logdir=gs://${YOUR_GCS_BUCKET}/train
Note that you may need to first run gcloud auth application-default login. Then navigate to localhost:6006 to view your TensorBoard output. Here you’ll see some common ML metrics used to analyze the accuracy of your model. Note that these graphs only have 2 points plotted since the model trains quickly in very few steps (if you’ve used TensorBoard before, you may be used to seeing more of a curve here). The first point here is early in the training process and the last point shows metrics at the last step (step 2000).

Next, we’ll convert this trained model to a format we can run on a mobile device with TensorFlow Lite. Start by setting a few environment variables:

export CONFIG_FILE=gs://${YOUR_GCS_BUCKET}/data/pipeline.config
export CHECKPOINT_PATH=gs://${YOUR_GCS_BUCKET}/train/model.ckpt-2000
export OUTPUT_DIR=/tmp/tflite
We start by getting a TensorFlow frozen graph with compatible ops that we can use with TensorFlow Lite. First, you’ll need to install these python libraries. Then to get the frozen graph, run the export_tflite_ssd_graph.py
script from the models/research
directory with this command:

python object_detection/export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true
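The script writes a frozen graph into $OUTPUT_DIR, described below. If you’d like to confirm that the export worked and that the custom post-processing op was added, you can load the frozen graph and look at its nodes. A small sketch (the exact node name is an assumption; it should contain 'TFLite_Detection_PostProcess'):

import tensorflow as tf

# Load the exported frozen graph and look for the custom post-processing op.
graph_def = tf.GraphDef()
with tf.gfile.GFile('/tmp/tflite/tflite_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

print('nodes in graph:', len(graph_def.node))
print([n.name for n in graph_def.node if 'TFLite_Detection_PostProcess' in n.name])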
In the /tmp/tflite
directory, you should now see two files: tflite_graph.pb
and tflite_graph.pbtxt
(sample frozen graphs are here). Note that the add_postprocessing_op flag enables the model to take advantage of a custom optimized detection post-processing operation, which can be thought of as a replacement for tf.image.non_max_suppression. Make sure not to confuse export_tflite_ssd_graph with export_inference_graph in the same directory. Both scripts output frozen graphs, but export_tflite_ssd_graph outputs the frozen graph that we can feed to TensorFlow Lite directly, and it is the one we’ll be using.

Next, run TOCO (the TensorFlow Lite Optimizing Converter) to turn the frozen graph into a quantized TensorFlow Lite flatbuffer. Run this from the root of your TensorFlow source checkout:

bazel run -c opt tensorflow/contrib/lite/toco:toco -- \
--input_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops
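Once TOCO finishes, you can optionally sanity-check $OUTPUT_DIR/detect.tflite on your workstation with the TensorFlow Lite Python interpreter before moving to Android. This is a sketch and assumes your TensorFlow build exposes the interpreter (tf.contrib.lite.Interpreter in later 1.x releases, tf.lite.Interpreter in current ones); it feeds a dummy uint8 image and reads back the four detection arrays.

import numpy as np
import tensorflow as tf

# Run the quantized model once on a dummy 300x300 uint8 image (sketch only).
interpreter = tf.lite.Interpreter(model_path='/tmp/tflite/detect.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

dummy_image = np.zeros((1, 300, 300, 3), dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], dummy_image)
interpreter.invoke()

# Output order typically follows the output_arrays passed to TOCO:
boxes = interpreter.get_tensor(output_details[0]['index'])           # detection_boxes
classes = interpreter.get_tensor(output_details[1]['index'])         # detection_classes
scores = interpreter.get_tensor(output_details[2]['index'])          # detection_scores
num_detections = interpreter.get_tensor(output_details[3]['index'])  # num_detections
print(boxes.shape, scores.shape, num_detections)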
The TOCO command above takes the input tensor normalized_input_image_tensor after resizing each camera image frame to 300x300 pixels. The outputs of the quantized model are named 'TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1', 'TFLite_Detection_PostProcess:2', and 'TFLite_Detection_PostProcess:3' and represent four arrays: detection_boxes, detection_classes, detection_scores, and num_detections. The documentation for other flags used in this command is here.

If things ran successfully, you should now see a third file in the /tmp/tflite directory called detect.tflite (a sample tflite file is here). This file contains the graph and all model parameters, can be run via the TensorFlow Lite interpreter on an Android device, and should be less than 4 MB in size.

Next, we’ll run our pet detector on an Android device using the TensorFlow Lite demo app. Build the demo with Bazel:

bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11' \
//tensorflow/contrib/lite/examples/android:tflite_demo
The APK above will be built for 64-bit architecture; you may replace the flag with --config=android_arm for 32-bit support. Now install the demo on a debug-enabled Android phone via Android Debug Bridge (adb):

adb install bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
Try running this starter app (called TFLDetect), holding your camera up to people, furniture, cars, pets, and so on. You will see boxes around the detected objects with their labels; the working test app should look something like this. The starter app ships with a model trained on the COCO dataset. Next, let’s point the app at our pet detector. First, copy the TensorFlow Lite file into the app’s assets directory:

cp /tmp/tflite/detect.tflite \
tensorflow/contrib/lite/examples/android/app/src/main/assets
We will now edit the BUILD file to point to this new model. First, open the BUILD file tensorflow/contrib/lite/examples/android/BUILD. Then find the assets section and replace the line "@tflite_mobilenet_ssd_quant//:detect.tflite" (which by default points to a COCO pretrained model) with the path to your TFLite pets model, "//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite". Finally, change the last line of the assets section to use the new label map. Your final assets section should look like this:

assets = [
"//tensorflow/contrib/lite/examples/android/app/src/main/assets:labels_mobilenet_quant_v1_224.txt",
"@tflite_mobilenet//:mobilenet_quant_v1_224.tflite",
"@tflite_conv_actions_frozen//:conv_actions_frozen.tflite",
"//tensorflow/contrib/lite/examples/android/app/src/main/assets:conv_actions_labels.txt",
"@tflite_mobilenet_ssd//:mobilenet_ssd.tflite",
"//tensorflow/contrib/lite/examples/android/app/src/main/assets:detect.tflite",
"//tensorflow/contrib/lite/examples/android/app/src/main/assets:box_priors.txt",
"//tensorflow/contrib/lite/examples/android/app/src/main/assets:pets_labels_list.txt",
],
We will also need to tell our app to use the new label map. In order to do this, open up the tensorflow/contrib/lite/examples/android/app/src/main/java/org/tensorflow/demo/DetectorActivity.java file in a text editor and find the definition of TF_OD_API_LABELS_FILE
. Update this path to point to your pets label map file: "file:///android_asset/pets_labels_list.txt". Note that we have already made the pets_labels_list.txt file available for your convenience. This new section of DetectorActivity.java (around line 50) should now look as follows:

// Configuration values for the prepackaged SSD model.
private static final int TF_OD_API_INPUT_SIZE = 300;
private static final boolean TF_OD_API_IS_QUANTIZED = true;
private static final String TF_OD_API_MODEL_FILE = "detect.tflite";
private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/pets_labels_list.txt";
Once you’ve copied the TensorFlow Lite file and edited your BUILD and DetectorActivity.java files, rebuild and reinstall your app with the following commands:

bazel build -c opt --config=android_arm{,64} --cxxopt='--std=c++11' \
//tensorflow/contrib/lite/examples/android:tflite_demo
adb install -r bazel-bin/tensorflow/contrib/lite/examples/android/tflite_demo.apk
Now for the best part: find the nearest dog or cat and try detecting it. On a Pixel 2, we get greater than 15 frames per second.