# Using MobilenetSSD

**Tensorflow MobilenetSSD model**

Tensorflow Mobilenet SSD frozen graphs come in two flavors:
the standard frozen graph and a quantization-aware frozen
graph. The following example uses a quantization-aware frozen
graph to ensure accurate results on the Qualcomm® Neural Processing SDK runtimes.

**Prerequisites**

The quantization-aware model conversion process was tested
using Tensorflow v1.11; other versions may also work.
The CPU version of Tensorflow was used to avoid the
out-of-memory issues observed across various GPU cards during
conversion.
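
Before starting, it can help to confirm which Tensorflow release is installed. The following is a minimal sketch, not part of the SDK: the variable `TESTED_TF_SERIES` and both function names are illustrative, and it assumes Tensorflow is importable from `python3`.

```shell
#!/bin/bash
# Sketch: warn when the installed Tensorflow differs from the tested release.
# TESTED_TF_SERIES and both function names are illustrative, not SDK-provided.
TESTED_TF_SERIES="1.11"

# Succeeds when a version string such as "1.11.0" belongs to the tested series.
is_tested_tf_version() {
    case "$1" in
        "${TESTED_TF_SERIES}"|"${TESTED_TF_SERIES}".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the installed Tensorflow version, or nothing if it cannot be imported.
installed_tf_version() {
    python3 -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null
}

# Usage:
#   is_tested_tf_version "$(installed_tf_version)" \
#       || echo "Tensorflow version differs from the tested v1.11" >&2
```

A mismatch is only a warning sign, since other versions may also work.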

**Set up the Tensorflow Object Detection Framework**

The quantization-aware model is provided as a TFLite frozen
graph. However, Qualcomm® Neural Processing SDK requires a Tensorflow frozen graph (.pb).
To convert the quantized model, the object detection framework
is used to export a Tensorflow frozen graph. Follow these
steps to clone the object detection framework:

    mkdir ~/tfmodels
    cd ~/tfmodels
    git clone https://github.com/tensorflow/models.git

Check out a tested object detection framework commit (SHA)
from inside the cloned repository:

    cd ~/tfmodels/models
    git checkout ad386df597c069873ace235b931578671526ee00

Follow the third-party instructions to set up the Tensorflow
object detection framework.

**Download the quantization aware model**

A specific version of the Tensorflow MobilenetSSD model has
been tested:
**ssd\_mobilenet\_v2\_quantized\_300x300\_coco\_2019\_01\_03.tar.gz**

    wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz

After downloading the model, extract the contents to a
directory.

    tar xzvf ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03.tar.gz
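
The export step later reads `pipeline.config` and the `model.ckpt` checkpoint prefix from the extracted directory, so a quick check after extraction can catch a wrong path early. This is a minimal sketch; `check_model_dir` is an illustrative helper name, and it assumes the checkpoint is stored as files named `model.ckpt.*`.

```shell
#!/bin/bash
# Sketch: confirm the extracted archive holds what the export step reads.
# check_model_dir is an illustrative helper name, not part of any toolkit.
check_model_dir() {
    local dir="$1"
    local f

    # pipeline.config is handed to the exporter as the pipeline configuration.
    if [ ! -f "${dir}/pipeline.config" ]; then
        echo "missing ${dir}/pipeline.config" >&2
        return 1
    fi

    # model.ckpt is a prefix; assume the checkpoint is stored as model.ckpt.* files.
    for f in "${dir}"/model.ckpt.*; do
        [ -e "$f" ] && return 0
    done
    echo "no ${dir}/model.ckpt.* files found" >&2
    return 1
}

# Usage:
#   check_model_dir ./ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03
```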

**Export a trained graph from the object detection framework**

Follow these instructions to export the Tensorflow graph:

- [https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md)

Alternatively, modify and execute the following sample script.

Create a file named export\_train.sh using your favorite editor.
Modify the paths to point to the directory containing the
downloaded quantization-aware model files.

    #!/bin/bash
    # Export the quantization-aware checkpoint as a Tensorflow frozen graph.
    INPUT_TYPE=image_tensor
    PIPELINE_CONFIG_PATH=<path_to>/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/pipeline.config
    TRAINED_CKPT_PREFIX=<path_to>/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/model.ckpt
    EXPORT_DIR=<path_to>/exported
    # Run from the research directory of the cloned object detection framework.
    pushd ~/tfmodels/models/research
    python3 object_detection/export_inference_graph.py \
        --input_type="${INPUT_TYPE}" \
        --pipeline_config_path="${PIPELINE_CONFIG_PATH}" \
        --trained_checkpoint_prefix="${TRAINED_CKPT_PREFIX}" \
        --output_directory="${EXPORT_DIR}"
    popd

Make the script executable:

    chmod u+x export_train.sh

Run the script:

    ./export_train.sh

This should generate a frozen graph at
`<path_to>/exported/frozen_inference_graph.pb`.

Convert the frozen graph using the
[snpe-tensorflow-to-dlc](https://docs.qualcomm.com/doc/80-63442-2/topic/tools.html#snpe-tensorflow-to-dlc)
converter.

    snpe-tensorflow-to-dlc --input_network <path_to>/exported/frozen_inference_graph.pb \
        --input_dim Preprocessor/sub 1,300,300,3 \
        --out_node detection_classes \
        --out_node detection_boxes \
        --out_node detection_scores \
        --output_path mobilenet_ssd.dlc \
        --allow_unconsumed_nodes

After conversion, you should have a mobilenet\_ssd.dlc file that
can be loaded and run in the Qualcomm® Neural Processing SDK runtimes.

The output layers for the model are:

- Postprocessor/BatchMultiClassNonMaxSuppression
- add

The output buffer names are:

- (classes) detection\_classes:0 (+1 index offset)
- (classes) Postprocessor/BatchMultiClassNonMaxSuppression\_classes (0 index offset)
- (boxes) Postprocessor/BatchMultiClassNonMaxSuppression\_boxes
- (scores) Postprocessor/BatchMultiClassNonMaxSuppression\_scores
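
The two class buffers differ only by the index offset noted above, so values read from `detection_classes:0` can be mapped onto the 0-based indices by subtracting 1. The sketch below illustrates that arithmetic only. It assumes the class values have already been dumped as whitespace-separated text, and `to_zero_based_classes` is an illustrative name.

```shell
#!/bin/bash
# Sketch: convert detection_classes:0 values (+1 index offset) to the 0-based
# indices used by Postprocessor/BatchMultiClassNonMaxSuppression_classes.
# Assumes class values arrive on stdin as whitespace-separated text,
# for example "18" or "18.0"; to_zero_based_classes is an illustrative name.
to_zero_based_classes() {
    awk '{ for (i = 1; i <= NF; i++) printf "%d\n", $i - 1 }'
}

# Usage:
#   printf '1 18.0\n90\n' | to_zero_based_classes
```

If the buffer was saved as raw 32-bit floats (an assumption about how the output was stored), `od -A n -t f4 <file>` is one way to produce such text.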

**Running the model in Qualcomm® Neural Processing SDK**

The following are limitations and suggestions for running the DLC
model in Qualcomm® Neural Processing SDK:

- Batch dimension > 1 is not supported.
- The DetectionOutput layer is supported on the CPU runtime
  processor only.
  To run the model on a different runtime processor, such as
  GPU or DSP, CPU fallback mode must be enabled in the runtime
  list (see the
  Snpe\_SNPEBuilder\_SetRuntimeProcessorOrder()
  description in the Qualcomm® Neural Processing SDK API).
  If using the [snpe-net-run](https://docs.qualcomm.com/doc/80-63442-2/topic/tools.html#snpe-net-run)
  tool, use the `--runtime_order` option.
- Configure the DetectionOutput layer reasonably.
  The performance of the DetectionOutput layer (i.e. its
  processing time) is a function of the layer parameters
  `top_k`, `keep_top_k` and `confidence_threshold`.
  For example, `top_k` has a practically exponential impact
  on processing time: top\_k=100 results in a much shorter
  processing time than top\_k=1000.
  A smaller `confidence_threshold` results in a larger
  number of output boxes, and vice versa.
- Resizing the input dimensions at SNPE object creation/build
  time is not allowed.
  Input dimensions are embedded into the DLC model during
  conversion and in some cases can be overridden via
  Snpe\_SNPEBuilder\_SetInputDimensions()
  (see the description in the Qualcomm® Neural Processing SDK API) at SNPE object
  creation/build time. For this model, however, the model
  converter folds the PriorBox layer, so input/network
  resizing is not possible.
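
Because the DetectionOutput layer runs on the CPU only, any runtime order passed to snpe-net-run needs CPU as a fallback. The following is a minimal sketch under stated assumptions: `with_cpu_fallback` is an illustrative helper, `input_list.txt` is a hypothetical input list file, and the `--container` and `--input_list` arguments and the comma-separated value syntax for `--runtime_order` should be confirmed with `snpe-net-run --help` for your SDK version.

```shell
#!/bin/bash
# Sketch: build a runtime order that always ends with CPU, so layers that only
# run on the CPU (such as DetectionOutput) have a runtime to fall back to.
# with_cpu_fallback is an illustrative helper, not part of the SDK.
with_cpu_fallback() {
    case ",$1," in
        *,cpu,*) echo "$1" ;;        # CPU already listed; keep the order as given
        *)       echo "$1,cpu" ;;    # otherwise append CPU as the last choice
    esac
}

# Usage (argument names and value syntax are assumptions; see the lead-in):
#   snpe-net-run --container mobilenet_ssd.dlc \
#                --input_list input_list.txt \
#                --runtime_order "$(with_cpu_fallback dsp)"
```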

Last Published: Oct 02, 2025
