# Image classification and display with Neural Processing SDK 

Source: [https://docs.qualcomm.com/doc/80-70022-50/topic/single-camera-stream-with-image-classification-and-display-with-mobilenet-v1.html](https://docs.qualcomm.com/doc/80-70022-50/topic/single-camera-stream-with-image-classification-and-display-with-mobilenet-v1.html)

These use cases run an Inception v3 model with the Qualcomm Neural Processing SDK to classify scenes, either overlay or compose the classification labels on the video, and then display the results.

You can use any publicly available TensorFlow classification model and convert it to the `.dlc` format as described in [TensorFlow Model Conversion](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/model_conv_tensorflow.html).
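As a rough sketch of that conversion step, the following script assembles a `snpe-tensorflow-to-dlc` invocation for a frozen Inception v3 graph. The file names, the input tensor name `input`, the `1,299,299,3` input shape, and the output node `InceptionV3/Predictions/Reshape_1` are assumptions based on the commonly distributed Inception v3 checkpoint, and the helper function names are hypothetical. Verify all of them against your own model and the linked conversion guide. By default, the script only prints the command.

```shell
#!/bin/sh
# Sketch only: all paths and tensor names below are assumptions; adjust them for your model.
PB_FILE="${PB_FILE:-inception_v3_2016_08_28_frozen.pb}"   # assumed frozen TensorFlow graph
DLC_FILE="${DLC_FILE:-inceptionv3.dlc}"                   # output name matching the pipelines in this topic
INPUT_NAME="${INPUT_NAME:-input}"                         # assumed input tensor name
INPUT_DIMS="${INPUT_DIMS:-1,299,299,3}"                   # assumed NHWC input shape
OUT_NODE="${OUT_NODE:-InceptionV3/Predictions/Reshape_1}" # assumed output node

# Print the conversion command without executing it.
dlc_convert_cmd() {
    printf 'snpe-tensorflow-to-dlc --input_network %s --input_dim %s "%s" --out_node %s --output_path %s\n' \
        "$PB_FILE" "$INPUT_NAME" "$INPUT_DIMS" "$OUT_NODE" "$DLC_FILE"
}

# Run the conversion (requires the SDK tools on PATH).
dlc_convert_run() {
    snpe-tensorflow-to-dlc \
        --input_network "$PB_FILE" \
        --input_dim "$INPUT_NAME" "$INPUT_DIMS" \
        --out_node "$OUT_NODE" \
        --output_path "$DLC_FILE"
}

dlc_convert_cmd
```

Call `dlc_convert_run` instead of `dlc_convert_cmd` on a host where the SDK is set up, then copy the resulting `.dlc` file to the location that the pipeline's `model` property points to.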

## Use qtivoverlay plugin to apply classification overlay

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 \
    qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1280,height=720,framerate=30/1 ! queue ! tee name=split \
    split. ! queue ! qtimetamux name=metamux ! queue ! qtivoverlay ! queue ! waylandsink fullscreen=true sync=false \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/inceptionv3.dlc ! queue ! \
    qtimlpostprocess settings="{\"confidence\": 40.0}" results=2 module=mobilenet-softmax labels=/etc/labels/classification.json ! text/x-raw ! queue ! metamux.

To stop the use case, use CTRL + C.
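Before launching the pipeline above, it can help to confirm that the model and label files are present on the device and to build the `settings` value without hand-escaped quotes. The following is a minimal sketch. The helper names `postprocess_settings` and `check_inputs` are hypothetical, and the paths are the ones used in the command above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; paths are the ones used in the command above.
MODEL="${MODEL:-/etc/models/inceptionv3.dlc}"
LABELS="${LABELS:-/etc/labels/classification.json}"

# Print the JSON value for the qtimlpostprocess "settings" property.
postprocess_settings() {
    printf '{"confidence": %s}' "$1"
}

# Report every missing input file; return non-zero if any is missing.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_inputs "$MODEL" "$LABELS"; then
    echo "inputs found; settings=$(postprocess_settings 40.0)"
else
    echo "copy the model and labels to the target device before launching" >&2
fi
```

With these helpers, the property in the command above could be written as `settings="$(postprocess_settings 40.0)"`, which passes the same value to `qtimlpostprocess` as the escaped form.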

The following figure shows the flow of the use case execution:

- Classify scenes from a video stream coming through a camera source.
- Overlay classification labels using overlaylib.
- Display the results on a local display.

Figure: Pipeline for classification overlay

<svg xmlns="http://www.w3.org/2000/svg" width="940" height="349.974597930908203" viewBox="0 0 940 349.974597930908203">
  <g id="Layer_1" data-name="Layer 1">
    <g>
      <rect x=".5" y=".499774932861328" width="939" height="348.974609375" rx="7.5" ry="7.5" style="fill: #fafafa;"></rect>
      <path d="M932,1c3.859741210939319,0,7,3.140233993530273,7,7v333.974597930908203c0,3.8597412109375-3.140258789060681,7-7,7H8c-3.859771728515625,0-7-3.1402587890625-7-7V8c0-3.859766006469727,3.140228271484375-7,7-7h924M932,0H8C3.581771850585938,0,0,3.581764221191406,0,8v333.974597930908203c0,4.418212890625,3.581771850585938,8,8,8h924c4.418334960939319,0,8-3.581787109375,8-8V8c0-4.418235778808594-3.581665039060681-8-8-8h0Z" style="fill: #d2d7e1;"></path>
    </g>
    <g>
      <g>
        <text transform="translate(744.4927978515625 326.066074371337891)" style="font-family: Roboto-Regular, Roboto; font-size: 14px;"><tspan x="0" y="0">Qualcomm </tspan></text>
        <rect x="724.241789584668368" y="313.974597930908203" width="16" height="16" rx="2" ry="2" style="fill: #2a2aea;"></rect>
      </g>
      <g>
        <text transform="translate(843.0745849609375 326.066074371337891)" style="font-family: Roboto-Regular, Roboto; font-size: 14px;"><tspan x="0" y="0">Open source</tspan></text>
        <rect x="822.823570997521529" y="313.974597930908203" width="16" height="16" rx="2" ry="2" style="fill: #007884;"></rect>
      </g>
    </g>
  </g>
  <g id="Layer_2" data-name="Layer 2">
    <g>
      <g>
        <g>
          <rect x="20" y="20.000005443602277" width="160" height="50" rx="4" ry="4" style="fill: #007884;"></rect>
          <text transform="translate(73.429725646972656 48.506705760955811)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">camsrc</tspan></text>
        </g>
        <g>
          <line x1="180" y1="45.000003814697266" x2="199.976654052734375" y2="45.000003814697266" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="198.955352783203125 48.490436553955078 205 45.000003814697266 198.955352783203125 41.509574890136719 198.955352783203125 48.490436553955078"></polygon>
        </g>
        <g>
          <line x1="285" y1="68.974582672119141" x2="285" y2="88.951221466064453" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="281.50958251953125 87.929927825927734 285 93.974582672119141 288.49041748046875 87.929927825927734 281.50958251953125 87.929927825927734"></polygon>
        </g>
        <g>
          <rect x="205" y="20.000005443602277" width="160" height="50" rx="4" ry="4" style="fill: #007884;"></rect>
          <text transform="translate(273.910198211669922 48.506705760955811)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">tee</tspan></text>
        </g>
        <g>
          <rect x="205" y="93.974581043215949" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
          <text transform="translate(229.527484893798828 123.650539398193359)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlvconverter</tspan></text>
        </g>
        <g>
          <line x1="285" y1="143.974582672119141" x2="285" y2="163.951221466064453" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="281.50958251953125 162.929920196533203 285 168.974582672119141 288.49041748046875 162.929920196533203 281.50958251953125 162.929920196533203"></polygon>
        </g>
        <g>
          <rect x="205" y="168.974581043215949" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
          <text transform="translate(249.679828643798828 198.650547027587891)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlsnpe</tspan></text>
        </g>
        <g>
          <line x1="285" y1="218.974582672119141" x2="285" y2="238.951221466064453" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="281.50958251953125 237.929920196533203 285 243.974567413330078 288.49041748046875 237.929920196533203 281.50958251953125 237.929920196533203"></polygon>
        </g>
        <g>
          <rect x="205" y="243.974581043215949" width="160" height="49.999999999998181" rx="4" ry="4" style="fill: #2a2aea;"></rect>
          <text transform="translate(222.722797393798828 273.650547027587891)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlpostprocess</tspan></text>
        </g>
        <g>
          <line x1="365" y1="45.000003814697266" x2="384.97662353515625" y2="45.000003814697266" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="383.955322265625 48.490436553955078 390 45.000003814697266 383.955322265625 41.509574890136719 383.955322265625 48.490436553955078"></polygon>
        </g>
        <g>
          <rect x="390" y="20.000005443602277" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
          <text transform="translate(427.296905517578125 48.506705760955811)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimetamux</tspan></text>
        </g>
        <g>
          <line x1="550" y1="45.000003814697266" x2="569.97662353515625" y2="45.000003814697266" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="568.955322265625 48.490436553955078 575 45.000003814697266 568.955322265625 41.509574890136719 568.955322265625 48.490436553955078"></polygon>
        </g>
        <g>
          <rect x="575" y="20.000005443602277" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
          <text transform="translate(620.437530517578125 48.506705760955811)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtivoverlay</tspan></text>
        </g>
        <g>
          <line x1="735" y1="45.000003814697266" x2="754.97662353515625" y2="45.000003814697266" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
          <polygon points="753.955322265625 48.490436553955078 760 45.000003814697266 753.955322265625 41.509574890136719 753.955322265625 48.490436553955078"></polygon>
        </g>
        <g>
          <rect x="760" y="20.000005443602277" width="160" height="50" rx="4" ry="4" style="fill: #007884;"></rect>
          <text transform="translate(796.09771728515625 48.506705760955811)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">waylandsink</tspan></text>
        </g>
      </g>
      <g>
        <polyline points="365 268.974567413330078 470 268.974597930908203 470 74.922946929931641" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></polyline>
        <polygon points="473.49041748046875 75.944240570068359 470 69.899585723876953 466.50958251953125 75.944240570068359 473.49041748046875 75.944240570068359"></polygon>
      </g>
    </g>
  </g>
</svg>

The following table lists the sequential processing stages of the pipeline execution:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70022-50/topic/qtiqmmfsrc.html) | <ol><li>Captures the video stream from the camera (source). The tee element then creates two copies of the stream:<ul><li>One copy is sent to the qtimetamux plugin to retain the video stream.</li><li>The other copy is sent to the ML inferencing branch of the pipeline.</li></ul></li></ol> |
| **Preprocessing** | |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimlvconverter.html) | <ol><li>Receives the video stream on its sink pad.</li><li>Performs preprocessing:<ul><li>Color conversion</li><li>Scaling up or down</li><li>Normalization of the stream data when the model expects floating-point input values</li></ul></li><li>Converts the video stream to a tensor stream on its source pad.<p>The classification model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimlsnpe.html) | <ol><li>Loads the model.</li><li>Modifies the graph for the chosen delegate.</li><li>Receives the tensor stream on its sink pad.</li><li>Runs the inference and produces a tensor stream with the inference results on its source pad.</li></ol> |
| **Postprocessing** | |
| qtimlpostprocess | <ol><li>Receives the inference tensors from the model on its sink pad.</li><li>Converts the tensors into formats, such as video or text, that the multimedia plugins can process later.</li><li>Applies the confidence threshold to the chosen number of results.</li><li>Loads the module that corresponds to the classification model.<p>In this use case, qtimlpostprocess does the following:</p><ol type="a"><li>Loads the submodule of the model.</li><li>Produces results as structures of text.</li><li>Sends them to the sink pad of qtimetamux.</li></ol></li></ol> |
| [qtimetamux](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimetamux.html) | <ol><li>Receives, on its sink pads, the video stream and the text stream that carries the classification results for that video stream.</li><li>Produces GST buffers with the contents of the video stream received on its sink pad.</li><li>Adds the classification results from the data sink pad to the GST buffer metadata (meta muxing) and outputs the buffers on its source pad.</li></ol> |
| [qtivoverlay](https://docs.qualcomm.com/doc/80-70022-50/topic/qtioverlay.html) | <ol><li>Receives the multiplexed stream.</li><li>Overlays the classification labels on the video frame using CL.</li><li>Produces GST buffers with the overlays on its source pad.</li></ol> |
| **Output** | |
| [waylandsink](https://docs.qualcomm.com/doc/80-70022-50/topic/waylandsink.html) | <ol><li>Receives the video stream on its sink pad.</li><li>Submits the video stream to Weston.</li><li>Weston renders the video stream, together with any classifications generated for the scene, on a local display device.</li></ol> |
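If the pipeline fails to start, one quick check is whether every element named in the table is registered with GStreamer on the device. The sketch below uses the `--exists` option of `gst-inspect-1.0`. The function name `missing_elements` and the `GST_INSPECT` override variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: GST_INSPECT can be overridden; the element list mirrors the table above.
GST_INSPECT="${GST_INSPECT:-gst-inspect-1.0}"
ELEMENTS="qtiqmmfsrc tee queue qtimlvconverter qtimlsnpe qtimlpostprocess qtimetamux qtivoverlay waylandsink"

# Print the names of elements that the GStreamer registry does not provide.
missing_elements() {
    for element in "$@"; do
        # --exists returns a non-zero status when the element is not available
        "$GST_INSPECT" --exists "$element" >/dev/null 2>&1 || echo "$element"
    done
}

if command -v "$GST_INSPECT" >/dev/null 2>&1; then
    # Word splitting of the element list is intended here.
    missing=$(missing_elements $ELEMENTS)
    if [ -z "$missing" ]; then
        echo "all elements available"
    else
        echo "missing elements:"
        echo "$missing"
    fi
else
    echo "$GST_INSPECT not found on PATH" >&2
fi
```

For the qtivcomposer variant of the pipeline, replace `qtimetamux qtivoverlay` in the element list with `qtivcomposer`.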

## Use qtivcomposer to mix original frame with classification mask

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1280,height=720,framerate=30/1 ! queue ! \
    tee name=split split. ! queue ! qtivcomposer name=mixer sink_1::position="<30, 30>" sink_1::dimensions="<320, 320>" ! queue ! waylandsink fullscreen=true \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/inceptionv3.dlc ! queue ! \
    qtimlpostprocess settings="{\"confidence\": 40.0}" results=2 module=mobilenet-softmax labels=/etc/labels/classification.json ! \
    video/x-raw,format=BGRA,width=640,height=360 ! queue ! mixer.

To stop the use case, use CTRL + C.
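In the command above, `sink_1::position` and `sink_1::dimensions` control where the classification mask appears in the composed frame and how large it is. The following sketch computes those two values for a chosen corner. It assumes that the composed output has the same 1280x720 size as the camera stream, and the function name `composer_window` and its corner and margin convention are hypothetical.

```shell
#!/bin/sh
# Sketch only: function name and the corner/margin convention are hypothetical.
# Print qtivcomposer sink_1 pad properties that place a W x H window in a corner
# of a FRAME_W x FRAME_H output, keeping MARGIN pixels from the edges.
composer_window() {
    corner=$1 frame_w=$2 frame_h=$3 w=$4 h=$5 margin=$6
    case "$corner" in
        top-left)     x=$margin;                   y=$margin ;;
        top-right)    x=$((frame_w - w - margin)); y=$margin ;;
        bottom-left)  x=$margin;                   y=$((frame_h - h - margin)) ;;
        bottom-right) x=$((frame_w - w - margin)); y=$((frame_h - h - margin)) ;;
        *) echo "unknown corner: $corner" >&2; return 1 ;;
    esac
    printf 'sink_1::position="<%d, %d>" sink_1::dimensions="<%d, %d>"\n' "$x" "$y" "$w" "$h"
}

# Values from the command above: 1280x720 frame, 320x320 window, 30-pixel margin.
composer_window top-left 1280 720 320 320 30
composer_window bottom-right 1280 720 320 320 30
```

The first call reproduces the properties used in the command above. The second prints the values that would move the same window to the opposite corner, ready to paste into the `qtivcomposer` part of the command.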

The following figure shows the flow of the use case execution:

- Classify scenes from a video stream coming through a camera source.
- Compose the classification labels and the video stream together using qtivcomposer.
- Display the results on a local display.

Figure: Pipeline for classification using qtivcomposer

<svg xmlns="http://www.w3.org/2000/svg" width="755" height="349.974590301513672" viewBox="0 0 755 349.974590301513672">
  <g id="Layer_1" data-name="Layer 1">
    <g>
      <rect x=".5" y=".499797821044922" width="754" height="348.974609375" rx="7.499999999999944" ry="7.499999999999944" style="fill: #fafafa;"></rect>
      <path d="M747,1c3.85980224609375,0,7,3.140201568603516,7,7v333.974590301513672c0,3.85980224609375-3.14019775390625,7-7,7H8c-3.85980224609375,0-7-3.14019775390625-7-7V8c0-3.859798431396484,3.14019775390625-7,7-7h739M747,0H8C3.581695556640625,0,0,3.581701278686523,0,8v333.974590301513672c0,4.418304443359375,3.581695556640625,8,8,8h739c4.41827392578125,0,8-3.581695556640625,8-8V8c0-4.418298721313477-3.58172607421875-8-8-8h0Z" style="fill: #d2d7e1;"></path>
    </g>
    <g>
      <g>
        <text transform="translate(565.4931640625 326.066082000732422)" style="font-family: Roboto-Regular, Roboto; font-size: 14px;"><tspan x="0" y="0">Qualcomm </tspan></text>
        <rect x="545.242149370851621" y="313.974590301513672" width="16" height="16" rx="2" ry="2" style="fill: #2a2aea;"></rect>
      </g>
      <g>
        <text transform="translate(664.074951171875 326.066082000732422)" style="font-family: Roboto-Regular, Roboto; font-size: 14px;"><tspan x="0" y="0">Open source</tspan></text>
        <rect x="643.823930783704782" y="313.974590301513672" width="16" height="16" rx="2" ry="2" style="fill: #007884;"></rect>
      </g>
    </g>
  </g>
  <g id="Layer_2" data-name="Layer 2">
    <g>
      <g>
        <rect x="20" y="20.000014873722648" width="160" height="50" rx="4" ry="4" style="fill: #007884;"></rect>
        <text transform="translate(73.429729461669922 48.506718635559082)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">camsrc</tspan></text>
      </g>
      <g>
        <line x1="180" y1="45.000011444091797" x2="199.976654052734375" y2="45.000011444091797" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="198.955352783203125 48.490444183349609 205 45.000011444091797 198.955352783203125 41.50958251953125 198.955352783203125 48.490444183349609"></polygon>
      </g>
      <g>
        <line x1="285" y1="68.974590301513672" x2="285" y2="88.951229095458984" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="281.50958251953125 87.929935455322266 285 93.974590301513672 288.49041748046875 87.929935455322266 281.50958251953125 87.929935455322266"></polygon>
      </g>
      <g>
        <rect x="205" y="20.000014873722648" width="160" height="50" rx="4" ry="4" style="fill: #007884;"></rect>
        <text transform="translate(273.910202026367188 48.506718635559082)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">tee</tspan></text>
      </g>
      <g>
        <rect x="205" y="93.97459047333632" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
        <text transform="translate(229.527496337890625 123.650547027587891)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlvconverter</tspan></text>
      </g>
      <g>
        <line x1="285" y1="143.974590301513672" x2="285" y2="163.951229095458984" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="281.50958251953125 162.929943084716797 285 168.974590301513672 288.49041748046875 162.929943084716797 281.50958251953125 162.929943084716797"></polygon>
      </g>
      <g>
        <rect x="205" y="168.97459047333632" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
        <text transform="translate(249.679840087890625 198.650554656982422)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlsnpe</tspan></text>
      </g>
      <g>
        <line x1="285" y1="218.974590301513672" x2="285" y2="238.951244354248047" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="281.50958251953125 237.929943084716797 285 243.974590301513672 288.49041748046875 237.929943084716797 281.50958251953125 237.929943084716797"></polygon>
      </g>
      <g>
        <rect x="205" y="243.97459047333632" width="160" height="49.999999999999091" rx="4" ry="4" style="fill: #2a2aea;"></rect>
        <text transform="translate(222.722808837890625 273.650554656982422)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtimlpostprocess</tspan></text>
      </g>
      <g>
        <line x1="365" y1="45.000011444091797" x2="384.97662353515625" y2="45.000011444091797" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="383.955322265625 48.490444183349609 390 45.000011444091797 383.955322265625 41.50958251953125 383.955322265625 48.490444183349609"></polygon>
      </g>
      <g>
        <rect x="390" y="20.000014873722648" width="160" height="50" rx="4" ry="4" style="fill: #2a2aea;"></rect>
        <text transform="translate(421.140655517578125 48.506718635559082)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">qtivcomposer</tspan></text>
      </g>
      <g>
        <line x1="550" y1="45.000011444091797" x2="569.97662353515625" y2="45.000011444091797" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></line>
        <polygon points="568.955322265625 48.490444183349609 575 45.000011444091797 568.955322265625 41.50958251953125 568.955322265625 48.490444183349609"></polygon>
      </g>
      <g>
        <rect x="575" y="20.000014873722648" width="160" height="50" rx="4.000000000000019" ry="4.000000000000019" style="fill: #007884;"></rect>
        <text transform="translate(611.09771728515625 48.506718635559082)" style="fill: #fff; font-family: Roboto-Regular, Roboto; font-size: 16px;"><tspan x="0" y="0">waylandsink</tspan></text>
      </g>
    </g>
    <g>
      <polyline points="365 268.974590301513672 470 268.974590301513672 470 75.847972869873047" style="fill: none; stroke: #000; stroke-miterlimit: 10;"></polyline>
      <polygon points="473.49041748046875 76.869258880615234 470 70.824611663818359 466.50958251953125 76.869258880615234 473.49041748046875 76.869258880615234"></polygon>
    </g>
  </g>
</svg>

The following table lists the sequential processing stages of the pipeline execution:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70022-50/topic/qtiqmmfsrc.html) | <ol><li>Captures the video stream from the camera (source). The tee element then creates two copies of the stream:<ul><li>One copy is sent to the qtivcomposer plugin to retain the video stream.</li><li>The other copy is sent to the ML inferencing branch of the pipeline.</li></ul></li></ol> |
| **Preprocessing** | |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimlvconverter.html) | <ol><li>Receives the video stream on its sink pad.</li><li>Performs preprocessing:<ul><li>Color conversion</li><li>Scaling up or down</li><li>Normalization of the stream data when the model expects floating-point input values</li></ul></li><li>Converts the video stream to a tensor stream on its source pad.<p>The classification model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimlsnpe.html) | <ol><li>Loads the model.</li><li>Modifies the graph for the chosen delegate.</li><li>Receives the tensor stream on its sink pad.</li><li>Runs the inference and produces a tensor stream with the inference results on its source pad.</li></ol> |
| **Postprocessing** | |
| qtimlpostprocess | <ol><li>Receives the inference tensors from the model on its sink pad.</li><li>Converts the tensors into formats, such as video or text, that the multimedia plugins can process later.</li><li>Applies the confidence threshold to the chosen number of results.</li><li>Loads the module that corresponds to the classification model.<p>In this use case, qtimlpostprocess does the following:</p><ol type="a"><li>Loads the submodule of the model.</li><li>Produces results as video frames with classification labels.</li><li>Sends them to the sink pad of qtivcomposer.</li></ol></li></ol> |
| [qtivcomposer](https://docs.qualcomm.com/doc/80-70022-50/topic/qtivcomposer.html) | <ol><li>Receives the original video stream and the classification results on its sink pads.</li><li>On its source pad, produces GST buffers composed from the video streams on its sink pads.</li></ol> |
| **Output** | |
| [waylandsink](https://docs.qualcomm.com/doc/80-70022-50/topic/waylandsink.html) | <ol><li>Receives the video stream on its sink pad.</li><li>Submits the video stream to Weston.</li><li>Weston renders the video stream, together with any classifications generated for the scene, on a local display device.</li></ol> |

**Parent Topic:** [Qualcomm Neural Processing SDK use cases](https://docs.qualcomm.com/doc/80-70022-50/topic/qualcomm-neural-processing-sdk-use-cases.html)

Last Published: Feb 20, 2026
