# Image classification and display with Neural Processing SDK 

Source: [https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-classification-and-display-with-mobilenet-v1.html](https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-classification-and-display-with-mobilenet-v1.html)

These use cases run an Inceptionv3 model with the Qualcomm Neural Processing SDK to classify scenes, either overlay or compose the classification labels onto the video stream, and then display the results.

You can use any publicly available TensorFlow classification model and convert it to the `.dlc` format as described in [TensorFlow Model Conversion](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/model_conv_tensorflow.html).

## Use qtivoverlay plugin to apply classification overlay

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 \
    qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1280,height=720,framerate=30/1 ! queue ! tee name=split \
    split. ! queue ! qtimetamux name=metamux ! queue ! qtivoverlay ! queue ! waylandsink fullscreen=true sync=false \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/inceptionv3.dlc ! queue ! \
    qtimlpostprocess settings="{\"confidence\": 40.0}" results=2 module=mobilenet-softmax labels=/etc/labels/classification.json ! text/x-raw ! queue ! metamux.

To stop the use case, use CTRL + C.
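
The command above hard-codes the model and label paths, so a missing file only shows up as a pipeline error at startup. A minimal pre-flight sketch, assuming a POSIX shell on the target; `check_readable` is a hypothetical helper, not part of the SDK:

```shell
#!/bin/sh
# Hypothetical helper (not part of the SDK): report every argument that is
# not a readable regular file, and return non-zero if any check fails.
check_readable() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ] || [ ! -r "$path" ]; then
            echo "missing or unreadable: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

One way to use it is to gate the launch on the check, for example `check_readable /etc/models/inceptionv3.dlc /etc/labels/classification.json && gst-launch-1.0 ...`, with the pipeline arguments unchanged.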

The following figure shows the flow of the use case execution:

- Classify scenes from a video stream coming through a camera source.
- Overlay classification labels using overlaylib.
- Display the results on a local display.

Figure: Pipeline for classification overlay

The figure shows three connected chains of elements:

- Source: filesrc → qtdemux → h264parse → v4l2h264dec → tee
- Display branch: tee → qtimetamux → qtivoverlay → waylandsink
- Inference branch: tee → qtimlvconverter → qtimlsnpe → qtimlpostprocess → qtimetamux

In the figure legend, qtimlvconverter, qtimlsnpe, qtimlpostprocess, qtimetamux, and qtivoverlay are marked as Qualcomm elements; the remaining elements are marked as open source.

The following table lists the processing stages of the pipeline in execution order:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70023-50/topic/qtiqmmfsrc.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_l2f_zgm_vbc"><li class="li">Collects the video stream (source). The tee element then creates two copies of the stream:<ul class="ul" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_m2f_zgm_vbc"><li class="li">One stream is sent to the qtimetamux plugin to retain the video stream.</li><li class="li">The other stream is sent to the ML inferencing branch of the pipeline.</li></ul></li></ol> |
| **Preprocessing** | **Preprocessing** |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlvconverter.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_xsf_q5l_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Performs preprocessing:<ul class="ul" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ul_ff2_twl_vbc"><li class="li">Color conversion</li><li class="li">Scaling up or down</li><li class="li">Normalization of the stream data when the model expects floating-point values as input</li></ul></li><li class="li">Converts the video stream to a tensor stream on its source pad.<p class="p">The classification model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | **Inferencing** |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlsnpe.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_bwn_s5l_vbc"><li class="li">Loads the model.</li><li class="li">Modifies the graph for the chosen delegate.</li><li class="li">Receives the tensor stream on its sink pad.</li><li class="li">Runs the inference and produces a tensor stream with the inference results on its source pad.</li></ol> |
| **Postprocessing** | **Postprocessing** |
| qtimlpostprocess | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_gr1_w5l_vbc"><li class="li">Receives the inference tensors from the model on its sink pad.</li><li class="li">Converts the tensors into formats such as video or text that the multimedia plugins can process later.</li><li class="li">Applies the threshold to the chosen number of results.</li><li class="li">Loads the corresponding modules of the classification models. <p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_rrb_1xl_vbc"><li class="li">Loads the submodule of the model.</li><li class="li">Produces results as structures of text.</li><li class="li">Sends them to the sink pad of qtimetamux.</li></ol></li></ol> |
| [qtimetamux](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimetamux.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_ll3_x5l_vbc"><li class="li">Receives the video stream and the text stream with the corresponding classification results on its sink pads.</li><li class="li">Produces GST buffers with the contents of the video stream from its sink pad.</li><li class="li">Adds the classification results from its data sink pad to the GST buffer metadata (meta muxing) on its source pad.</li></ol> |
| [qtivoverlay](https://docs.qualcomm.com/doc/80-70023-50/topic/qtioverlay.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_wst_y5l_vbc"><li class="li">Receives the multiplexed stream.</li><li class="li">Overlays the classification labels on the video frame using CL.</li><li class="li">Produces GST buffers with overlays on its source pad.</li></ol> |
| **Output** | **Output** |
| [waylandsink](https://docs.qualcomm.com/doc/80-70023-50/topic/waylandsink.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_fd3_wc5_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Submits the video stream to Weston.</li><li class="li">Weston renders the video stream and any classifications generated for the scene on a local display device.</li></ol> |
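
The thresholding step in the postprocessing stage can be pictured with a small shell sketch. This is not the plugin's implementation. It assumes scores on a 0 to 100 scale (to match `confidence: 40.0`), a plain `label score` text input with single-word labels, and a hypothetical `top_results` helper that keeps at most `results=2` entries at or above the threshold:

```shell
#!/bin/sh
# Illustrative only: keep at most $2 entries whose score is >= $1.
# Input on stdin: one "label score" pair per line; the percent scale and the
# sample labels below are assumptions, not output from the plugin.
top_results() {
    threshold=$1
    max_results=$2
    # Sort by score (descending), drop entries below the threshold,
    # then keep only the first max_results lines.
    sort -k2,2nr | awk -v t="$threshold" '$2 >= t' | head -n "$max_results"
}

printf '%s\n' 'seashore 62.5' 'pier 41.0' 'sandbar 12.3' 'lakeside 40.5' |
    top_results 40.0 2
# prints:
# seashore 62.5
# pier 41.0
```

Three of the four sample entries pass the threshold, but only the two highest are kept, which mirrors how `results=2` caps the number of labels shown.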

## Use qtivcomposer to mix original frame with classification mask

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1280,height=720,framerate=30/1 ! queue ! \
    tee name=split split. ! queue ! qtivcomposer name=mixer sink_1::position="<30, 30>" sink_1::dimensions="<320, 320>" ! queue ! waylandsink fullscreen=true \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/inceptionv3.dlc ! queue ! \
    qtimlpostprocess settings="{\"confidence\": 40.0}" results=2 module=mobilenet-softmax labels=/etc/labels/classification.json ! \
    video/x-raw,format=BGRA,width=640,height=360 ! queue ! mixer.

To stop the use case, use CTRL + C.
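
The `sink_1::position` and `sink_1::dimensions` values in the command place the classification mask in a window on top of the camera frame. A rough sanity check of that geometry, assuming the composed output has the same 1280x720 size as the camera stream (an assumption, since the command does not set the output size explicitly); `fits_in_frame` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical check: does a window at (x, y) with size (w, h) lie entirely
# inside a frame of frame_w x frame_h? Returns 0 when it fits.
fits_in_frame() {
    x=$1 y=$2 w=$3 h=$4 frame_w=$5 frame_h=$6
    [ "$x" -ge 0 ] && [ "$y" -ge 0 ] &&
        [ $((x + w)) -le "$frame_w" ] && [ $((y + h)) -le "$frame_h" ]
}

# Values from the command above: position <30, 30>, dimensions <320, 320>,
# checked against the 1280x720 camera stream (assumed output size).
if fits_in_frame 30 30 320 320 1280 720; then
    echo "label window fits"
else
    echo "label window is clipped"
fi
# prints: label window fits
```

The same check is useful when moving the window, for example toward the right edge, where `x + w` must stay at or below the frame width.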

The following figure shows the flow of the use case execution:

- Classify scenes from a video stream coming through a camera source.
- Compose classification labels and video stream together using qtivcomposer.
- Display the results on a local display.

Figure: Pipeline for classification using qtivcomposer

The figure shows three connected chains of elements:

- Source: filesrc → qtdemux → h264parse → v4l2h264dec → tee
- Display branch: tee → qtivcomposer → waylandsink
- Inference branch: tee → qtimlvconverter → qtimlsnpe → qtimlpostprocess → qtivcomposer

In the figure legend, qtimlvconverter, qtimlsnpe, qtimlpostprocess, and qtivcomposer are marked as Qualcomm elements; the remaining elements are marked as open source.

The following table lists the processing stages of the pipeline in execution order:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70023-50/topic/qtiqmmfsrc.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_x5l_jd5_vbc"><li class="li">Collects the video stream (source). The tee element then creates two copies of the stream:<ul class="ul" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ul_n44_nwl_vbc"><li class="li">One stream is sent to the qtivcomposer plugin to retain the video stream.</li><li class="li">The other stream is sent to the ML inferencing branch in the pipeline.</li></ul></li></ol> |
| **Preprocessing** | **Preprocessing** |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlvconverter.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_i5w_4wl_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Performs preprocessing:<ul class="ul" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_zdw_qwl_vbc"><li class="li">Color conversion</li><li class="li">Scaling up or down</li><li class="li">Normalization of the stream data when the model expects floating-point values as input</li></ul></li><li class="li">Converts the video stream to a tensor stream on its source pad.<p class="p">The classification model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | **Inferencing** |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlsnpe.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_u1l_cxl_vbc"><li class="li">Loads the model.</li><li class="li">Modifies the graph for the chosen delegate.</li><li class="li">Receives the tensor stream on its sink pad.</li><li class="li">Runs the inference and produces a tensor stream with the inference results on its source pad.</li></ol> |
| **Postprocessing** | **Postprocessing** |
| qtimlpostprocess | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_o3v_2xl_vbc"><li class="li">Receives the inference results from the model on its sink pad.</li><li class="li">Converts the inference tensors into formats such as video or text that the multimedia plugins can process later.</li><li class="li">Applies the threshold to the chosen number of results.</li><li class="li">Loads the corresponding modules for the classification models. <p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_p3v_2xl_vbc"><li class="li">Loads the submodule of the model.</li><li class="li">Produces results as video frames with classification labels.</li><li class="li">Sends them to the sink pad of qtivcomposer.</li></ol></li></ol> |
| [qtivcomposer](https://docs.qualcomm.com/doc/80-70023-50/topic/qtivcomposer.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_nmc_lxl_vbc"><li class="li">Receives the original video stream and the video stream with classification results on its sink pads.</li><li class="li">On its source pad, produces GST buffers with contents composed of the video streams from its sink pads.</li></ol> |
| **Output** | **Output** |
| [waylandsink](https://docs.qualcomm.com/doc/80-70023-50/topic/waylandsink.html) | <ol class="ol" id="single-camera-stream-with-image-classification-and-display-with-mobilenet-v1__ol_cgt_mwl_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Submits the video stream to Weston.</li><li class="li">Weston renders the video stream and any classifications generated for the scene on a local display device.</li></ol> |
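
The normalization step listed under preprocessing can be illustrated with a few lines of shell. This is a sketch under stated assumptions rather than what qtimlvconverter does internally: it maps 8-bit pixel values to floats in the range 0 to 1 by dividing by 255, which is one common convention. The mean and scale that a given model expects may differ, and `normalize_pixels` is a hypothetical helper:

```shell
#!/bin/sh
# Illustrative only: map 8-bit pixel values (0-255) to floats in [0, 1].
# The divisor 255 is an assumed convention; a real model may expect a
# different mean and scale.
normalize_pixels() {
    for value in "$@"; do
        awk -v v="$value" 'BEGIN { printf "%.3f\n", v / 255 }'
    done
}

normalize_pixels 0 51 255
# prints:
# 0.000
# 0.200
# 1.000
```

A quantized model that takes integer input would skip this step, which is why the table describes normalization as conditional on the model expecting floating-point values.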

Last Published: Mar 27, 2026
