# Image segmentation and display with Neural Processing SDK

Source: [https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized.html](https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized.html)

This use case runs the DeepLab v3 model on the Qualcomm Neural Processing SDK runtime to produce semantic segmentation masks for a scene captured from a camera stream. It then uses qtivcomposer to blend the masks with the video stream and displays the result.

## Use qtivcomposer to mix original frame with segmentation mask

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 \
    qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1920,height=1080,framerate=30/1 ! queue ! tee name=split  \
    split. ! queue ! qtivcomposer name=mixer sink_1::dimensions="<1920,1080>" sink_1::alpha=0.5 ! queue ! waylandsink fullscreen=true \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/deeplabv3_plus_mobilenet.dlc ! queue ! \
    qtimlpostprocess module=deeplab-argmax labels=/etc/labels/deeplabv3_resnet50.json ! video/x-raw,width=640,height=360 ! queue ! mixer.

To stop the use case, press Ctrl+C.
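For repeated runs, it can help to keep the model path, label file, and camera resolution in variables. The following is a minimal sketch and not part of the SDK: the names `MODEL`, `LABELS`, `WIDTH`, `HEIGHT`, `RUN`, `build_pipeline`, and `check_inputs` are hypothetical, and the pipeline string is copied from the command above. By default the script only prints the pipeline description; setting `RUN=1` on the target device launches it.

```shell
#!/bin/sh
# Hypothetical wrapper: all variable and function names are illustrative.
MODEL="${MODEL:-/etc/models/deeplabv3_plus_mobilenet.dlc}"
LABELS="${LABELS:-/etc/labels/deeplabv3_resnet50.json}"
WIDTH="${WIDTH:-1920}"
HEIGHT="${HEIGHT:-1080}"

# Print the pipeline description on one line; nothing is launched here.
build_pipeline() {
    printf '%s ' \
        "qtiqmmfsrc name=camsrc !" \
        "video/x-raw,format=NV12_Q08C,width=${WIDTH},height=${HEIGHT},framerate=30/1 !" \
        "queue ! tee name=split" \
        "split. ! queue ! qtivcomposer name=mixer" \
        "sink_1::dimensions=<${WIDTH},${HEIGHT}> sink_1::alpha=0.5 !" \
        "queue ! waylandsink fullscreen=true" \
        "split. ! queue ! qtimlvconverter ! queue !" \
        "qtimlsnpe delegate=dsp model=${MODEL} ! queue !" \
        "qtimlpostprocess module=deeplab-argmax labels=${LABELS} !" \
        "video/x-raw,width=640,height=360 ! queue ! mixer."
    printf '\n'
}

# Fail early, naming the first missing file, before GStreamer starts.
check_inputs() {
    for f in "$MODEL" "$LABELS"; do
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
    done
}

# Dry run by default; set RUN=1 on the target device to launch.
if [ "${RUN:-0}" = "1" ]; then
    # Unquoted on purpose: the description is split into gst-launch arguments.
    check_inputs && gst-launch-1.0 -e --gst-debug=2 $(build_pipeline)
else
    build_pipeline
fi
```

Because the composer dimensions are derived from `WIDTH` and `HEIGHT`, the mask overlay on `sink_1` keeps covering the full camera frame if the capture resolution is changed.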

The following figure shows the flow of the use case execution:

1. Identify scenes from a video stream coming through a camera source.
2. Compose semantic segmentation and video stream using qtivcomposer.
3. Display the results.

Figure: Pipeline for segmentation with qtivcomposer

Main path: filesrc -> qtdemux -> h264parse -> v4l2h264dec -> tee -> qtivcomposer -> waylandsink

Inference branch: tee -> qtimlvconverter -> qtimlsnpe -> qtimlpostprocess -> qtivcomposer

Legend: qtimlvconverter, qtimlsnpe, qtimlpostprocess, and qtivcomposer are Qualcomm plugins; the remaining elements are open source. The figure draws a file-based source (filesrc through v4l2h264dec) ahead of tee, whereas the command above uses the qtiqmmfsrc camera source in that position.

The following table lists the processing stages of the pipeline in order of execution:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70023-50/topic/qtiqmmfsrc.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_f5k_g5n_vbc"><li class="li">Captures the video stream from the camera. The tee element then splits it into two branches:<ul class="ul" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ul_n44_nwl_vbc"><li class="li">One stream is sent to the qtivcomposer plugin to retain the video stream.</li><li class="li">The other stream is sent to the ML inferencing branch in the pipeline.</li></ul></li></ol> |
| **Preprocessing** | |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlvconverter.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_xsf_q5l_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Performs preprocessing:<ul class="ul" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ul_ff2_twl_vbc"><li class="li">Color conversion</li><li class="li">Scaling (down or up)</li><li class="li">Normalization of the stream data when the model expects floating-point input</li></ul></li><li class="li">Converts the video stream to a tensor stream on its source pad.<p class="p">The segmentation model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlsnpe.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_lfr_35n_vbc"><li class="li">Loads the segmentation model.</li><li class="li">Modifies the graph for the chosen delegate.</li><li class="li">Receives the tensor stream on its sink pad.</li><li class="li">Runs the inference and produces a tensor stream with the segmentation results on its source pad.</li></ol> |
| **Postprocessing** | |
| qtimlpostprocess | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_mtr_k5n_vbc"><li class="li">Receives the inference tensors on its sink pad.</li><li class="li">Converts the inference tensors into video formats that the downstream multimedia plugins can process.</li><li class="li">Produces the semantic segmentations for the frame.</li><li class="li">Loads the corresponding modules for the segmentation models.<p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_ntr_k5n_vbc"><li class="li">Loads the deeplab-argmax submodule.</li><li class="li">Produces video frames with segmentation masks.</li><li class="li">Sends them to the sink pad of qtivcomposer.</li></ol></li></ol> |
| [qtivcomposer](https://docs.qualcomm.com/doc/80-70023-50/topic/qtivcomposer.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_nmc_lxl_vbc"><li class="li">Receives the original video stream and the segmentation mask on its sink pads.</li><li class="li">On its source pad, produces GStreamer buffers composed of the video streams from its sink pads.</li></ol> |
| **Output** | |
| [waylandsink](https://docs.qualcomm.com/doc/80-70023-50/topic/waylandsink.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_qqc_c5n_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Submits the video stream to Weston.</li><li class="li">Weston displays the following on the local display device:<ul class="ul" id="single-camera-stream-with-image-segmentation-and-display-with-deeplabv3-quantized__ol_cjl_r5n_vbc"><li class="li">The video stream captured from the camera.</li><li class="li">The segmentation masks drawn over the objects and components in that scene.</li></ul></li></ol> |
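Before launching, it may be useful to confirm that every element named in the table is registered on the target. The sketch below assumes the standard `gst-inspect-1.0` tool and its `--exists` option are available on the device image; `GST_INSPECT`, `ELEMENTS`, and `missing_elements` are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# GST_INSPECT is a hypothetical override (useful for testing); it defaults to the real tool.
GST_INSPECT="${GST_INSPECT:-gst-inspect-1.0}"

# Elements named in the table, in pipeline order.
ELEMENTS="qtiqmmfsrc qtimlvconverter qtimlsnpe qtimlpostprocess qtivcomposer waylandsink"

# Print one line per element that the inspector reports as unavailable.
missing_elements() {
    for e in $ELEMENTS; do
        "$GST_INSPECT" --exists "$e" >/dev/null 2>&1 || echo "$e"
    done
}

missing=$(missing_elements)
if [ -n "$missing" ]; then
    # Unquoted on purpose: collapse the newline-separated names onto one line.
    echo "missing elements:" $missing
else
    echo "all pipeline elements found"
fi
```

If an element is reported missing, the pipeline would fail at parse time, so this check separates installation problems from model or camera problems.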


Last Published: Mar 27, 2026
