# Image segmentation and encode with Neural Processing SDK

Source: [https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized.html](https://docs.qualcomm.com/doc/80-70023-50/topic/single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized.html)

This use case runs the DeepLab v3 model with the Qualcomm Neural Processing SDK runtime. It composes the semantic segmentation masks with the original video stream, encodes the composed stream, and then multiplexes it into an MP4 container.

Note: On Ubuntu Server, `sudo` access is required to write the encoded stream to the `/etc/media` folder.
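Whether `sudo` is needed depends on the permissions of the output folder for the current user. The following is a minimal sketch of a pre-flight check; the helper name `can_write_dir` and the `OUT_DIR` variable are hypothetical and are not part of the SDK.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory exists and the
# current user has write permission on it.
can_write_dir() {
    dir="$1"
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Output folder used by the pipeline in this topic; override via OUT_DIR.
OUT_DIR="${OUT_DIR:-/etc/media}"

if can_write_dir "$OUT_DIR"; then
    echo "writable: $OUT_DIR"
else
    echo "not writable: $OUT_DIR (use sudo or choose another location)"
fi
```

If the check reports that the folder is not writable, either run the pipeline with `sudo` or point `filesink location=` to a folder that the current user owns.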

Run the use case on the target device:

    gst-launch-1.0 -e --gst-debug=2 \
    qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1920,height=1080,framerate=30/1 ! queue ! tee name=split  \
    split. ! queue ! qtivcomposer name=mixer sink_1::dimensions="<1920,1080>" sink_1::alpha=0.5 ! queue ! v4l2h264enc capture-io-mode=4 output-io-mode=5 ! \
    h264parse ! queue ! mp4mux ! queue ! filesink location=/etc/media/video.mp4 \
    split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=/etc/models/deeplabv3_plus_mobilenet.dlc ! queue ! \
    qtimlpostprocess module=deeplab-argmax labels=/etc/labels/deeplabv3_resnet50.json ! video/x-raw,width=640,height=360 ! queue ! mixer.

To stop the use case, press Ctrl+C.
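The command passes `-e`, which makes `gst-launch-1.0` send an end-of-stream event before shutting down, so that `mp4mux` can finalize the file when the pipeline is interrupted. For an unattended recording of fixed length, the interrupt can be sent by a timer instead of the keyboard. The following is a minimal sketch, assuming the `timeout` utility is available on the target; the helper name `run_for` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command and send it SIGINT (the signal that
# Ctrl+C delivers) after the given number of seconds.
# Usage: run_for <seconds> <command> [args...]
run_for() {
    seconds="$1"
    shift
    timeout -s INT "$seconds" "$@"
}
```

For example, `run_for 30 gst-launch-1.0 -e ...` with the pipeline arguments from the command above would record for about 30 seconds. `timeout` returns exit status 124 when the time limit is reached, so that status does not indicate a pipeline failure here.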

The use case executes the following steps, as shown in the figure below:

1. Identify scenes in the video stream that comes from the camera source.
2. Compose the semantic segmentation and the video stream using qtivcomposer.
3. Encode the stream as an H.264 bitstream and multiplex it into an MP4 container.
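The three steps map onto the three parts of the pipeline description. The following sketch rebuilds the description from the command above with the model, label, and output paths as variables, and prints it without running it. The function names and the `MODEL`, `LABELS`, and `OUTPUT` variables are hypothetical; the element names and properties are copied from the command in this topic.

```shell
#!/bin/sh
# Defaults copied from the command in this topic; override via environment.
MODEL="${MODEL:-/etc/models/deeplabv3_plus_mobilenet.dlc}"
LABELS="${LABELS:-/etc/labels/deeplabv3_resnet50.json}"
OUTPUT="${OUTPUT:-/etc/media/video.mp4}"

# Camera source, split into two branches by tee.
source_part() {
    printf '%s' 'qtiqmmfsrc name=camsrc ! video/x-raw,format=NV12_Q08C,width=1920,height=1080,framerate=30/1 ! queue ! tee name=split'
}

# Step 1: inference branch that produces the segmentation mask stream.
inference_branch() {
    printf '%s' "split. ! queue ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp model=$MODEL ! queue ! qtimlpostprocess module=deeplab-argmax labels=$LABELS ! video/x-raw,width=640,height=360 ! queue ! mixer."
}

# Steps 2 and 3: compose with qtivcomposer, encode as H.264, mux into MP4.
compose_encode_branch() {
    printf '%s' "split. ! queue ! qtivcomposer name=mixer sink_1::dimensions=\"<1920,1080>\" sink_1::alpha=0.5 ! queue ! v4l2h264enc capture-io-mode=4 output-io-mode=5 ! h264parse ! queue ! mp4mux ! queue ! filesink location=$OUTPUT"
}

# Print the full pipeline description on one line (dry run).
pipeline_description() {
    printf '%s %s %s\n' "$(source_part)" "$(compose_encode_branch)" "$(inference_branch)"
}

pipeline_description
```

To run the printed description, one option is `eval "gst-launch-1.0 -e --gst-debug=2 $(pipeline_description)"`, which lets the shell strip the quotes around `<1920,1080>` as it does in the original command. This assumes that the paths contain no spaces or shell metacharacters.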

Figure: Pipeline for image segmentation and encode with qtivcomposer

The figure shows the following element chains. Its legend marks qtimlvconverter, qtimlsnpe, qtimlpostprocess, and qtivcomposer as Qualcomm plugins and all other elements as open source.

- Source (as drawn): filesrc → qtdemux → h264parse → v4l2h264dec → tee
- Composition and encode branch: tee → qtivcomposer → v4l2h264enc → h264parse → mp4mux → filesink
- Inference branch: tee → qtimlvconverter → qtimlsnpe → qtimlpostprocess → qtivcomposer

The following table describes the sequential processing stages of the pipeline:

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70023-50/topic/qtiqmmfsrc.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_f5k_g5n_vbc"><li class="li">Captures the video stream from the camera (source). The tee element that follows creates two copies of the stream:<ul class="ul" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ul_n44_nwl_vbc"><li class="li">One stream is sent to the qtivcomposer plugin to retain the video stream.</li><li class="li">The other stream is sent to the ML inferencing branch of the pipeline.</li></ul></li></ol> |
| **Preprocessing** | |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlvconverter.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_xsf_q5l_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Performs preprocessing:<ul class="ul" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ul_ff2_twl_vbc"><li class="li">Color conversion</li><li class="li">Scaling (down or up)</li><li class="li">Normalization of the stream data when the model expects floating-point values as input</li></ul></li><li class="li">Converts the video stream to a tensor stream on its source pad.<p class="p">The segmentation model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** | |
| [qtimlsnpe](https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlsnpe.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_lfr_35n_vbc"><li class="li">Loads the segmentation model.</li><li class="li">Modifies the graph for the chosen delegate.</li><li class="li">Receives the tensor stream on its sink pad.</li><li class="li">Runs the inference and produces a tensor stream with the segmentation results on its source pad.</li></ol> |
| **Postprocessing** | |
| qtimlpostprocess | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_mtr_k5n_vbc"><li class="li">Receives the inference tensors on its sink pad.</li><li class="li">Converts the inference tensors into video formats that the multimedia plugins can process downstream.</li><li class="li">Produces the semantic segmentations for the frame.</li><li class="li">Loads the corresponding module for the segmentation model.<p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_ntr_k5n_vbc"><li class="li">Loads the deeplab-argmax submodule.</li><li class="li">Produces video frames with segmentation masks.</li><li class="li">Sends them to a sink pad of qtivcomposer.</li></ol></li></ol> |
| [qtivcomposer](https://docs.qualcomm.com/doc/80-70023-50/topic/qtivcomposer.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_nmc_lxl_vbc"><li class="li">Receives the original video stream and the segmentation mask stream on its sink pads.</li><li class="li">Produces GStreamer buffers on its source pad whose contents are composed from the video streams on its sink pads.</li></ol> |
| [v4l2h264enc](https://docs.qualcomm.com/doc/80-70023-50/topic/v4l2h264enc.html) | <ol class="ol" id="single-camera-stream-with-image-segmentation-and-encode-with-deeplabv3-quantized__ol_wsc_bsn_vbc"><li class="li">Applies the configured encoder parameters to each frame of the video stream that it receives on its sink pad.</li><li class="li">Encodes the frames into an H.264 bitstream and sends the bitstream over its source pad.</li></ol> |
| h264parse | Parses the H.264 bitstream and adds information about it to the GStreamer buffer metadata. |
| mp4mux | Receives these buffers and multiplexes them into buffers that follow the MP4 container format. |
| **Output** | |
| filesink | Stores the resulting stream in the /etc/media/video.mp4 file. |
| Playback | From the host computer, pull video.mp4 from the target device and play it on a media player:<br>`scp root@<IP address of target device>:/etc/media/video.mp4 <destination directory>` |
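After copying the file to the host, a quick sanity check can catch an empty or truncated transfer before the file is opened in a media player. The following is a minimal sketch with a hypothetical helper, `looks_like_mp4`. It relies on the fact that an MP4 file typically starts with a file-type box, so bytes 5 to 8 of the file spell `ftyp`. Passing this check does not prove that the file is playable.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file is non-empty and its first box
# is the MP4 file-type box, that is, bytes 5-8 are the ASCII string "ftyp".
looks_like_mp4() {
    file="$1"
    [ -s "$file" ] || return 1
    # Skip the 4-byte box size, then read the 4-byte box type.
    tag=$(dd if="$file" bs=1 skip=4 count=4 2>/dev/null)
    [ "$tag" = "ftyp" ]
}
```

For example, `looks_like_mp4 video.mp4 && echo "header OK"` can be run in the destination directory after the `scp` command completes.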

Last Published: Mar 27, 2026
