# Object detection and classification 

Source: [https://docs.qualcomm.com/doc/80-70022-50/topic/camera-ai-detection-overlay-composer-display.html](https://docs.qualcomm.com/doc/80-70022-50/topic/camera-ai-detection-overlay-composer-display.html)

The **gst-camera-two-stream-detection-and-classification-side-by-side.py** application uses a YOLOX LiteRT model to detect objects and an `inception_v3` LiteRT model to classify objects in the scene displayed by the AI overlay composer.

Figure: Pipeline for object detection and classification

The figure shows two parallel branches, one for each camera stream. In each branch:

- Video path: qtiqmmfsrc → tee → qtimetamux → qtioverlay → qtivcomposer
- Inference path: tee → qtimlvconverter → qtimltflite → qtimlpostprocess → qtimetamux

One branch connects to `sink_0` of qtivcomposer and the other to `sink_1`. qtivcomposer sends the composed output to Waylandsink, which renders it on the display. In the figure, tee and Waylandsink are marked as open source; all other elements are marked as Qualcomm.

For information about the plugins used in this pipeline, see [Pipeline flow](https://docs.qualcomm.com/doc/80-70022-50/topic/camera-ai-detection-overlay-composer-display.html#camera-ai-detection-overlay-composer-display__section_l1h_cpk_bdc).
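
The topology in the figure can also be written down as a list of links. The following is a minimal sketch that models the links as drawn in the figure; it uses only the element names shown there, the helper names (`branch_links`, `full_pipeline`) are hypothetical, and it does not build a real GStreamer pipeline.

```python
# Element names come from the pipeline figure; everything else is illustrative.
INFERENCE_CHAIN = ["qtimlvconverter", "qtimltflite", "qtimlpostprocess"]


def branch_links(composer_pad):
    """Return (source, destination) links for one camera branch."""
    # Video path: the tee passes the camera stream straight to qtimetamux.
    links = [("qtiqmmfsrc", "tee"), ("tee", "qtimetamux")]
    # Inference path: tee -> converter -> inference -> postprocess -> qtimetamux.
    chain = ["tee", *INFERENCE_CHAIN, "qtimetamux"]
    links += list(zip(chain, chain[1:]))
    # The muxed stream is overlaid and handed to one composer sink pad.
    links += [
        ("qtimetamux", "qtioverlay"),
        ("qtioverlay", f"qtivcomposer.{composer_pad}"),
    ]
    return links


def full_pipeline():
    """Both branches plus the shared composer-to-display link."""
    links = branch_links("sink_0") + branch_links("sink_1")
    # Shown as "Waylandsink" in the figure.
    links.append(("qtivcomposer", "waylandsink"))
    return links


if __name__ == "__main__":
    for src, dst in full_pipeline():
        print(f"{src} -> {dst}")
```

Listing the links this way makes it easy to see that qtimetamux has two inputs in each branch: the video stream from the tee and the results from qtimlpostprocess.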

## Model files

Table: Models used for detection and classification

| Purpose | LiteRT model | Description |
| :--- | :--- | :--- |
| Object detection | YOLOX | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ul_cfw_r4k_bdc"><li class="li">Identifies the objects in a scene from a camera stream.</li><li class="li">Overlays bounding boxes on the detected objects.</li></ol> |
| Image classification | InceptionV3 | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_jll_v4k_bdc"><li class="li">Classifies a scene from a camera stream.</li><li class="li">Overlays the classification labels on the screen.</li></ol> |

## Run the application on the target device

1. Ensure that you complete the [Prerequisites](https://docs.qualcomm.com/doc/80-70022-50/topic/prerequisites-for-python-sample-applications.html).
2. Run the detection and classification script on the target device:

        gst-camera-two-stream-detection-and-classification-side-by-side.py
3. To display the available help options, run the following command:

        gst-camera-two-stream-detection-and-classification-side-by-side.py -h
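
If you want to start the script from another Python program, a small wrapper such as the following could work. It is a sketch only: `build_command` and `run_app` are hypothetical helper names, and the only option it assumes is the `-h` flag shown in the steps above.

```python
import shutil
import subprocess

SCRIPT = "gst-camera-two-stream-detection-and-classification-side-by-side.py"


def build_command(show_help=False):
    """Return the argument list for the sample application."""
    command = [SCRIPT]
    if show_help:
        command.append("-h")  # the only option documented in this section
    return command


def run_app(show_help=False):
    """Run the script if it is on PATH; return its exit code, or None if missing."""
    if shutil.which(SCRIPT) is None:
        return None  # not on a target device, or the script is not installed
    return subprocess.run(build_command(show_help), check=False).returncode


if __name__ == "__main__":
    print(run_app(show_help=True))
```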

Table: Default paths of the model and label files for the object detection and classification Python application

| Model and label files | Default path |
| :--- | :--- |
| Detection model | `/etc/models/yolox_quantized.tflite` |
| Detection labels | `/etc/labels/yolox.json` |
| Classification model | `/etc/models/inception_v3_quantized.tflite` |
| Classification labels | `/etc/labels/classification.json` |
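
Before running the application, it can help to confirm that these files are present. The following is a minimal sketch that checks the default paths from the table; the `missing_files` helper and its `root` parameter (useful for testing against a staging directory) are hypothetical and not part of the sample application.

```python
from pathlib import Path

# Default locations listed in the table above.
DEFAULT_FILES = {
    "Detection model": "/etc/models/yolox_quantized.tflite",
    "Detection labels": "/etc/labels/yolox.json",
    "Classification model": "/etc/models/inception_v3_quantized.tflite",
    "Classification labels": "/etc/labels/classification.json",
}


def missing_files(root="/"):
    """Return the names of default files that are not present under root."""
    base = Path(root)
    return [
        name
        for name, path in DEFAULT_FILES.items()
        if not (base / path.lstrip("/")).is_file()
    ]


if __name__ == "__main__":
    for name in missing_files():
        print(f"Missing: {name} ({DEFAULT_FILES[name]})")
```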

## Expected output

The two output streams, one with detection bounding boxes and one with classification labels, are shown side by side on the display.
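
As a rough illustration of a side-by-side layout, the following sketch splits a display into two tiles, one for each composer input (`sink_0` and `sink_1` in the pipeline figure). The 1920 × 1080 display size, the choice of which input goes on the left, and the `side_by_side` helper are assumptions for the example, not values taken from the application.

```python
def side_by_side(display_width, display_height):
    """Return an (x, y, width, height) tile for each composer input."""
    half = display_width // 2
    return {
        # Left tile (assumed to be sink_0 for this example).
        "sink_0": (0, 0, half, display_height),
        # Right tile takes the remaining width, so odd widths are fully covered.
        "sink_1": (half, 0, display_width - half, display_height),
    }


if __name__ == "__main__":
    for pad, tile in side_by_side(1920, 1080).items():
        print(pad, tile)
```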

## Pipeline flow

| Process | Description |
| --- | --- |
| [qtiqmmfsrc](https://docs.qualcomm.com/doc/80-70022-50/topic/qtiqmmfsrc.html) | Collects two video streams from the camera:<ul class="ul"><li class="li">The stream for detection is split using tee and sent to the following:<ul class="ul" id="camera-ai-detection-overlay-composer-display__ul_elf_dtk_bdc"><li class="li">qtimetamux to retain the video stream.</li><li class="li">qtimlvconverter to convert the video stream to input tensors for the detection inference.</li></ul></li><li class="li">The stream for classification is split using tee and sent to the following:<ul class="ul" id="camera-ai-detection-overlay-composer-display__ul_nsj_htk_bdc"><li class="li">qtimetamux to retain the video stream.</li><li class="li">qtimlvconverter to convert the video stream to input tensors for the classification inference.</li></ul></li></ul> |
| **Preprocessing** |  |
| [qtimlvconverter](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimlvconverter.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_i5w_4wl_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Performs preprocessing:<ul class="ul" id="camera-ai-detection-overlay-composer-display__ol_zdw_qwl_vbc"><li class="li">Color conversion</li><li class="li">Scaling up or down</li><li class="li">Normalization of the stream data when the model expects floating-point values as input</li></ul></li><li class="li">Converts the video stream to a tensor stream on its source pad.<p class="p">The detection or classification model uses this tensor stream for inferencing.</p></li></ol> |
| **Inferencing** |  |
| [qtimltflite](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimltflite.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_u1l_cxl_vbc"><li class="li">Loads the model.</li><li class="li">Modifies the graph for the chosen delegate.</li><li class="li">Receives the tensor stream on its sink pad.</li><li class="li">Runs the inference and produces a tensor stream with the inference results on its source pad.</li></ol> |
| **Postprocessing** |  |
| qtimlpostprocess (detection) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_ky5_grn_vbc"><li class="li">Receives the inference tensors from the object detection model.</li><li class="li">Converts the inference tensors on its sink pad into formats such as video or text that the multimedia plugins can process later.</li><li class="li">Applies the threshold to the chosen number of results.</li><li class="li">Loads the corresponding modules for detection models.<p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="camera-ai-detection-overlay-composer-display__ol_jcd_wnk_5bc"><li class="li">Loads the YOLOv8 submodule.</li><li class="li">Produces the results as structured text.</li><li class="li">Sends them to the sink pad of qtimetamux.</li></ol></li></ol> |
| qtimlpostprocess (classification) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_o3v_2xl_vbc"><li class="li">Receives the inference results from a classification model on its sink pad.</li><li class="li">Converts the inference tensors into formats such as video or text that the multimedia plugins can process later.</li><li class="li">Applies the threshold to the chosen number of results.</li><li class="li">Loads the corresponding modules for the classification models.<p class="p">In this use case, qtimlpostprocess does the following:</p><ol class="ol" type="a" id="camera-ai-detection-overlay-composer-display__ol_p3v_2xl_vbc"><li class="li">Loads the submodule of the model.</li><li class="li">Produces the results as video frames with classification labels.</li><li class="li">Sends them to the sink pad of qtivcomposer.</li></ol></li></ol> |
| [qtimetamux](https://docs.qualcomm.com/doc/80-70022-50/topic/qtimetamux.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_ll3_x5l_vbc"><li class="li">Receives the video stream and the text stream with the corresponding bounding box results on its sink pads.</li><li class="li">Produces GST buffers with the contents of the video stream from its sink pad.</li><li class="li">Adds the bounding boxes as GstVideoRegionOfInterest from the data sink pad to the GST buffer metadata (meta muxing) on its source pad.</li></ol> |
| [qtivoverlay](https://docs.qualcomm.com/doc/80-70022-50/topic/qtioverlay.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_wst_y5l_vbc"><li class="li">Receives the multiplexed stream.</li><li class="li">Overlays the bounding boxes on the VideoFrame using CL.</li><li class="li">Produces GST buffers with overlays on its source pad.</li></ol> |
| [qtivcomposer](https://docs.qualcomm.com/doc/80-70022-50/topic/qtivcomposer.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_nmc_lxl_vbc"><li class="li">Receives the original video stream with classification results on its sink pads.</li><li class="li">On its source pad, produces GST buffers with contents composed of the video streams from its sink pads.</li></ol> |
| **Output** |  |
| [Waylandsink](https://docs.qualcomm.com/doc/80-70022-50/topic/waylandsink.html) | <ol class="ol" id="camera-ai-detection-overlay-composer-display__ol_cgt_mwl_vbc"><li class="li">Receives the video stream on its sink pad.</li><li class="li">Submits the video stream to Weston.</li><li class="li">Weston renders the video stream on a local display device.</li></ol> |
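
The normalization and thresholding steps in the table can be illustrated in plain Python. This is a sketch of the general techniques only, not the plugin code: the function names, the divide-by-255 normalization, the 0.5 threshold, and the sample results are all assumptions made for the example.

```python
def normalize(pixels, scale=255.0):
    """Map 8-bit pixel values to floats in [0.0, 1.0] (one common convention)."""
    return [value / scale for value in pixels]


def filter_results(results, threshold, max_results):
    """Keep results at or above threshold, best first, capped at max_results."""
    kept = [r for r in results if r["confidence"] >= threshold]
    kept.sort(key=lambda r: r["confidence"], reverse=True)
    return kept[:max_results]


if __name__ == "__main__":
    print(normalize([0, 128, 255]))
    # Made-up results, for illustration only.
    sample = [
        {"label": "person", "confidence": 0.91},
        {"label": "dog", "confidence": 0.42},
        {"label": "chair", "confidence": 0.77},
    ]
    print(filter_results(sample, threshold=0.5, max_results=2))
```

With the sample data, `filter_results` drops the low-confidence entry and returns the remaining two in descending order of confidence.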

## Known issues

- The output labels are blurred.
- For the camera use case, use a camera that supports 640 × 360 resolution.

## Related information

- [Object detection](https://docs.qualcomm.com/doc/80-70022-50/topic/gst-ai-object-detection.html)
- [Image classification](https://docs.qualcomm.com/doc/80-70022-50/topic/gst-ai-classification.html)

**Parent Topic:** [Run Python-based applications](https://docs.qualcomm.com/doc/80-70022-50/topic/python-sample-applications.html)

Last Published: Feb 20, 2026
