# Object detection using USB camera source

Source: [https://docs.qualcomm.com/doc/80-70023-50/topic/object-detection-using-usb-camera-source.html](https://docs.qualcomm.com/doc/80-70023-50/topic/object-detection-using-usb-camera-source.html)

This use case streams video from a USB webcam connected to the Qualcomm EVK, runs object
detection on the stream, and previews the results on the local display. The webcam must be
accessible as a `/dev/videoX` device.

Note: For USB camera input, set the `video-format`, `resolution`, and `framerate` parameters in the
configuration file to match the capabilities of the camera. To check the camera
capabilities, see [Configure USB camera](https://docs.qualcomm.com/bundle/publicresource/topics/80-70023-8/usb.html#configure-usb-camera).
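
As a quick pre-flight check before launching a pipeline, the sketch below confirms that the device node exists and, when possible, lists the formats the camera advertises. The function name `check_camera` is hypothetical, and the use of `v4l2-ctl` assumes that the v4l-utils package is installed on the target, which this page does not state.

```shell
#!/bin/sh
# Hypothetical helper: report whether a V4L2 device node is usable.
# Pass the node your webcam enumerates as, for example /dev/video2.
check_camera() {
    dev="$1"
    if [ ! -c "$dev" ]; then
        echo "missing: $dev is not a character device"
        return 1
    fi
    echo "found: $dev"
    # Assumption: v4l2-ctl (from v4l-utils) is available on the target.
    # It prints the supported pixel formats, frame sizes, and frame rates.
    if command -v v4l2-ctl >/dev/null 2>&1; then
        v4l2-ctl --device="$dev" --list-formats-ext
    fi
    return 0
}
```

Calling `check_camera /dev/video2` on the target shows which of the formats used on this page (MJPEG, YUY2, NV12) the camera can deliver, and at which resolutions and frame rates.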

Run one of the following commands on the target device, depending on the video format that the camera outputs. Replace `/dev/video2` with the device node of your camera.
- MJPEG video format:

        gst-launch-1.0 -v -e --gst-debug=2 v4l2src device="/dev/video2" ! image/jpeg,width=1920,height=1080,framerate=30/1 ! jpegdec ! \
        videoconvert ! video/x-raw,format=NV12 ! qtivtransform ! queue ! tee name=split ! queue ! qtivcomposer name=mixer ! queue ! \
        fpsdisplaysink sync=true text-overlay=true video-sink="waylandsink sync=true fullscreen=true" split. ! queue ! qtimlvconverter ! queue ! \
        qtimltflite delegate=external external-delegate-path=libQnnTFLiteDelegate.so external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
        model=/etc/models/yolox_quantized.tflite ! queue ! qtimlpostprocess settings="{\"confidence\": 75.0}" results=10 module=yolov8 labels=/etc/labels/yolox.json \
        ! video/x-raw,format=BGRA,width=640,height=360 ! queue ! mixer.
- YUY2 video format:

        gst-launch-1.0 -v -e --gst-debug=2 v4l2src io-mode=4 device="/dev/video2" ! \
        video/x-raw,format=YUY2,width=640,height=480,framerate=30/1 ! queue ! tee name=split ! queue ! \
        qtivcomposer name=mixer ! queue ! fpsdisplaysink sync=true text-overlay=true video-sink="waylandsink sync=true fullscreen=true" split. ! \
        queue ! qtimlvconverter ! queue ! qtimltflite delegate=external external-delegate-path=libQnnTFLiteDelegate.so \
        external-delegate-options="QNNExternalDelegate,backend_type=htp;" model=/etc/models/yolox_quantized.tflite ! queue ! \
        qtimlpostprocess settings="{\"confidence\": 75.0}" results=10 module=yolov8 labels=/etc/labels/yolox.json \
        ! video/x-raw,format=BGRA,width=640,height=360 ! queue ! mixer.
- NV12 video format:

        gst-launch-1.0 -v -e --gst-debug=2 v4l2src io-mode=4 device="/dev/video2" ! video/x-raw,format=NV12,width=640,height=480,framerate=30/1 ! queue ! tee name=split split. ! \
        queue ! qtivcomposer name=mixer ! queue ! waylandsink fullscreen=true split. ! queue ! qtimlvconverter ! queue ! \
        qtimltflite delegate=external external-delegate-path=libQnnTFLiteDelegate.so external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
        model=/etc/models/yolox_quantized.tflite ! queue ! qtimlpostprocess settings="{\"confidence\": 75.0}" results=10 module=yolov8 labels=/etc/labels/yolox.json \
        ! video/x-raw,format=BGRA,width=640,height=360 ! queue ! mixer.
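
The capture segment is the main difference between the three commands above: MJPEG frames are decoded and converted to NV12 before the tee, while YUY2 and NV12 frames are passed through with `io-mode=4`. The sketch below captures that choice in a small helper. The function name `source_segment` and its argument order are hypothetical; the element strings are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the capture segment of the pipeline for one camera format.
# Usage: source_segment FORMAT DEVICE WIDTH HEIGHT FPS
source_segment() {
    format="$1"; device="$2"; width="$3"; height="$4"; fps="$5"
    case "$format" in
        MJPEG)
            # Compressed input: decode, convert to NV12, then transform.
            echo "v4l2src device=$device ! image/jpeg,width=$width,height=$height,framerate=$fps/1 ! jpegdec ! videoconvert ! video/x-raw,format=NV12 ! qtivtransform"
            ;;
        YUY2|NV12)
            # Raw input: only a caps filter is needed after the source.
            echo "v4l2src io-mode=4 device=$device ! video/x-raw,format=$format,width=$width,height=$height,framerate=$fps/1"
            ;;
        *)
            echo "unsupported format: $format" >&2
            return 1
            ;;
    esac
}
```

For example, `source_segment YUY2 /dev/video2 640 480 30` prints the part of the YUY2 command that precedes `! queue ! tee name=split`.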

The following figure shows the pipeline, which processes the input from the USB camera,
runs object detection on it, and renders the result on the local display.
Figure: Object detection using USB camera source

USB camera -> v4l2src -> capsfilter -> tee -> qtivtransform -> qtimlvconverter -> qtimltflite/qtimlsnpe/qtimlqnn -> qtimlpostprocess -> qtivcomposer -> waylandsink -> display. The tee also feeds qtivcomposer directly. In the figure legend, v4l2src, capsfilter, tee, and waylandsink are marked as open source; the remaining elements are marked as Qualcomm.

The following table lists the processing stages of the pipeline in order of execution:

| Pipeline | Description |
| --- | --- |
| USB camera and object detection on Wayland | <ol><li>USB camera captures the camera live stream.</li><li>Capsfilter is applied to enforce constraints on the raw video data.</li><li>tee is used to split the stream for inferencing.</li><li><a href="https://docs.qualcomm.com/doc/80-70023-50/topic/qtivtransform.html">qtivtransform</a> transforms the stream data.</li><li><a href="https://docs.qualcomm.com/doc/80-70023-50/topic/qtimlvconverter.html">qtimlvconverter</a> does preprocessing and converts the video stream to a tensor stream, which is used for inferencing.</li><li><a href="https://docs.qualcomm.com/doc/80-70023-50/topic/qtimltflite.html">qtimltflite</a> runs the inference on the stream.</li><li>qtimlpostprocess handles the inference results from any object detection model and produces video frames.</li><li><a href="https://docs.qualcomm.com/doc/80-70023-50/topic/qtivcomposer.html">qtivcomposer</a> composes the video frames and shares them with Waylandsink.</li><li><a href="https://docs.qualcomm.com/doc/80-70023-50/topic/waylandsink.html">Waylandsink</a> submits the composed video stream to Weston, which renders it on the local display.</li></ol> |
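
To make the mapping between these stages and the command line easier to see, the sketch below assembles the NV12 command from one variable per stage. The variable names and the `build_pipeline` function are hypothetical; the element strings are copied from the NV12 command on this page, which does not include a `qtivtransform` stage. The script only prints the command for review; it does not launch it.

```shell
#!/bin/sh
# Sketch: assemble the NV12 pipeline from one variable per stage.
# DEVICE is an assumption; use the node your webcam enumerates as.
DEVICE="${DEVICE:-/dev/video2}"

build_pipeline() {
    capture="v4l2src io-mode=4 device=$DEVICE"                        # camera capture
    caps="video/x-raw,format=NV12,width=640,height=480,framerate=30/1" # capsfilter
    split="tee name=split"                                            # display and inference branches
    display="qtivcomposer name=mixer ! queue ! waylandsink fullscreen=true"
    preprocess="qtimlvconverter"                                      # video frames to tensors
    inference='qtimltflite delegate=external external-delegate-path=libQnnTFLiteDelegate.so external-delegate-options="QNNExternalDelegate,backend_type=htp;" model=/etc/models/yolox_quantized.tflite'
    postprocess='qtimlpostprocess settings="{\"confidence\": 75.0}" results=10 module=yolov8 labels=/etc/labels/yolox.json'
    overlay="video/x-raw,format=BGRA,width=640,height=360"            # detection overlay frames

    # Branch 1: tee -> composer -> display. Branch 2: tee -> inference -> composer.
    echo "$capture ! $caps ! queue ! $split split. ! queue ! $display split. ! queue ! $preprocess ! queue ! $inference ! queue ! $postprocess ! $overlay ! queue ! mixer."
}

# Print the full command so it can be reviewed before it is run on the target.
echo "gst-launch-1.0 -e $(build_pipeline)"
```

Keeping each stage in its own variable makes it straightforward to swap a single stage, for example the caps for a different resolution, without retyping the whole command.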

Last Published: Mar 27, 2026
