# Run a LiteRT model on NPU

Use the [LiteRT](https://ai.google.dev/edge/litert) runtime to run existing quantized LiteRT models on the NPU of Qualcomm® Dragonwing™ devices.

## Prerequisites

Before running the LiteRT sample applications, complete the following prerequisites: connect to the device over SSH, download the model and label files, and install the LiteRT runtime and its dependencies. The sample applications can then write their output to a file or to a connected display.

1. Set up SSH access to the target device. For detailed instructions, see:

   - [Sign in using SSH for Qualcomm Linux](https://docs.qualcomm.com/bundle/publicresource/topics/80-70022-254/how_to.html#use-ssh)
   - [Sign in using SSH for Ubuntu Server](https://docs.qualcomm.com/bundle/publicresource/topics/80-90441-1/Use_Ubuntu_on_RB3_Gen2_3.html#sign-in-to-the-rb3-gen-2-console-using-ssh)

   Note: If SSH is already set up and Wi-Fi is connected, skip this step.

2. Sign in to the target device using SSH, appending the IP address of the device after `@`:

   - For Qualcomm Linux:

     ```shell
     ssh root@
     ```

   - For Ubuntu Server:

     ```shell
     ssh ubuntu@
     ```

3. On the target device, obtain the `download_artifacts.sh` script, set executable permissions, and run it with the required arguments to download the model and label files to the device.

   ```shell
   curl -L -O https://raw.githubusercontent.com/quic/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/scripts/download_artifacts.sh
   ```

   ```shell
   chmod +x download_artifacts.sh
   ```

   ```shell
   ./download_artifacts.sh
   ```

4. Install the LiteRT runtime and other dependencies by setting up the Python environment on the target device.

   - For Qualcomm Linux, install the LiteRT runtime, Pillow, and OpenCV packages:

     ```shell
     pip3 install ai-edge-litert==1.3.0 Pillow opencv-python
     ```

   - For Ubuntu Server:

     1. Install Python pip and the virtual environment module:

        ```shell
        sudo apt install python3-pip python3-venv
        ```

     2. Create a new virtual environment, and install the LiteRT runtime, Pillow, and OpenCV packages:

        ```shell
        python3 -m venv .venv-litert-demo --system-site-packages
        source .venv-litert-demo/bin/activate
        pip3 install ai-edge-litert==1.3.0 Pillow
        pip3 install opencv-python
        ```

     3. Install the necessary Python 3 and GTK packages:

        ```shell
        sudo apt install -y python3-gi python3-gi-cairo gir1.2-gtk-3.0 python3-full pkg-config cmake libcairo2-dev libgirepository1.0-dev gir1.2-glib-2.0 build-essential python3-dev python3-pip meson
        ```

## Run an object detection application

The following Python application performs real-time object detection on a video file using a quantized YoloX LiteRT model. It writes the annotated frames to a file or streams them to a Wayland display, and it offloads inference to the NPU through the QNN LiteRT delegate.

1. Create and go to the `/etc/apps/` directory.

   ```shell
   mkdir -p /etc/apps/ && cd /etc/apps/
   ```

2. Download the `object_detection.py` file.

   ```shell
   curl -L https://raw.githubusercontent.com/quic/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/applications/LiteRT/object_detection.py -o /etc/apps/object_detection.py
   ```

   To create your own local copy of `object_detection.py`, see [Create `object_detection.py`](https://docs.qualcomm.com/doc/80-70022-15B/topic/run-a-litert-model-using-delegate.html#id1).

3. Run the application:

   - To write the output to a file:

     ```shell
     python3 object_detection.py --output file
     ```

   - To send the output to a connected display, do the following in the terminal of the target device:

     1. Activate the display:

        - For Qualcomm Linux:

          ```shell
          export XDG_RUNTIME_DIR=/dev/socket/weston && export WAYLAND_DISPLAY=wayland-1
          ```

        - For Ubuntu Server:

          ```shell
          export XDG_RUNTIME_DIR=/run/user/$(id -u ubuntu)/ && export WAYLAND_DISPLAY=wayland-1
          ```

     2. Run the object detection application:

        ```shell
        python3 object_detection.py --output wayland
        ```

## Create `object_detection.py`

To create an application similar to the object detection application described in the previous section, create an `object_detection.py` file as follows:

1. In the `/etc/apps/` folder, create an `object_detection.py` file.
2. Add the following code to your `object_detection.py` file.

   Note: The postprocessing in the following code is compatible with object detection models from AI Hub. For custom models, you must update the postprocessing logic to match the model's output format and specific requirements.

   1. Import the required packages:

      ```python
      #!/usr/bin/env python3
      import cv2
      import numpy as np
      import argparse
      import ai_edge_litert.interpreter as tflite
      import gi
      gi.require_version('Gst', '1.0')
      from gi.repository import Gst
      ```

   2. Handle output arguments:

      ```python
      parser = argparse.ArgumentParser(description="Run object detection and output to file or Wayland.")
      parser.add_argument("--output", choices=["file", "wayland"], default="file",
                          help="Choose output mode: 'file' (default) or 'wayland'")
      args = parser.parse_args()
      ```

   3. Initialize and configure model parameters:

      ```python
      MODEL_PATH = "/etc/models/yolox_quantized.tflite"
      LABEL_PATH = "/etc/labels/coco_labels.txt"
      VIDEO_IN = "/etc/media/video.mp4"
      VIDEO_OUT = "output_object_detection.mp4"
      DELEGATE_PATH = "libQnnTFLiteDelegate.so"
      FRAME_W, FRAME_H = 1600, 900
      FPS_OUT = 30
      CONF_THRES = 0.25
      NMS_IOU_THRES = 0.50
      BOX_SCALE = 3.2108588218688965
      BOX_ZP = 31.0
      SCORE_SCALE = 0.0038042240776121616
      ```

   4. Load the model and set up the LiteRT delegate:

      ```python
      # Select the HTP (NPU) backend of the QNN delegate.
      delegate_options = {'backend_type': 'htp'}
      delegate = tflite.load_delegate(DELEGATE_PATH, delegate_options)
      interpreter = tflite.Interpreter(model_path=MODEL_PATH, experimental_delegates=[delegate])
      interpreter.allocate_tensors()
      in_det = interpreter.get_input_details()
      out_det = interpreter.get_output_details()
      # Input shape is (1, height, width, channels).
      in_h, in_w = in_det[0]["shape"][1:3]
      with open(LABEL_PATH) as label_file:
          labels = [line.strip() for line in label_file]
      ```

   5. Set up video capture and preprocessing:

      ```python
      cap = cv2.VideoCapture(VIDEO_IN)
      # Scale factors from model input coordinates to output frame coordinates.
      sx, sy = FRAME_W / in_w, FRAME_H / in_h
      # Preallocated buffers that are reused for every frame.
      frame_rs = np.empty((FRAME_H, FRAME_W, 3), np.uint8)
      input_tensor = np.empty((1, in_h, in_w, 3), np.uint8)
      ```

   6. Set up the output: a video writer for file output, or a GStreamer pipeline that streams frames to the Wayland display:

      ```python
      if args.output == "file":
          fourcc = cv2.VideoWriter_fourcc(*"mp4v")
          out_writer = cv2.VideoWriter(VIDEO_OUT, fourcc, FPS_OUT, (FRAME_W, FRAME_H))
      else:
          Gst.init(None)
          # Enables real-time display of processed frames.
          pipeline = Gst.parse_launch(
              'appsrc name=src is-live=true block=true format=time caps=video/x-raw,format=BGR,width=1600,height=900,framerate=30/1 ! videoconvert ! waylandsink'
          )
          appsrc = pipeline.get_by_name('src')
          pipeline.set_state(Gst.State.PLAYING)

      frame_cnt = 0
      ```

   7. Run the main loop, which reads each frame of the video, runs inference on it, and draws bounding boxes on the output:

      ```python
      # -------------------- Main Loop --------------------
      while True:
          ok, frame = cap.read()
          if not ok:
              break
          frame_cnt += 1

          # Resizes and preprocesses each frame.
          cv2.resize(frame, (FRAME_W, FRAME_H), dst=frame_rs)
          cv2.resize(frame_rs, (in_w, in_h), dst=input_tensor[0])

          # Runs inference on each frame.
          interpreter.set_tensor(in_det[0]['index'], input_tensor)
          interpreter.invoke()
          boxes_q = interpreter.get_tensor(out_det[0]['index'])[0]
          scores_q = interpreter.get_tensor(out_det[1]['index'])[0]
          classes_q = interpreter.get_tensor(out_det[2]['index'])[0]

          # Dequantizes the model outputs using predefined scales and zero-points.
          boxes = BOX_SCALE * (boxes_q.astype(np.float32) - BOX_ZP)
          scores = SCORE_SCALE * scores_q.astype(np.float32)
          classes = classes_q.astype(np.int32)

          # Applies a confidence threshold to filter low-probability detections.
          mask = scores >= CONF_THRES
          if np.any(mask):
              boxes_f = boxes[mask]
              scores_f = scores[mask]
              classes_f = classes[mask]
              # Converts (x1, y1, x2, y2) boxes to the (x, y, w, h) layout that NMSBoxes expects.
              x1, y1, x2, y2 = boxes_f.T
              boxes_cv2 = np.column_stack((x1, y1, x2 - x1, y2 - y1))

              # Uses non-maximum suppression (NMS) to remove overlapping boxes.
              idx_cv2 = cv2.dnn.NMSBoxes(
                  bboxes=boxes_cv2.tolist(),
                  scores=scores_f.tolist(),
                  score_threshold=CONF_THRES,
                  nms_threshold=NMS_IOU_THRES
              )
              if len(idx_cv2):
                  idx = idx_cv2.flatten()
                  sel_boxes = boxes_f[idx]
                  sel_scores = scores_f[idx]
                  sel_classes = classes_f[idx]
                  # Scales boxes to the output frame size and clips them to the frame.
                  sel_boxes[:, [0, 2]] *= sx
                  sel_boxes[:, [1, 3]] *= sy
                  sel_boxes = sel_boxes.astype(np.int32)
                  sel_boxes[:, [0, 2]] = np.clip(sel_boxes[:, [0, 2]], 0, FRAME_W - 1)
                  sel_boxes[:, [1, 3]] = np.clip(sel_boxes[:, [1, 3]], 0, FRAME_H - 1)
                  for (x1i, y1i, x2i, y2i), sc, cl in zip(sel_boxes, sel_scores, sel_classes):
                      # Draws bounding boxes and labels on the frame using OpenCV.
                      cv2.rectangle(frame_rs, (x1i, y1i), (x2i, y2i), (0, 255, 0), 2)
                      lab = labels[cl] if cl < len(labels) else str(cl)
                      cv2.putText(frame_rs, f"{lab} {sc:.2f}", (x1i, max(10, y1i - 5)),
                                  cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

          # Output based on mode
          # Write output to a file.
          if args.output == "file":
              out_writer.write(frame_rs)
          # Stream output to a Wayland display.
          else:
              data = frame_rs.tobytes()
              # Converts frames to GStreamer buffers and pushes them to the pipeline
              # with timestamps for smooth playback.
              buf = Gst.Buffer.new_allocate(None, len(data), None)
              buf.fill(0, data)
              buf.duration = Gst.util_uint64_scale_int(1, Gst.SECOND, FPS_OUT)
              timestamp = cap.get(cv2.CAP_PROP_POS_MSEC) * Gst.MSECOND
              buf.pts = buf.dts = int(timestamp)
              appsrc.emit('push-buffer', buf)
      ```

   8. Release the capture and output resources, and notify the user of completion:

      ```python
      cap.release()
      if args.output == "file":
          out_writer.release()
          print(f"Done - processed video saved to {VIDEO_OUT}")
      else:
          appsrc.emit('end-of-stream')
          pipeline.set_state(Gst.State.NULL)
          print("Done - video streamed to Wayland sink")
      ```

Last Published: Nov 28, 2025
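
The dequantization and confidence filtering performed inside the main loop of `object_detection.py` can be tried off-device. The following is a minimal sketch, not the sample's own code: it restates the affine dequantization `real = scale * (q - zero_point)` in plain Python, so it needs neither NumPy nor LiteRT, and reuses the `BOX_SCALE`, `BOX_ZP`, `SCORE_SCALE`, and `CONF_THRES` values from the listing. The helper names `dequantize` and `filter_detections` and the quantized sample values are hypothetical. On the device, the scale and zero-point can typically be read from the `quantization` entry of each dictionary returned by `interpreter.get_output_details()` instead of being hard-coded; verify this against your own model.

```python
# Constants copied from the object_detection.py listing.
BOX_SCALE = 3.2108588218688965
BOX_ZP = 31.0
SCORE_SCALE = 0.0038042240776121616
CONF_THRES = 0.25


def dequantize(q, scale, zero_point=0.0):
    """Affine dequantization: real = scale * (q - zero_point)."""
    return scale * (float(q) - zero_point)


def filter_detections(boxes_q, scores_q, classes_q, conf_thres=CONF_THRES):
    """Dequantize raw outputs and keep detections scoring at least conf_thres.

    boxes_q holds [x1, y1, x2, y2] quantized values per detection, which is
    the layout the listing unpacks (an assumption for other models).
    """
    kept = []
    for box_q, score_q, cls in zip(boxes_q, scores_q, classes_q):
        score = dequantize(score_q, SCORE_SCALE)
        if score < conf_thres:
            continue  # Low-confidence detection, dropped before NMS.
        box = [dequantize(v, BOX_SCALE, BOX_ZP) for v in box_q]
        kept.append((box, score, int(cls)))
    return kept


# Made-up quantized values, for illustration only.
boxes_q = [[41, 46, 93, 131], [31, 31, 40, 40]]
scores_q = [230, 20]
classes_q = [0, 2]

detections = filter_detections(boxes_q, scores_q, classes_q)
for box, score, cls in detections:
    print(cls, round(score, 3), [round(v, 1) for v in box])
```

The vectorized NumPy expressions in the listing compute the same values for whole arrays at once; the loop form here only makes the per-element arithmetic explicit. Non-maximum suppression and rescaling to the output frame size would still follow this step.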