# LiteRT documentation

Use the Lite Runtime (LiteRT) framework to convert, optimize, and run LiteRT models with the help of delegates on the Qualcomm^®^ Linux^®^ development kit.

## LiteRT overview

High-level LiteRT overview

Learn about the LiteRT framework, its architecture, delegates, model conversion and quantization methods, and sample applications.

## Get started with running LiteRT models

Prerequisites to run LiteRT models

Set up the Qualcomm Linux development kit, upgrade it to the latest available software release, and flash the software image.

Run a LiteRT model using the GStreamer-based Qualcomm^®^ Intelligent Multimedia SDK

Download the required files and use the gst-ai-classification precompiled sample application to run a LiteRT classification model on the Qualcomm Linux development kit.

Run a LiteRT model using the native LiteRT sample application

Download the required files and use the label\_image native sample application to run a LiteRT classification model on the Qualcomm Linux development kit.

## LiteRT architecture

LiteRT on-device inference overview

Learn how LiteRT on-device inference loads a model, which the interpreter then parses and executes using a delegate.

Accelerate LiteRT models using delegates

Use delegates to accelerate model execution on the CPU, the GPU, and specialized Qualcomm hardware, such as the Qualcomm^®^ Adreno^™^ GPU and the Qualcomm^®^ Hexagon^™^ Tensor Processor.

Qualcomm^®^ AI Engine direct delegate interface

Include the `QnnTFLiteDelegate.h` header and link the appropriate Qualcomm^®^ Neural Network (QNN) delegate library for application compatibility.

## Deploy a LiteRT model

Use a pre-optimized LiteRT model

Download and use ready-to-deploy LiteRT models from the open-source community or Qualcomm^®^ AI Hub.

Convert a TensorFlow model to a LiteRT model

Use Python APIs and the `tflite_convert` command to convert models to the LiteRT format.

Create an application and run inference

Create an application using LiteRT C++ APIs to load a LiteRT model and run inference.

Develop a custom application

Use the qtimltflite GStreamer-based plug-in to develop your own application and run LiteRT models.

## Run LiteRT sample applications

Prerequisites to run LiteRT sample applications

Download and copy models, label files, and a sample image to the device to run the label\_image sample application.

Run a LiteRT model using an available delegate

Run LiteRT models using delegates, such as XNNPACK and GPU, to benchmark model execution.

Run the QNN delegate using an external delegate

Use the Qualcomm AI Engine direct API as an external delegate, along with the associated libraries, to run the QNN delegate.

## Build LiteRT

Optional: Build LiteRT

Recompile LiteRT in specific scenarios, such as when you want to change the LiteRT library version.

Last Published: Oct 09, 2025
