# TensorFlow Lite Runtime

Source: [https://docs.qualcomm.com/doc/80-70014-54/topic/tensorflow-lite-runtime.html](https://docs.qualcomm.com/doc/80-70014-54/topic/tensorflow-lite-runtime.html)

TensorFlow Lite on-device inference loads the model into an interpreter, which parses the model and uses a delegate to run it.

TensorFlow Lite on-device inference does the following:

1. Loads the TensorFlow Lite model into a TensorFlow Lite interpreter interface, which parses the model to identify the neural network operators it contains.
2. Configures the interpreter interface to run the model by using a delegate.
3. Invokes model inference on the provided inputs and saves the corresponding outputs into buffers provided to the interpreter interface.
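The three steps above can be sketched with the TensorFlow Lite Python API. This is a minimal illustration, not Qualcomm's reference code: it assumes the `tflite_runtime` package is installed on the target, and the helper names `make_interpreter` and `run_inference`, the model path, and the delegate library path are all hypothetical. Note that the Python API copies tensors in and out with `set_tensor` and `get_tensor` rather than writing into caller-provided buffers.

```python
from typing import Any, List, Optional, Sequence


def make_interpreter(model_path: str, delegate_library: Optional[str] = None) -> Any:
    """Steps 1 and 2: load the model and, optionally, attach a delegate."""
    # Imported lazily so run_inference() can be used without the runtime installed.
    # Assumption: the tflite_runtime package is available on the device.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = []
    if delegate_library is not None:
        # delegate_library is the path to a delegate shared library
        # (hypothetical; use the library shipped with your SDK).
        delegates.append(load_delegate(delegate_library))

    interpreter = Interpreter(
        model_path=model_path,
        experimental_delegates=delegates or None,
    )
    # Allocate input and output tensors before the first invocation.
    interpreter.allocate_tensors()
    return interpreter


def run_inference(interpreter: Any, inputs: Sequence[Any]) -> List[Any]:
    """Step 3: set the inputs, invoke the model, and read back the outputs."""
    input_details = interpreter.get_input_details()
    if len(inputs) != len(input_details):
        raise ValueError(
            f"model expects {len(input_details)} inputs, got {len(inputs)}"
        )

    # Copy each input into the tensor slot the interpreter reports for it.
    for detail, value in zip(input_details, inputs):
        interpreter.set_tensor(detail["index"], value)

    interpreter.invoke()

    # Collect every output tensor in the order the interpreter reports them.
    return [
        interpreter.get_tensor(detail["index"])
        for detail in interpreter.get_output_details()
    ]
```

A call might look like `run_inference(make_interpreter("model.tflite", "path/to/delegate.so"), [input_array])`, where both paths are placeholders and `input_array` is a NumPy array matching the model's input shape and type. Omitting the delegate library leaves the interpreter on its default CPU path.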

Qualcomm supports executing TensorFlow Lite models on the following accelerators using delegates:

- CPU
- Adreno GPU
- Hexagon Tensor Processor

The following table lists the delegates and the accelerators that each delegate supports:

Table: Supported delegates and accelerators

| Delegate | Acceleration |
| --- | --- |
| XNNPACK delegate | CPU |
| GPU delegate | GPU |
| Qualcomm® AI Engine Direct delegate (Qualcomm® Neural Network (QNN) delegate) | CPU, GPU, and Hexagon Tensor Processor |
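When an application needs to choose a delegate at run time, the table above can be encoded as a simple lookup. The sketch below only transcribes the table; the names `DELEGATE_ACCELERATORS` and `delegates_for` are hypothetical and are not part of any Qualcomm or TensorFlow Lite API.

```python
from typing import Dict, List, Tuple

# Delegate-to-accelerator support, transcribed from the table above.
DELEGATE_ACCELERATORS: Dict[str, Tuple[str, ...]] = {
    "XNNPACK delegate": ("CPU",),
    "GPU delegate": ("GPU",),
    "Qualcomm AI Engine Direct (QNN) delegate": (
        "CPU",
        "GPU",
        "Hexagon Tensor Processor",
    ),
}


def delegates_for(accelerator: str) -> List[str]:
    """Return the delegates that can target the given accelerator."""
    return [
        delegate
        for delegate, accelerators in DELEGATE_ACCELERATORS.items()
        if accelerator in accelerators
    ]
```

For example, `delegates_for("Hexagon Tensor Processor")` returns only the Qualcomm AI Engine Direct (QNN) delegate, since it is the only entry in the table that lists that accelerator.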

Last Published: Jul 12, 2024
