# Run a LiteRT model on CPU

The XNNPACK delegate uses the XNNPACK library to accelerate LiteRT
model inference on CPUs. XNNPACK is an open-source library from
Google that does the following:

- Provides optimized implementations of neural network operators for
  Arm CPUs
- Uses low-level CPU instructions, such as the Arm® Neon™ instruction
  set, to execute operators efficiently

The XNNPACK delegate can run models in both 32-bit floating-point and
int8 formats. For more information, see [XNNPACK back-end for TensorFlow Lite](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/README.md).
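To illustrate what the int8 format means in practice, the following is a minimal sketch of the affine quantization scheme used by quantized LiteRT models, where `real_value = scale * (int8_value - zero_point)`. The function names (`choose_params`, `quantize`, `dequantize`) are hypothetical and written only for this illustration; they are not part of the LiteRT or XNNPACK APIs, and the delegate performs the equivalent arithmetic inside its own optimized kernels.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# LiteRT or XNNPACK functions.

INT8_MIN, INT8_MAX = -128, 127


def choose_params(lo, hi):
    """Pick a scale and zero point so that [lo, hi] maps onto int8."""
    # The representable range must include 0.0 so that zero is exact.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (INT8_MAX - INT8_MIN)
    if scale == 0.0:
        scale = 1.0  # degenerate range: every value is 0.0
    zero_point = round(INT8_MIN - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to int8, clamping values outside the range."""
    # Python's round() rounds halves to even; real converters may
    # use a different rounding rule.
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate float value from its int8 code."""
    return scale * (q - zero_point)


scale, zero_point = choose_params(0.0, 2.55)
q = quantize(1.0, scale, zero_point)
approx = dequantize(q, scale, zero_point)
print(q, approx)
```

An int8 model stores one byte per value instead of four, at the cost of a rounding error of at most half of `scale` for values inside the chosen range.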

To run a LiteRT model using the XNNPACK delegate, see [Deploy LiteRT as a Native application](https://docs.qualcomm.com/doc/80-80022-15B/topic/deploy-litert-as-a-native-application.html).

Last Published: Jun 23, 2026
