# LiteRT overview

Lite Runtime (LiteRT) is an open-source deep learning framework designed for on-device inference. The TensorFlow framework provides tools and APIs to convert a standard pretrained TensorFlow model from the SavedModel format or the Keras format into the LiteRT format.
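As a rough illustration of the conversion path described above, the following minimal sketch uses the TensorFlow converter API (`tf.lite.TFLiteConverter`) to turn a SavedModel directory into a LiteRT flatbuffer file. The function name `convert_saved_model`, its parameters, and the paths in the usage note are hypothetical and chosen for illustration only; the sketch assumes a TensorFlow 2.x installation in which `tf.lite.TFLiteConverter` is available.

```python
from pathlib import Path


def convert_saved_model(saved_model_dir: str, output_path: str, optimize: bool = True) -> int:
    """Convert a SavedModel directory to a LiteRT file and return the bytes written.

    Hypothetical helper for illustration; not part of any SDK.
    """
    # Imported lazily so this module can be loaded on machines without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    if optimize:
        # Request the converter's default optimizations (for example, weight quantization).
        converter.optimizations = [tf.lite.Optimize.DEFAULT]

    # convert() returns the serialized model as a bytes object.
    model_bytes = converter.convert()
    return Path(output_path).write_bytes(model_bytes)
```

A call such as `convert_saved_model("my_saved_model", "model.tflite")` would write the converted model next to the script. For a Keras model object, `tf.lite.TFLiteConverter.from_keras_model(model)` can be used in place of `from_saved_model`.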

The topics covered describe the available delegates and methods for running LiteRT models using the Qualcomm® software stack, and explain how to do the following on Qualcomm® Linux®:

- Run LiteRT models using the GStreamer-based Qualcomm® Intelligent Multimedia SDK (IM SDK) or the native LiteRT application.
- Convert TensorFlow models to LiteRT models and optimize them for on-device inference.
- Run LiteRT models through a delegate on the available compute hardware, such as the CPU, GPU, and the Qualcomm® Hexagon™ Tensor Processor.
- Run LiteRT sample applications with an available LiteRT delegate and with an external delegate.
- Benchmark LiteRT models.
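
To make the benchmarking item above concrete, here is a minimal latency-measurement sketch in plain Python. It is not the benchmarking tool covered in this guide; the helper name `benchmark` and its parameters are hypothetical. It times any zero-argument callable, which could be, for instance, the `invoke` method of an already prepared interpreter object, after a few warm-up calls that are excluded from the statistics.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(run_once: Callable[[], object], warmup: int = 5, runs: int = 50) -> Dict[str, float]:
    """Time repeated calls to run_once and return latency statistics in milliseconds.

    Hypothetical helper for illustration; run_once stands in for one inference call.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls let caches and any delegate initialization settle; they are not timed.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }
```

Running the same helper once per delegate configuration gives a simple side-by-side latency comparison, under the assumption that input tensors are set before each timed call.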

## Next steps

- [Get started with running LiteRT models](https://docs.qualcomm.com/doc/80-80021-54/topic/getting-started.html#getting-started)
- [Deploy a LiteRT model](https://docs.qualcomm.com/doc/80-80021-54/topic/tensorflow-lite-developer-workflow.html#tensorflow-lite-developer-workflow)
- [Run LiteRT sample applications](https://docs.qualcomm.com/doc/80-80021-54/topic/sample-applications.html#run-litert-sample-apps)

Last Published: Mar 23, 2026
