# Run LiteRT Model on GPU

The open-source GPU delegate accelerates LiteRT models on various
vendor-specific GPUs, including the Adreno GPU.

LiteRT can use the GPU delegate to take advantage of the
parallel-processing power of GPUs, which makes inference faster. The
GPU delegate uses OpenCL kernels to run the neural network operations
in a LiteRT model execution graph on the GPU.
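
As a rough illustration of how an application might hand a model to the GPU delegate, the following Python sketch tries to load a delegate library and falls back to CPU execution if loading fails. It assumes the `tflite_runtime` Python package is installed on the target. The helper names `gpu_delegates_or_cpu` and `build_interpreter`, and any library path passed in (for example `libtensorflowlite_gpu_delegate.so`), are hypothetical and are not taken from this guide. The native deployment flow linked later on this page remains the documented path.

```python
from typing import Any, Callable, List


def gpu_delegates_or_cpu(load_delegate: Callable[[str], Any],
                         library_path: str) -> List[Any]:
    """Return [delegate] if the library loads, otherwise [] (CPU fallback).

    `load_delegate` is passed in so this helper can be exercised without
    LiteRT installed. `library_path` is a hypothetical, platform-specific
    path to the GPU delegate shared library.
    """
    try:
        return [load_delegate(library_path)]
    except (ValueError, OSError, RuntimeError) as err:
        # Loading can fail if the library is missing or the GPU is unsupported.
        print(f"GPU delegate unavailable ({err}); falling back to CPU")
        return []


def build_interpreter(model_path: str, library_path: str):
    """Create an interpreter that uses the GPU delegate when it is available."""
    # Assumption: the tflite_runtime package is present on the device.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = gpu_delegates_or_cpu(load_delegate, library_path)
    interpreter = Interpreter(model_path=model_path,
                              experimental_delegates=delegates)
    interpreter.allocate_tensors()
    return interpreter
```

An empty delegate list leaves every operation on the CPU, so the same call works on devices where the delegate cannot be loaded.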

The default cross-compiled build of the GPU delegate includes the
LiteRT library and optimizes the execution of the following LiteRT
model types on the Adreno GPU:

1. 16-bit floating-point
2. 32-bit floating-point
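
To make the difference between the two formats concrete, the sketch below uses Python's standard `struct` module to round-trip a value through IEEE 754 half precision (16-bit) and single precision (32-bit). It shows only the storage size and rounding behavior of the number formats. It does not measure anything about the GPU delegate itself, and the helper name `round_trip` is hypothetical.

```python
import struct

FP16 = "<e"  # IEEE 754 half precision, 2 bytes
FP32 = "<f"  # IEEE 754 single precision, 4 bytes


def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


if __name__ == "__main__":
    for name, fmt in (("fp16", FP16), ("fp32", FP32)):
        stored = round_trip(0.1, fmt)
        print(f"{name}: {struct.calcsize(fmt)} bytes per value, "
              f"0.1 is stored as {stored!r}")
```

Half precision halves the memory needed per value at the cost of a larger rounding error, which is the usual trade-off when choosing between the two model types.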

For more information, see [GPU delegates for LiteRT](https://www.tensorflow.org/lite/performance/gpu).

To run a LiteRT model using the GPU delegate, see [Deploy LiteRT as a Native application](https://docs.qualcomm.com/doc/80-80022-15B/topic/deploy-litert-as-a-native-application.html).

Last Published: Jun 23, 2026
