# AI hardware cores/accelerators

Source: [https://docs.qualcomm.com/doc/80-63195-1/topic/AI-hardware-cores-accelerators.html](https://docs.qualcomm.com/doc/80-63195-1/topic/AI-hardware-cores-accelerators.html)

The Qualcomm Innovators Development Kit supports running AI/ML models on three hardware cores/accelerators:

- **Qualcomm® Hexagon™ Tensor Processor (HTP)** is an AI accelerator suited to computationally intensive AI workloads. To run on the HTP with improved performance, AI/ML models must be quantized to one of the supported precisions: INT4, INT8, INT16, or FP16.
- **Qualcomm® Adreno™ GPU** can run unquantized FP32/FP16 models with higher throughput than the CPU. The GPU can also run user-defined operations (UDOs) implemented in OpenCL.
- **Qualcomm® Kryo™ CPU** supports unquantized models with FP32 precision. The CPU can run UDOs or ops that are not optimized for execution on the HTP. It can also be used for model benchmarking.

The table below summarizes the properties of the hardware cores/accelerators available on Snapdragon for executing AI/ML models:

| Accelerator | Supported Data Types | [Quantization](https://docs.qualcomm.com/doc/80-63195-1/topic/quantization.html)/Activation | Power | Throughput | Features |
| :---: | :---: | :---: | :---: | :---: | :---: |
| HTP | INT4, INT8, INT16, FP16 | Needed | Low | High | Dedicated for AI applications.<br>Hardware-accelerated Convolution Engine. |
| GPU | INT8, INT16, FP16, FP32 | Not needed | Medium | Medium | Suitable for use cases that require high accuracy but low usage.<br>Can run unquantized models with FP32/FP16 precision.<br>[UDOs](https://docs.qualcomm.com/doc/80-63195-1/topic/User-defined-Operations-UDO.html) written in OpenCL can be compiled for GPU. |
| CPU | INT8, INT16, FP16, FP32 | Not needed | High | Low | Reference for accuracy verification and debugging.<br>Used for quantization process.<br>Can be used to run ops that are not supported on HTP, and for [UDOs](https://docs.qualcomm.com/doc/80-63195-1/topic/User-defined-Operations-UDO.html) implemented in languages like C/C++, Java. |
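
As a rough illustration of how the table above might be used when deciding where to run a model, the following Python sketch encodes the "Supported Data Types" column as a lookup. The names `SUPPORTED_TYPES`, `PREFERENCE`, and `candidate_accelerators` are hypothetical and not part of any Qualcomm SDK. The preference order (HTP, then GPU, then CPU) is an assumption derived from the Power and Throughput columns.

```python
# Illustrative lookup built from the accelerator table above.
# All names here are hypothetical; this is not a Qualcomm API.
SUPPORTED_TYPES = {
    "HTP": {"INT4", "INT8", "INT16", "FP16"},
    "GPU": {"INT8", "INT16", "FP16", "FP32"},
    "CPU": {"INT8", "INT16", "FP16", "FP32"},
}

# Assumed preference: lowest power and highest throughput first.
PREFERENCE = ["HTP", "GPU", "CPU"]


def candidate_accelerators(data_type):
    """Return the accelerators whose table row lists data_type, in preference order."""
    return [name for name in PREFERENCE if data_type in SUPPORTED_TYPES[name]]


print(candidate_accelerators("INT4"))  # ['HTP']
print(candidate_accelerators("FP32"))  # ['GPU', 'CPU']
```

In practice the choice also depends on whether every op in the model is supported on the target, which this sketch does not model.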

| Datatype | Details |
| :---: | :---: |
| INT4 | 4-bit weights + 8-bit activations |
| INT8 | 8-bit weights + 8-bit activations |
| INT16 | 8-bit weights + 16-bit activations |
| FP16 | 16-bit floating point precision |
| FP32 | 32-bit floating point precision |
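
To make the weight and activation bit widths in the table above more concrete, the sketch below quantizes a small list of values at different bit widths and reports the worst-case round-trip error. It assumes a generic uniform affine (min/max) quantization scheme for illustration only. The source does not state which scheme the Qualcomm tools use, and `BIT_WIDTHS`, `quantize`, `dequantize`, and `max_error` are hypothetical names.

```python
# (weight bits, activation bits) for each integer datatype in the table above.
BIT_WIDTHS = {
    "INT4": (4, 8),
    "INT8": (8, 8),
    "INT16": (8, 16),
}


def quantize(values, bits):
    """Map floats to unsigned integers in [0, 2**bits - 1] (assumed affine scheme)."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Map quantized integers back to approximate float values."""
    return [lo + scale * x for x in q]


def max_error(values, bits):
    """Largest absolute difference between values and their quantized round trip."""
    q, scale, lo = quantize(values, bits)
    restored = dequantize(q, scale, lo)
    return max(abs(a - b) for a, b in zip(values, restored))


weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
for name, (w_bits, a_bits) in BIT_WIDTHS.items():
    print(name, "weight bits:", w_bits, "activation bits:", a_bits,
          "max weight error:", max_error(weights, w_bits))
```

Fewer bits mean a coarser grid of representable values, so the 4-bit weights of INT4 lose more precision than the 8-bit weights used by INT8 and INT16.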


Last Published: May 16, 2024
