# Struct TfLiteQnnDelegateHtpBackendOptions

- Defined in [File QnnTFLiteDelegate.h](https://docs.qualcomm.com/doc/80-63442-10/topic/api-rst_file_include_QNN_QnnTFLiteDelegate_h.html#file-include-qnn-qnntflitedelegate-h)

## Struct Documentation

- struct TfLiteQnnDelegateHtpBackendOptions

    - Specifies the backend options for the HTP backend. To be used when selecting TfLiteQnnDelegateBackendType.kHtpBackend for the [TfLiteQnnDelegateOptions::backend\_type](https://docs.qualcomm.com/doc/80-63442-10/topic/api-rst_file_include_QNN_QnnTFLiteDelegate_h.html#structTfLiteQnnDelegateOptions_1a9c123140d4503bd3968d4add4e763eb3).

Public Members

- [TfLiteQnnDelegateHtpPerformanceMode](https://docs.qualcomm.com/doc/80-63442-10/topic/typedef_QnnTFLiteDelegate_8h_1aaa81373cfb558138ff77e7181b6d8f87.html#_CPPv435TfLiteQnnDelegateHtpPerformanceMode) performance\_mode

    - The default performance mode sets no configurations on the HTP.

- [TfLiteQnnDelegateHtpPerfCtrlStrategy](https://docs.qualcomm.com/doc/80-63442-10/topic/typedef_QnnTFLiteDelegate_8h_1a2d3fe43817efffe4b4203fcf0c4fd7d0.html#_CPPv436TfLiteQnnDelegateHtpPerfCtrlStrategy) perf\_ctrl\_strategy

    - The default performance control strategy is Manual.

- [TfLiteQnnDelegateHtpPrecision](https://docs.qualcomm.com/doc/80-63442-10/topic/typedef_QnnTFLiteDelegate_8h_1af989a24414f1d08327fbf4cb70542073.html#_CPPv429TfLiteQnnDelegateHtpPrecision) precision

    - The default precision mode supports quantized networks. Other precision modes may only be supported on certain SoCs.

- [TfLiteQnnDelegateHtpPdSession](https://docs.qualcomm.com/doc/80-63442-10/topic/typedef_QnnTFLiteDelegate_8h_1a016059a126485ee4c45861335d030734.html#_CPPv429TfLiteQnnDelegateHtpPdSession) pd\_session

    - Signed or unsigned HTP PD session. The default PD session is unsigned.

- [TfLiteQnnDelegateHtpOptimizationStrategy](https://docs.qualcomm.com/doc/80-63442-10/topic/typedef_QnnTFLiteDelegate_8h_1a2dc39b3fe5dcbb87fd51f1af2be6fb72.html#_CPPv440TfLiteQnnDelegateHtpOptimizationStrategy) optimization\_strategy

    - The default optimization strategy will optimize the graph for inference.

- bool useConvHmx

    - Using short conv HMX may improve performance, but convolutions that have short depth and/or weights that are not symmetric could exhibit inaccurate results.

- bool useFoldRelu

    - Using fold Relu may improve performance. This optimization is correct when the quantization ranges for the convolution are equal to, or are a subset of, those of the Relu operation.

- uint32\_t vtcm\_size

    - Option to set the VTCM size in MB. This is directly mapped to QNN\_HTP\_GRAPH\_CONFIG\_OPTION\_VTCM\_SIZE under QnnHtpGraph\_ConfigOption\_t. If the VTCM size is set to 0, the default VTCM size will be used. If the VTCM size is greater than the VTCM size available on this device, it will be set to the maximum VTCM size for this device.

- uint32\_t num\_hvx\_threads

    - Option to set the number of HVX threads. This is directly mapped to QNN\_HTP\_GRAPH\_CONFIG\_OPTION\_NUM\_HVX\_THREADS under QnnHtpGraph\_ConfigOption\_t. If this option is set to 0, the default number of HVX threads will be used. If the value exceeds the maximum number of HVX threads, the maximum number of threads supported will be used.

- uint32\_t device\_id

    - Some SoCs come with more than one HTP device. Use this attribute to select which HTP device runs the model. In most cases, the default device\_id is sufficient.
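The zero-and-cap rule described for vtcm\_size and num\_hvx\_threads can be modelled in a few lines of C. This is only an illustrative sketch: `resolve_htp_setting` is a hypothetical helper, not part of the delegate API, and the device default and maximum are passed in as assumed parameters because the real values depend on the SoC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of QnnTFLiteDelegate.h): models how a
 * requested vtcm_size or num_hvx_threads value is resolved according to
 * the member descriptions above. device_default and device_max are
 * assumptions supplied by the caller; real values are device specific. */
uint32_t resolve_htp_setting(uint32_t requested,
                             uint32_t device_default,
                             uint32_t device_max) {
  /* Precondition of this sketch: the default never exceeds the maximum. */
  assert(device_default <= device_max);

  if (requested == 0) {
    return device_default; /* 0 selects the default value */
  }
  if (requested > device_max) {
    return device_max;     /* oversized requests are capped at the maximum */
  }
  return requested;        /* in-range requests are used as given */
}
```

With an assumed default of 4 and an assumed maximum of 8, a request of 0 resolves to 4, a request of 16 resolves to 8, and a request of 2 is used unchanged.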

Last Published: Jun 04, 2026