# QNN HTP Shared Buffer Tutorial

## Introduction

This tutorial describes how to share data buffers between processing domains
in the QNN HTP backend. Shared buffers eliminate data copies between client
code on the host CPU and the HTP accelerator.

The HTP backend supports several types of shared memory:

| Qnn\_MemDescriptor\_t Type | QnnMemHtp\_Descriptor\_t Type | Description |
| --- | --- | --- |
| [QNN\_MEM\_TYPE\_ION](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnMem_8h_1ab5f34c4a6c8b1f544072bb49739d2a89.html#exhale-enum-qnnmem-8h-1ab5f34c4a6c8b1f544072bb49739d2a89) | Not Applicable | Each tensor will be mapped to its own shared buffer.<br>One-to-one relationship between the file descriptor and memory handle. |
| [QNN\_MEM\_TYPE\_CUSTOM](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnMem_8h_1ab5f34c4a6c8b1f544072bb49739d2a89.html#exhale-enum-qnnmem-8h-1ab5f34c4a6c8b1f544072bb49739d2a89) | [QNN\_HTP\_MEM\_SHARED\_BUFFER](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnHtpMem_8h_1aaef9cbda0b0e670f83b37a504fdced5f.html#exhale-enum-qnnhtpmem-8h-1aaef9cbda0b0e670f83b37a504fdced5f) | Multiple tensors will be mapped to one shared buffer.<br>One-to-many relationship between the file descriptor and memory handles. |
| [QNN\_MEM\_TYPE\_CUSTOM](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnMem_8h_1ab5f34c4a6c8b1f544072bb49739d2a89.html#exhale-enum-qnnmem-8h-1ab5f34c4a6c8b1f544072bb49739d2a89) | [QNN\_HTP\_MEM\_WEIGHTS\_BUFFER](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnHtpMem_8h_1aaef9cbda0b0e670f83b37a504fdced5f.html#exhale-enum-qnnhtpmem-8h-1aaef9cbda0b0e670f83b37a504fdced5f) | An empty DMA buffer to store all the weights for the context. |
| [QNN\_MEM\_TYPE\_CUSTOM](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnMem_8h_1ab5f34c4a6c8b1f544072bb49739d2a89.html#exhale-enum-qnnmem-8h-1ab5f34c4a6c8b1f544072bb49739d2a89) | [QNN\_HTP\_MEM\_SHARED\_SPILLFILL\_BUFFER](https://docs.qualcomm.com/doc/80-63442-50/topic/enum_QnnHtpMem_8h_1aaef9cbda0b0e670f83b37a504fdced5f.html#exhale-enum-qnnhtpmem-8h-1aaef9cbda0b0e670f83b37a504fdced5f) | An empty DMA buffer to use as a shared spill-fill buffer for all graphs in the context. |

Note

This tutorial focuses only on shared buffer usage. The SDK example code has some
prerequisites that are not discussed in detail here. For those, refer to the
corresponding part of the QNN documentation or to the SampleApp.

SampleApp documentation: [Sample App Tutorial](https://docs.qualcomm.com/doc/80-63442-50/topic/sample_app.html#sample-app-tutorial)

SampleApp code: ${QNN\_SDK\_ROOT}/examples/QNN/SampleApp

## Loading prerequisite shared libraries to use the RPCMem framework

Hardware devices equipped with a Qualcomm chipset include a shared library that provides the
functions for shared buffer manipulation.

### Loading shared library

The `libcdsprpc.so` shared library is available on most mainstream Qualcomm chipset equipped
devices (SD888 and later).

We can dynamically load it as shown below:

    #include <dlfcn.h>  // dlopen, dlsym, dlclose

    void* libCdspHandle = dlopen("libcdsprpc.so", RTLD_NOW | RTLD_LOCAL);

    if (nullptr == libCdspHandle) {
      // handle errors
    }

### Resolving Symbols

After the shared library is successfully loaded, we can proceed to resolve all necessary symbols.

The code snippet below shows how to resolve the required symbols from the shared library:

    /**
     * Definition: void* rpcmem_alloc(int heapid, uint32 flags, int size);
     * Allocate a buffer via ION and register it with the FastRPC framework.
     * @param[in] heapid  Heap ID to use for memory allocation.
     * @param[in] flags   ION flags to use for memory allocation.
     * @param[in] size    Buffer size to allocate.
     * @return            Pointer to the buffer on success; NULL on failure.
     */
    typedef void *(*RpcMemAllocFn_t)(int, uint32_t, int);

    /**
     * Definition: void rpcmem_free(void* po);
     * Free a buffer and ignore invalid buffers.
     */
    typedef void (*RpcMemFreeFn_t)(void *);

    /**
     * Definition: int rpcmem_to_fd(void* po);
     * Return an associated file descriptor.
     * @param[in] po  Data pointer for an RPCMEM-allocated buffer.
     * @return        Buffer file descriptor.
     */
    typedef int (*RpcMemToFdFn_t)(void *);

    RpcMemAllocFn_t rpcmem_alloc = (RpcMemAllocFn_t)dlsym(libCdspHandle, "rpcmem_alloc");
    RpcMemFreeFn_t rpcmem_free = (RpcMemFreeFn_t)dlsym(libCdspHandle, "rpcmem_free");
    RpcMemToFdFn_t rpcmem_to_fd = (RpcMemToFdFn_t)dlsym(libCdspHandle, "rpcmem_to_fd");
    if (nullptr == rpcmem_alloc || nullptr == rpcmem_free || nullptr == rpcmem_to_fd) {
        dlclose(libCdspHandle);
        // handle errors
    }
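The three `dlsym` calls above repeat the same cast-and-check pattern. As a minimal sketch, assuming a POSIX system that provides `dlfcn.h`, the lookup can be wrapped in a small typed helper. The name `resolveSymbol` is hypothetical and is not part of QNN or `libcdsprpc.so`.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <cstdio>

// Hypothetical helper (an assumption for illustration, not a QNN API).
// Looks up `name` in an already opened library and casts the result to the
// requested function pointer type. Returns nullptr when the lookup fails.
template <typename Fn>
Fn resolveSymbol(void* libHandle, const char* name) {
  assert(libHandle != nullptr && name != nullptr);
  dlerror();  // clear any stale error state before the lookup
  void* sym = dlsym(libHandle, name);
  if (sym == nullptr) {
    const char* err = dlerror();
    std::fprintf(stderr, "dlsym(%s) failed: %s\n", name,
                 err != nullptr ? err : "symbol resolved to null");
    return nullptr;
  }
  return reinterpret_cast<Fn>(sym);
}
```

With this helper, a lookup could read `resolveSymbol<RpcMemAllocFn_t>(libCdspHandle, "rpcmem_alloc")`, which keeps the variable type and the cast type from drifting apart.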

## Using QNN\_MEM\_TYPE\_ION with QNN API

The following figure illustrates ION shared buffers, where each tensor has its own shared buffer
with its own unique memory pointer, file descriptor, and memory handle.

[Figure: ION shared buffers, one shared buffer per tensor. Source image: `../../_static/resources/htp_shared_buffer/ION_Shared_Buffer.png`]

An example is shown below:

HTP Shared Buffer Example

    // QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
    QnnInterface_t qnnInterface;
    // Init qnn interface ......
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

    // Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
    Qnn_Tensor_t inputTensor;
    // Set up common setting for inputTensor ......
    /* There are 2 specific settings for shared buffer:
     *  1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
     *  2. union member memHandle should be used instead of clientBuf, and it
     *     should be set to nullptr.
     *  Both are set just before the memRegister() call below.
     */


    size_t bufSize;
    // Calculate the bufSize based on tensor dimensions and data type ......

    #define RPCMEM_HEAP_ID_SYSTEM 25
    #define RPCMEM_DEFAULT_FLAGS 1

    // Allocate the shared buffer
    uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, bufSize);
    if (nullptr == memPointer) {
        // handle errors
    }

    int memFd = rpcmem_to_fd(memPointer);
    if (-1 == memFd) {
        // handle errors
    }

    // Fill the info of Qnn_MemDescriptor_t and register the buffer to QNN
    // Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
    Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
    memDescriptor.memShape = {inputTensor.rank, inputTensor.dimensions, nullptr};
    memDescriptor.dataType = inputTensor.dataType;
    memDescriptor.memType = QNN_MEM_TYPE_ION;
    memDescriptor.ionInfo.fd = memFd;
    inputTensor.memType = QNN_TENSORMEMTYPE_MEMHANDLE;
    inputTensor.memHandle = nullptr;
    Qnn_ContextHandle_t context; // Must obtain a QNN context handle before memRegister()
    // To obtain QNN context handle:
    // For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
    // For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary
    Qnn_ErrorHandle_t registRet = qnnInterface->memRegister(context, &memDescriptor, 1u, &(inputTensor.memHandle));
    if (QNN_SUCCESS != registRet) {
        rpcmem_free(memPointer);
        // handle errors
    }

    /**
     * At this point, the allocation and registration of the shared buffer is complete.
     * On the QNN side, the buffer has been bound by memFd.
     * On the user side, the buffer can be manipulated through memPointer.
     */

    /**
     * Optionally, the user can also allocate and register a shared buffer for the output,
     * following the same steps as above (from the Qnn_Tensor_t setup through memRegister()).
     * If so, the output buffer should also be deregistered and freed as shown at the end
     * of this example.
     */

    // Load the input data to memPointer ......

    // Execute QNN graph with input tensor and output tensor ......

    // Get output data ......

    // Deregister and free the buffers once they are no longer in use
    Qnn_ErrorHandle_t deregisterRet = qnnInterface->memDeRegister(&(inputTensor.memHandle), 1);
    if (QNN_SUCCESS != deregisterRet) {
        // handle errors
    }
    rpcmem_free(memPointer);
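The example above leaves the `bufSize` calculation elided. A minimal sketch of that step follows, assuming a densely packed tensor and a caller-supplied element size in bytes. The helper `tensorByteSize` is hypothetical and not part of the QNN API, and mapping a QNN data type to an element size is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of the QNN API.
// Returns elementSize multiplied by every dimension. Returns 0 when any
// dimension is zero or when the product would overflow std::size_t.
// A rank of 0 is treated as a scalar and yields elementSize.
std::size_t tensorByteSize(const std::uint32_t* dimensions, std::uint32_t rank,
                           std::size_t elementSize) {
  assert(dimensions != nullptr || rank == 0);
  std::size_t bytes = elementSize;
  for (std::uint32_t i = 0; i < rank; ++i) {
    const std::size_t dim = dimensions[i];
    if (dim != 0 && bytes > std::numeric_limits<std::size_t>::max() / dim) {
      return 0;  // product does not fit in std::size_t
    }
    bytes *= dim;
  }
  return bytes;
}
```

Because `rpcmem_alloc` takes its size as an `int` in the typedef shown earlier, the caller would also want to confirm that the result fits in an `int` before allocating.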

## Using QNN\_HTP\_MEM\_SHARED\_BUFFER with QNN API

The following figure illustrates a multi-tensor shared buffer, where a group of
tensors is mapped to a single shared buffer. This single shared buffer has one memory
pointer and one file descriptor; however, each tensor has its own memory pointer offset and
memory handle.

[Figure: multi-tensor shared buffer, several tensors mapped into one shared buffer at different offsets. Source image: `../../_static/resources/htp_shared_buffer/Multi_Tensor_Shared_Buffer.png`]

An example is shown below:

HTP Multi-Tensor Shared Buffer Example

    // QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
    QnnInterface_t qnnInterface;
    // Init qnn interface ......
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

    // Total number of input tensors
    size_t numTensors;

    // Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
    Qnn_Tensor_t inputTensors[numTensors];
    // Set up common setting for inputTensors ......
    /* There are 2 specific settings for shared buffer:
     *  1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
     *  2. union member memHandle should be used instead of clientBuf, and it
     *     should be set to nullptr.
     *  Both are set inside the registration loop below.
     */

    // Calculate the shared buffer size
    uint64_t totalBufferSize = 0;
    for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
       // Calculate the tensorSize based on tensor dimensions and data type
       totalBufferSize += tensorSize;
    }

    #define RPCMEM_HEAP_ID_SYSTEM 25
    #define RPCMEM_DEFAULT_FLAGS 1

    // Allocate the shared buffer
    uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, totalBufferSize);
    if (nullptr == memPointer) {
        // handle errors
    }

    // Get a file descriptor for the buffer
    int memFd = rpcmem_to_fd(memPointer);
    if (-1 == memFd) {
        // handle errors
    }

    Qnn_ContextHandle_t context; // Must obtain a QNN context handle before memRegister()
    // To obtain QNN context handle:
    // For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
    // For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary

    // Register the memory handles using memory descriptors
    // This is the offset of the tensor location in the shared buffer
    uint64_t offset = 0;
    for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
       // Fill the info of Qnn_MemDescriptor_t and register the descriptor to QNN
       // Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
       Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
       memDescriptor.memShape = {inputTensors[tensorIdx].rank, inputTensors[tensorIdx].dimensions, nullptr};
       memDescriptor.dataType = inputTensors[tensorIdx].dataType;
       memDescriptor.memType = QNN_MEM_TYPE_CUSTOM;
       inputTensors[tensorIdx].memType = QNN_TENSORMEMTYPE_MEMHANDLE;
       inputTensors[tensorIdx].memHandle = nullptr;

       // Fill the info of QnnMemHtp_Descriptor_t and set as custom info
       // QnnMemHtp_Descriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/HTP/QnnHtpMem.h
       QnnMemHtp_Descriptor_t htpMemDescriptor;
       htpMemDescriptor.type = QNN_HTP_MEM_SHARED_BUFFER;
       htpMemDescriptor.size = totalBufferSize; // Note: this is the total buffer size

       QnnHtpMem_SharedBufferConfig_t htpSharedBuffConfig = {memFd, offset};
       htpMemDescriptor.sharedBufferConfig = htpSharedBuffConfig;

       memDescriptor.customInfo = &htpMemDescriptor;

       Qnn_ErrorHandle_t registRet = qnnInterface->memRegister(context, &memDescriptor, 1u, &(inputTensors[tensorIdx].memHandle));
       if (QNN_SUCCESS != registRet) {
          // Deregister already created memory handles
          rpcmem_free(memPointer);
          // handle errors
       }

       // Move offset by the tensor size
       // (tensorSize is the size of inputTensors[tensorIdx], calculated as in the loop above)
       offset = offset + tensorSize;
    }

    /**
     * At this point, the allocation and registration of the shared buffer is complete.
     * On the QNN side, the buffer has been bound by memFd.
     * On the user side, the buffer can be manipulated through memPointer and offset.
     */

    /**
     * Optionally, the user can also allocate and register a shared buffer for the outputs,
     * following the same steps as above (from the tensor setup through the registration loop).
     * If so, the output buffer should also be deregistered and freed as shown at the end
     * of this example.
     */

    // Load the input data to memPointer with respective offsets ......

    // Execute QNN graph with input tensors and output tensors ......

    // Get output data from the memPointer and offset combination ......

    // Deregister all memory handles and free the buffer once it is no longer in use
    for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
       Qnn_ErrorHandle_t deregisterRet = qnnInterface->memDeRegister(&(inputTensors[tensorIdx].memHandle), 1);
       if (QNN_SUCCESS != deregisterRet) {
          // handle errors
       }
    }
    rpcmem_free(memPointer);
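The example above computes the total buffer size in one loop and advances `offset` in another. As a minimal sketch, both can be derived in a single pass over the per-tensor sizes. `SharedBufferLayout` and `layoutTensors` are hypothetical names, not QNN types. The `alignment` parameter is an assumption added for illustration: this tutorial does not state an alignment requirement for tensor offsets, and the default of 1 reproduces the tightly packed layout used above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch; not a QNN structure.
struct SharedBufferLayout {
  std::vector<std::uint64_t> offsets;  // start offset of each tensor
  std::uint64_t totalSize = 0;         // bytes to allocate for the whole buffer
};

// Places tensors back to back, rounding each start offset up to `alignment`
// bytes. `alignment` must be a power of two; 1 means tightly packed.
SharedBufferLayout layoutTensors(const std::vector<std::uint64_t>& tensorSizes,
                                 std::uint64_t alignment = 1) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  SharedBufferLayout layout;
  std::uint64_t cursor = 0;
  for (std::uint64_t size : tensorSizes) {
    cursor = (cursor + alignment - 1) & ~(alignment - 1);  // round up
    layout.offsets.push_back(cursor);
    cursor += size;
  }
  layout.totalSize = cursor;
  return layout;
}
```

In the example above, `layout.totalSize` would take the place of `totalBufferSize` and `layout.offsets[tensorIdx]` would take the place of the running `offset`.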

## Using QNN\_HTP\_MEM\_WEIGHTS\_BUFFER and QNN\_HTP\_MEM\_SHARED\_SPILLFILL\_BUFFER with QNN API

Note

Currently the external weights and spill-fill buffers feature has the following limitations:

- It is only supported on Android platforms.
- It is only supported when the context is created using the `QnnContext_createFromBinary` API. For example, it is not supported with other APIs such as `QnnContext_createFromBinaryListAsync`.
- It is not supported together with the `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_MULTI_CONTEXTS` and `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING` context configurations.
- It is not supported together with the graph switching feature.
- It is not supported together with the udma64 feature. The udma64 feature can be turned off during the graph prepare phase.
- It is not supported together with the securepd model protection feature.

Steps to use the external weights and spill-fill buffers feature:

### 1. Create context

The context must be created using the `QnnContext_createFromBinary` API with the `DEFER_GRAPH_INIT` config option enabled.
In this step the context is not yet deserialized; only a context handle is created to enable buffer registration.

### 2. Retrieve context properties

Context properties can be used to retrieve buffer sizes and alignment requirements.

### 3. Allocate external buffer

Users have to ensure that the following requirements are met for external weights and spill-fill buffers:

- They have to be DMA buffers.
- Their start addresses - determined by the fd and the offset together - must be aligned according to the `BUFFER_START_ALIGNMENT` context property (to e.g. 4KB).
- They must be at least the required size determined by context properties.
- File descriptors must not be registered with fastrpc (using `fastrpc_mmap`).
- Modification to their contents should only be induced by QNN, otherwise behavior is undefined.
- They must not be deallocated while registered with QNN.
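The alignment and size requirements can be checked before registration. The sketch below is one way to do that under two stated assumptions: the alignment reported by the context property is a power of two (the 4KB example is), and the base of the allocation itself is already aligned, so only the offset needs checking. `alignUp` and `externalBufferFits` are hypothetical helpers, not QNN calls.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment` (a power of two).
std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical check, not a QNN call. Returns true when a region starting at
// `offset` inside an allocation of `allocationSize` bytes starts on an
// aligned boundary and leaves at least `requiredSize` bytes available.
bool externalBufferFits(std::uint64_t allocationSize, std::uint64_t offset,
                        std::uint64_t requiredSize, std::uint64_t alignment) {
  if (offset != alignUp(offset, alignment)) {
    return false;  // start offset is not aligned
  }
  if (offset > allocationSize) {
    return false;  // offset lies outside the allocation
  }
  return allocationSize - offset >= requiredSize;
}
```

When a region is carved out of a larger allocation, `alignUp` can also be used to pick the first usable offset, in which case the allocation needs room for the padding as well as the required size.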

### 4. Register external buffer with QNN

External buffers can be registered with QNN for storing weights or as a spill-fill buffer. It is possible to register only one, both, or neither of these.
If no buffer is registered for a certain purpose, QNN will allocate it internally.

The external weights buffer will be used by QNN to store all the weights for the given context including shared and graph weights.
Each context has to have its own external weights buffer, meaning that multiple contexts cannot share the same external weights buffer.

The external spill-fill buffer will be shared between all graphs of a given context. Accordingly, the required size for this buffer is the largest out of
all the spill-fill buffers for all the graphs of a given context.
External spill-fill buffers can also be shared between graphs of multiple contexts by registering the same external spill-fill buffer with multiple contexts.

Users have to make sure that the following requirements are met for external weights and spill-fill buffers:

- File descriptors registered for `QNN_HTP_MEM_WEIGHTS_BUFFER` or `QNN_HTP_MEM_SHARED_SPILLFILL_BUFFER` buffer types cannot be used for other buffer types (like `QNN_MEM_TYPE_ION` or `QNN_HTP_MEM_SHARED_BUFFER`).
- External weights buffers are unique for each context.
- Graphs sharing the same external spill-fill buffer cannot be executed in parallel.
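Two consequences of sharing a spill-fill buffer can be sketched in code. First, a buffer registered with several contexts has to satisfy the largest size any of them reports; this is an inference from the size requirement above and is not stated explicitly in this tutorial. Second, executions of graphs that share the buffer have to be serialized. Both `sharedSpillFillSize` and `executeSerialized` are hypothetical helpers, and the mutex is just one possible way to serialize execution on the client side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical helper: given the maximum spill-fill size reported by each
// context, returns a size large enough for a buffer shared by all of them.
std::uint64_t sharedSpillFillSize(const std::vector<std::uint64_t>& perContextSizes) {
  assert(!perContextSizes.empty());
  return *std::max_element(perContextSizes.begin(), perContextSizes.end());
}

// Hypothetical helper: runs `execute` while holding the lock associated with
// one shared spill-fill buffer, so that graphs sharing the buffer never run
// at the same time. Returns whatever `execute` returns.
template <typename ExecuteFn>
auto executeSerialized(std::mutex& spillFillLock, ExecuteFn&& execute)
    -> decltype(execute()) {
  std::lock_guard<std::mutex> guard(spillFillLock);
  return execute();
}
```

In practice there would be one mutex per shared spill-fill buffer, and the callable passed to `executeSerialized` would wrap the graph execution call.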

### 5. Finalize context

In this step the context will be deserialized and its weights will be copied to the external weights buffer - if registered - by QNN.
Users have to make sure that the `binaryBuffer` and `config` pointers provided as arguments during context creation remain valid until context finalization completes.

After this step, external weights and spill-fill buffers registered with the context cannot be deregistered because they are in use by the context. These buffers
will be automatically deregistered by QNN when the context is freed.
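The ordering rules in steps 1 to 5 can be summarized as a small state tracker. The sketch below is client-side bookkeeping only: `ContextPhase` and `ExternalBufferTracker` are hypothetical names, QNN exposes no such type, and rejecting registration after finalization is an assumption drawn from the fact that finalization is when weights are copied into the registered buffer.

```cpp
#include <cassert>

// Hypothetical bookkeeping types; not part of the QNN API.
enum class ContextPhase { Created, Finalized, Freed };

class ExternalBufferTracker {
 public:
  // Buffers are registered after context creation and before finalization.
  bool registerBuffer() {
    if (phase_ != ContextPhase::Created) return false;
    ++registered_;
    return true;
  }
  // Deregistration is rejected once the context has been finalized.
  bool deregisterBuffer() {
    if (phase_ != ContextPhase::Created || registered_ == 0) return false;
    --registered_;
    return true;
  }
  bool finalize() {
    if (phase_ != ContextPhase::Created) return false;
    phase_ = ContextPhase::Finalized;
    return true;
  }
  // Freeing the context deregisters its external buffers automatically.
  void freeContext() {
    assert(phase_ != ContextPhase::Freed);  // a context is freed only once
    phase_ = ContextPhase::Freed;
    registered_ = 0;
  }
  // DMA memory may be released only while nothing is registered with QNN.
  bool canFreeDmaBuffers() const { return registered_ == 0; }

 private:
  ContextPhase phase_ = ContextPhase::Created;
  unsigned registered_ = 0;
};
```

The practical takeaway is the teardown order: free the context first, then release the DMA buffers.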

### Code example

HTP External weights and spill-fill buffer example

    // QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
    QnnInterface_t qnnInterface;
    // Init qnn interface ......
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

    // Step 1. Create context using DEFER_GRAPH_INIT config option
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details on loading context
    Qnn_ContextHandle_t context;
    QnnContext_Config_t contextConfig = QNN_CONTEXT_CONFIG_INIT;
    contextConfig.option              = QNN_CONTEXT_CONFIG_OPTION_DEFER_GRAPH_INIT;
    contextConfig.isGraphInitDeferred = 1;
    const QnnContext_Config_t* pContextConfig[] = {&contextConfig, NULL};
    const Qnn_ErrorHandle_t contextCreateReturn =
        m_qnnFunctionPointers.qnnInterface.contextCreateFromBinary(backendHandle,
                                                                   deviceHandle,
                                                                   pContextConfig,
                                                                   reinterpret_cast<void*>(readBuffer),
                                                                   bufferSize,
                                                                   &context,
                                                                   profileBackendHandle);
    if (contextCreateReturn != QNN_SUCCESS) {
        // handle error
    }

    // Step 2. Retrieve context properties

    // Sizes start at 0 so that a failed query does not trigger an allocation in Step 3
    uint64_t weightsBufferSize = 0, spillfillBufferSize = 0;
    uint64_t bufferStartAlignmentBytes = 0;
    QnnContext_Property_t contextProperty = QNN_CONTEXT_PROPERTY_INIT;
    contextProperty.option                = QNN_CONTEXT_PROPERTY_OPTION_CUSTOM;
    QnnContext_Property_t* contextProperties[] = {&contextProperty, nullptr};

    // Retrieve required external weights buffer size
    QnnHtpContext_CustomProperty_t weightsCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
    weightsCustomProperty.option                         = QNN_HTP_CONTEXT_GET_PROP_WEIGHTS_BUFFER_SIZE;
    weightsCustomProperty.weightsBufferSize              = -1;
    contextProperty.customProperty                       = &weightsCustomProperty;

    const Qnn_ErrorHandle_t weightsPropertyError =
        m_qnnFunctionPointers.qnnInterface.contextGetProperty(context, contextProperties);
    if (weightsPropertyError != QNN_SUCCESS) {
        // handle error
    } else {
        weightsBufferSize = weightsCustomProperty.weightsBufferSize;
    }

    // Retrieve required external spill-fill buffer size
    QnnHtpContext_CustomProperty_t spillfillCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
    spillfillCustomProperty.option                         = QNN_HTP_CONTEXT_GET_PROP_MAX_SPILLFILL_BUFFER_SIZE;
    spillfillCustomProperty.spillfillBufferSize            = -1;
    contextProperty.customProperty                         = &spillfillCustomProperty;

    const Qnn_ErrorHandle_t spillfillPropertyError =
        m_qnnFunctionPointers.qnnInterface.contextGetProperty(context, contextProperties);
    if (spillfillPropertyError != QNN_SUCCESS) {
        // handle error
    } else {
        spillfillBufferSize = spillfillCustomProperty.spillfillBufferSize;
    }

    // Retrieve alignment requirement for external buffers
    QnnHtpContext_CustomProperty_t alignmentCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
    alignmentCustomProperty.option                         = QNN_HTP_CONTEXT_GET_PROP_BUFFER_START_ALIGNMENT;
    alignmentCustomProperty.bufferStartAlignment           = -1;
    contextProperty.customProperty                         = &alignmentCustomProperty;

    const Qnn_ErrorHandle_t alignmentPropertyError =
        m_qnnFunctionPointers.qnnInterface.contextGetProperty(context, contextProperties);
    if (alignmentPropertyError != QNN_SUCCESS) {
        // handle error
    } else {
        bufferStartAlignmentBytes = alignmentCustomProperty.bufferStartAlignment;
    }

    // Step 3. Allocate external buffers
    // DMA buffers can be allocated using custom allocator or rpcmem_alloc
    // The start addresses - determined by the fd and the offset together - must be aligned to bufferStartAlignmentBytes
    // Same values as in the shared buffer examples above
    #define RPCMEM_HEAP_ID_SYSTEM 25
    #define RPCMEM_DEFAULT_FLAGS 1

    uint8_t* weightsBufferPointer = nullptr, *spillfillBufferPointer = nullptr;
    int weightsBufferMemFd = -1, spillfillBufferMemFd = -1;
    int weightsBufferOffset = 0, spillfillBufferOffset = 0;
    if (weightsBufferSize > 0) {
        weightsBufferPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, weightsBufferSize);
        if (nullptr == weightsBufferPointer) {
            // handle errors
        }
        weightsBufferMemFd = rpcmem_to_fd(weightsBufferPointer);
        if (-1 == weightsBufferMemFd) {
            // handle errors
        }
    }
    if (spillfillBufferSize > 0) {
        spillfillBufferPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, spillfillBufferSize);
        if (nullptr == spillfillBufferPointer) {
            // handle errors
        }
        spillfillBufferMemFd = rpcmem_to_fd(spillfillBufferPointer);
        if (-1 == spillfillBufferMemFd) {
            // handle errors
        }
    }

    // Step 4. Register external buffers with QNN
    // Since these buffers are empty, memShape and dataType are not applicable
    Qnn_MemHandle_t weightsMemHandle, spillfillMemHandle;
    Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
    memDescriptor.memType             = QNN_MEM_TYPE_CUSTOM;

    QnnMemHtp_Descriptor_t weightsHtpMemDescriptor;
    weightsHtpMemDescriptor.type = QNN_HTP_MEM_WEIGHTS_BUFFER;
    weightsHtpMemDescriptor.size = weightsBufferSize;
    QnnHtpMem_SharedBufferConfig_t weightsSharedBuffConfig = {weightsBufferMemFd, weightsBufferOffset};
    weightsHtpMemDescriptor.weightsBufferConfig = weightsSharedBuffConfig;

    memDescriptor.customInfo = &weightsHtpMemDescriptor;
    const Qnn_ErrorHandle_t weightsBufferError =
        m_qnnFunctionPointers.qnnInterface.memRegister(context, &memDescriptor, 1u, &weightsMemHandle);
    if (weightsBufferError != QNN_SUCCESS) {
        // handle error
    }

    QnnMemHtp_Descriptor_t spillfillHtpMemDescriptor;
    spillfillHtpMemDescriptor.type = QNN_HTP_MEM_SHARED_SPILLFILL_BUFFER;
    spillfillHtpMemDescriptor.size = spillfillBufferSize;
    QnnHtpMem_SharedBufferConfig_t spillfillSharedBuffConfig = {spillfillBufferMemFd, spillfillBufferOffset};
    spillfillHtpMemDescriptor.spillfillBufferConfig = spillfillSharedBuffConfig;

    memDescriptor.customInfo = &spillfillHtpMemDescriptor;
    const Qnn_ErrorHandle_t spillfillBufferError =
        m_qnnFunctionPointers.qnnInterface.memRegister(context, &memDescriptor, 1u, &spillfillMemHandle);
    if (spillfillBufferError != QNN_SUCCESS) {
        // handle error
    }

    // Step 5. Finalize context
    // pContextConfig and readBuffer arguments used to create the context still have to be valid at this point
    const Qnn_ErrorHandle_t contextFinalizeError = m_qnnFunctionPointers.qnnInterface.contextFinalize(context, profileBackendHandle);
    if (contextFinalizeError != QNN_SUCCESS) {
        // handle error
    }

    // Obtain and save graph handles for each graph present in the context
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details
    for (size_t graphIdx = 0; graphIdx < m_graphsCount; graphIdx++) {
        if (QNN_SUCCESS != m_qnnFunctionPointers.qnnInterface.graphRetrieve(
                context, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) {
            // handle error
        }
    }

    // Execute graphs ...
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

    // Free context (this also deregisters the external buffers)
    // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details
    if (QNN_CONTEXT_NO_ERROR != m_qnnFunctionPointers.qnnInterface.contextFree(context, profileBackendHandle)) {
        // handle error
    }

    // Free DMA buffers
    rpcmem_free(weightsBufferPointer);
    rpcmem_free(spillfillBufferPointer);
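The examples in this tutorial call `rpcmem_free` by hand, which is easy to miss on the many "handle errors" paths. As a minimal sketch, the allocation can be owned by a small RAII wrapper that releases the buffer when it goes out of scope. `RpcMemBuffer` is a hypothetical class, not part of QNN or `libcdsprpc.so`; it takes the function pointers resolved earlier as constructor arguments so that it does not depend on global state.

```cpp
#include <cassert>
#include <cstdint>

// Function pointer types matching the rpcmem typedefs shown earlier.
using RpcMemAllocFn_t = void* (*)(int, std::uint32_t, int);
using RpcMemFreeFn_t  = void (*)(void*);

// Hypothetical RAII owner for one rpcmem allocation.
class RpcMemBuffer {
 public:
  RpcMemBuffer(RpcMemAllocFn_t allocFn, RpcMemFreeFn_t freeFn, int heapId,
               std::uint32_t flags, int size)
      : freeFn_(freeFn) {
    assert(allocFn != nullptr && freeFn != nullptr);
    ptr_ = allocFn(heapId, flags, size);  // stays nullptr on failure
  }
  ~RpcMemBuffer() { reset(); }

  // Non-copyable so that the buffer is never freed twice.
  RpcMemBuffer(const RpcMemBuffer&) = delete;
  RpcMemBuffer& operator=(const RpcMemBuffer&) = delete;

  // Moving transfers ownership and leaves the source empty.
  RpcMemBuffer(RpcMemBuffer&& other) noexcept
      : freeFn_(other.freeFn_), ptr_(other.ptr_) {
    other.ptr_ = nullptr;
  }

  void* get() const { return ptr_; }

  // Frees the buffer now instead of waiting for the destructor.
  void reset() {
    if (ptr_ != nullptr) {
      freeFn_(ptr_);
      ptr_ = nullptr;
    }
  }

 private:
  RpcMemFreeFn_t freeFn_;
  void* ptr_ = nullptr;
};
```

Since a buffer must not be deallocated while it is registered with QNN, a wrapper like this would need to outlive the registration: deregister the memory handle, or free the context for external weights and spill-fill buffers, before the wrapper is destroyed.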

Last Published: Oct 10, 2025
