# QNN HTP Shared Buffer Tutorial

## Introduction

This tutorial describes how to use data buffers that are shared between processing domains in the QNN HTP backend. Using shared buffers can eliminate data copies between client code on the host CPU and the HTP accelerator.

The HTP backend supports multiple types of shared memory buffers, as summarized in the table below. On Android platforms, these shared memory types are typically used in two use cases:

1. **FastRPC**: QNN APIs are invoked on the host CPU and executed on the DSP via FastRPC.
2. **Non-RPC (Native HTP)**: QNN APIs are invoked natively on the DSP.

The table below indicates which buffer types are supported in each use case.

| Qnn\_MemDescriptor\_t Type | QnnMemHtp\_Descriptor\_t Type | Descriptor | FastRPC | Non-RPC |
| --- | --- | --- | --- | --- |
| QNN\_MEM\_TYPE\_ION | Not Applicable | Each tensor is mapped to its own shared buffer. One-to-one relationship between the file descriptor and the memory handle. | | |
| QNN\_MEM\_TYPE\_CUSTOM | QNN\_HTP\_MEM\_SHARED\_BUFFER | Multiple tensors are mapped to one shared buffer. One-to-many relationship between the file descriptor and the memory handles. | | |
| QNN\_MEM\_TYPE\_CUSTOM | QNN\_HTP\_MEM\_WEIGHTS\_BUFFER | An empty DMA buffer to store all the weights for the context. | Supported | |
| QNN\_MEM\_TYPE\_CUSTOM | QNN\_HTP\_MEM\_SHARED\_SPILLFILL\_BUFFER | An empty buffer to use as a shared spill-fill buffer for all graphs in the context. | Supported | Supported |
| QNN\_MEM\_TYPE\_CUSTOM | QNN\_HTP\_MEM\_SHARED\_VTCMBACKUP\_BUFFER | An empty buffer to use as a shared VTCM backup buffer for all graphs in the context. | Not Supported | Supported |

**Note**

This tutorial focuses only on shared buffer usage. The SDK example code has some prerequisites that are not discussed in detail here. Refer to the corresponding part of the QNN documentation, or to the SampleApp.

SampleApp documentation: general/sample\_app:Sample App Tutorial

SampleApp code: ${QNN\_SDK\_ROOT}/examples/QNN/SampleApp

## Loading prerequisite shared libraries to use the RPCMem framework

A hardware device equipped with a Qualcomm chipset includes a shared library that provides the functions for shared buffer manipulation.

### Loading shared library

The `libcdsprpc.so` shared library is available on most mainstream devices equipped with a Qualcomm chipset (SD888 and later). It can be loaded dynamically as shown below:

```cpp
#include <dlfcn.h>  // dlopen, dlsym, dlclose

void* libCdspHandle = dlopen("libcdsprpc.so", RTLD_NOW | RTLD_LOCAL);

if (nullptr == libCdspHandle) {
  // handle errors
}
```

### Resolving Symbols

After the shared library is successfully loaded, we can proceed to resolve all necessary symbols. The code snippet below shows a template for resolving symbols in a shared library:

```cpp
#include <cstdint>  // uint32_t

/**
 * Definition: void* rpcmem_alloc(int heapid, uint32 flags, int size);
 * Allocate a buffer via ION and register it with the FastRPC framework.
 * @param[in] heapid  Heap ID to use for memory allocation.
 * @param[in] flags   ION flags to use for memory allocation.
 * @param[in] size    Buffer size to allocate.
 * @return            Pointer to the buffer on success; NULL on failure.
 */
typedef void *(*RpcMemAllocFn_t)(int, uint32_t, int);

/**
 * Definition: void rpcmem_free(void* po);
 * Free a buffer and ignore invalid buffers.
 */
typedef void (*RpcMemFreeFn_t)(void *);

/**
 * Definition: int rpcmem_to_fd(void* po);
 * Return an associated file descriptor.
 * @param[in] po  Data pointer for an RPCMEM-allocated buffer.
 * @return        Buffer file descriptor.
 */
typedef int (*RpcMemToFdFn_t)(void *);

RpcMemAllocFn_t rpcmem_alloc = (RpcMemAllocFn_t)dlsym(libCdspHandle, "rpcmem_alloc");
RpcMemFreeFn_t rpcmem_free = (RpcMemFreeFn_t)dlsym(libCdspHandle, "rpcmem_free");
RpcMemToFdFn_t rpcmem_to_fd = (RpcMemToFdFn_t)dlsym(libCdspHandle, "rpcmem_to_fd");
if (nullptr == rpcmem_alloc || nullptr == rpcmem_free || nullptr == rpcmem_to_fd) {
  dlclose(libCdspHandle);
  // handle errors
}
```

## Using QNN\_MEM\_TYPE\_ION with QNN API

The figure below shows ION shared buffers, where each tensor has its own shared buffer with its own unique memory pointer, file descriptor, and memory handle.

![ION shared buffer diagram](../../_static/resources/htp_shared_buffer/ION_Shared_Buffer.png)

An example is shown below:

HTP Shared Buffer Example

```cpp
// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
Qnn_Tensor_t inputTensor;
// Set up common setting for inputTensor ......
/* There are 2 specific settings for shared buffer
 * (both are applied just before memRegister() below):
 * 1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
 * 2. union member memHandle should be used instead of clientBuf, and it
 *    should be set to nullptr.
 */

size_t bufSize;
// Calculate the bufSize based on tensor dimensions and data type ......

#define RPCMEM_HEAP_ID_SYSTEM 25
#define RPCMEM_DEFAULT_FLAGS 1

// Allocate the shared buffer
uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, bufSize);
if (nullptr == memPointer) {
  // handle errors
}

// Get a file descriptor for the buffer
int memFd = rpcmem_to_fd(memPointer);
if (-1 == memFd) {
  // handle errors
}

// Fill the info of Qnn_MemDescriptor_t and register the buffer to QNN
// Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
memDescriptor.memShape = {inputTensor.rank, inputTensor.dimensions, nullptr};
memDescriptor.dataType = inputTensor.dataType;
memDescriptor.memType = QNN_MEM_TYPE_ION;
memDescriptor.ionInfo.fd = memFd;
inputTensor.memType = QNN_TENSORMEMTYPE_MEMHANDLE;
inputTensor.memHandle = nullptr;
Qnn_ContextHandle_t context;  // Must obtain a QNN context handle before memRegister()
// To obtain QNN context handle:
// For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
// For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary
Qnn_ErrorHandle_t registRet = qnnInterface.memRegister(context, &memDescriptor, 1u, &(inputTensor.memHandle));
if (QNN_SUCCESS != registRet) {
  rpcmem_free(memPointer);
  // handle errors
}

/**
 * At this point, the allocation and registration of the shared buffer are complete.
 * On the QNN side, the buffer is bound by memFd.
 * On the user side, the buffer can be manipulated through memPointer.
 */

/**
 * Optionally, the user can also allocate and register a shared buffer for the output
 * in the same way as for the input above. If so, the output buffer should also be
 * deregistered and freed as shown at the end of this example.
 */

// Load the input data to memPointer ......

// Execute QNN graph with input tensor and output tensor ......

// Get output data ......

// Deregister and free all buffers once they are no longer in use
Qnn_ErrorHandle_t deregisterRet = qnnInterface.memDeRegister(&(inputTensor.memHandle), 1);
if (QNN_SUCCESS != deregisterRet) {
  // handle errors
}
rpcmem_free(memPointer);
```

## Using QNN\_HTP\_MEM\_SHARED\_BUFFER with QNN API

The figure below shows a multi-tensor shared buffer, where a group of tensors is mapped to a single shared buffer. This single shared buffer has one memory pointer and one file descriptor; each tensor, however, has its own offset from that pointer and its own memory handle.
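The per-tensor offsets mentioned above follow directly from the tensor sizes. The sketch below is a minimal illustration and not SDK code: `TensorSpec`, `tensorBytes`, and `packOffsets` are hypothetical names, and the element size is passed in directly instead of being derived from a `Qnn_DataType_t` value. It also assumes that tensors are packed back to back with no extra alignment between them, which may not hold for every target.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shape/type information carried by Qnn_Tensor_t.
struct TensorSpec {
  std::vector<uint32_t> dimensions;  // e.g. {1, 224, 224, 3}
  uint32_t bytesPerElement;          // e.g. 1 for 8-bit data, 4 for 32-bit float
};

// Size in bytes of one tensor: product of all dimensions times the element size.
uint64_t tensorBytes(const TensorSpec& spec) {
  uint64_t count = 1;
  for (uint32_t dim : spec.dimensions) {
    count *= dim;
  }
  return count * spec.bytesPerElement;
}

// Assigns each tensor a byte offset into one shared buffer (tensors packed
// back to back) and reports the total buffer size through totalBytes.
std::vector<uint64_t> packOffsets(const std::vector<TensorSpec>& specs, uint64_t& totalBytes) {
  std::vector<uint64_t> offsets;
  offsets.reserve(specs.size());
  totalBytes = 0;
  for (const TensorSpec& spec : specs) {
    offsets.push_back(totalBytes);
    totalBytes += tensorBytes(spec);
  }
  return offsets;
}
```

With this layout, `totalBytes` would be the size passed to the allocator, and each entry of the returned vector would be the offset recorded for the corresponding tensor when its memory handle is registered.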
![Multi-tensor shared buffer diagram](../../_static/resources/htp_shared_buffer/Multi_Tensor_Shared_Buffer.png)

An example is shown below:

HTP Multi-Tensor Shared Buffer Example

```cpp
#include <vector>

// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Total number of input tensors
size_t numTensors;

// Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
std::vector<Qnn_Tensor_t> inputTensors(numTensors);
// Set up common setting for inputTensors ......
/* There are 2 specific settings for shared buffer
 * (both are applied inside the registration loop below):
 * 1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
 * 2. union member memHandle should be used instead of clientBuf, and it
 *    should be set to nullptr.
 */

// Calculate the shared buffer size
uint64_t totalBufferSize = 0;
std::vector<uint64_t> tensorSizes(numTensors, 0);
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  // Calculate tensorSizes[tensorIdx] based on tensor dimensions and data type ......
  totalBufferSize += tensorSizes[tensorIdx];
}

#define RPCMEM_HEAP_ID_SYSTEM 25
#define RPCMEM_DEFAULT_FLAGS 1

// Allocate the shared buffer
uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, totalBufferSize);
if (nullptr == memPointer) {
  // handle errors
}

// Get a file descriptor for the buffer
int memFd = rpcmem_to_fd(memPointer);
if (-1 == memFd) {
  // handle errors
}

Qnn_ContextHandle_t context;  // Must obtain a QNN context handle before memRegister()
// To obtain QNN context handle:
// For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
// For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary

// Register the memory handles using memory descriptors
// This is the offset of the tensor location in the shared buffer
uint64_t offset = 0;
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  // Fill the info of Qnn_MemDescriptor_t and register the descriptor to QNN
  // Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
  Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
  memDescriptor.memShape = {inputTensors[tensorIdx].rank, inputTensors[tensorIdx].dimensions, nullptr};
  memDescriptor.dataType = inputTensors[tensorIdx].dataType;
  memDescriptor.memType = QNN_MEM_TYPE_CUSTOM;
  inputTensors[tensorIdx].memType = QNN_TENSORMEMTYPE_MEMHANDLE;
  inputTensors[tensorIdx].memHandle = nullptr;

  // Fill the info of QnnMemHtp_Descriptor_t and set as custom info
  // QnnMemHtp_Descriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/HTP/QnnHtpMem.h
  QnnMemHtp_Descriptor_t htpMemDescriptor;
  htpMemDescriptor.type = QNN_HTP_MEM_SHARED_BUFFER;
  htpMemDescriptor.size = totalBufferSize;  // Note: this is the total buffer size

  QnnHtpMem_SharedBufferConfig_t htpSharedBuffConfig = {memFd, offset};
  htpMemDescriptor.sharedBufferConfig = htpSharedBuffConfig;

  memDescriptor.customInfo = &htpMemDescriptor;

  Qnn_ErrorHandle_t registRet = qnnInterface.memRegister(context, &memDescriptor, 1u, &(inputTensors[tensorIdx].memHandle));
  if (QNN_SUCCESS != registRet) {
    // Deregister already created memory handles
    rpcmem_free(memPointer);
    // handle errors
  }

  // Move the offset by the tensor size
  offset += tensorSizes[tensorIdx];
}

/**
 * At this point, the allocation and registration of the shared buffer are complete.
 * On the QNN side, the buffer is bound by memFd.
 * On the user side, the buffer can be manipulated through memPointer and offset.
 */

/**
 * Optionally, the user can also allocate and register a shared buffer for the outputs
 * in the same way as for the inputs above. If so, the output buffer should also be
 * deregistered and freed as shown at the end of this example.
 */

// Load the input data to memPointer with the respective offsets ......

// Execute QNN graph with input tensors and output tensors ......

// Get output data from the memPointer and offset combination ......

// Deregister all memory handles and free the buffer once it is no longer in use
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  Qnn_ErrorHandle_t deregisterRet = qnnInterface.memDeRegister(&(inputTensors[tensorIdx].memHandle), 1);
  if (QNN_SUCCESS != deregisterRet) {
    // handle errors
  }
}
rpcmem_free(memPointer);
```

## FastRPC Use Case: Using QNN\_HTP\_MEM\_WEIGHTS\_BUFFER and QNN\_HTP\_MEM\_SHARED\_SPILLFILL\_BUFFER with QNN API

**Note**

Currently, the external weights and spill-fill buffers feature has the following limitations:

- It is supported on Android platforms in FastRPC use cases.
- It is only supported when the context is created using the `QnnContext_createFromBinary` API. For example, it is not supported with other APIs such as `QnnContext_createFromBinaryListAsync`.
- It is not supported together with the `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_MULTI_CONTEXTS` and `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING` context configurations.
- It is not supported together with the graph switching feature.
- It is not supported together with the udma64 feature. The udma64 feature can be turned off during the graph prepare phase.
- It is not supported together with the securepd model protection feature.

Example in backend extension config:

```
{
  "backend_extensions" : {...},
  "context_configs" :
  {
    "spill_fill_buffer" : int64_value,
    "weights_buffer" : int64_value
  },
  "graph_configs" : [{...}]
}
```

- **spill\_fill\_buffer**: This field places the spill-fill data in a buffer shared between the client and the backend. The default is 0.
  - A value greater than 0 allocates the exact spill-fill size given by the user.
  - A value of 0 disables this feature.
  - A value of -1 allocates the spill-fill size given by QNN.
- **weights\_buffer**: This field places the weights in a buffer shared between the client and the backend. The default is 0.
  - A value greater than 0 allocates the exact buffer size given by the user.
  - A value of 0 disables this feature.
  - A value of -1 allocates the weight size given by QNN.

Steps to use the external weights and spill-fill buffer feature:

### 1. Create context

The context has to be created using the `QnnContext_createFromBinary` API with the `DEFER_GRAPH_INIT` config option enabled. In this step the context is not yet deserialized; only a context handle is created to enable buffer registration.

### 2. Retrieve context properties

Context properties can be used to retrieve buffer sizes and alignment requirements.

### 3. Allocate external buffer

Users have to ensure that the following requirements are met for external weights and spill-fill buffers:

- They have to be DMA buffers.
- Their start addresses, determined by the fd and the offset together, must be aligned according to the `BUFFER_START_ALIGNMENT` context property (for example, to 4 KB).
- They must be at least the required size determined by context properties.
- File descriptors must not be registered with fastrpc (using `fastrpc_mmap`).
- Their contents should only be modified by QNN; otherwise behavior is undefined.
- They must not be deallocated while registered with QNN.

### 4. Register external buffer with QNN

External buffers can be registered with QNN for storing weights or as a spill-fill buffer. It is possible to register only one, both, or neither of these. If no buffer is registered for a certain purpose, QNN will allocate it internally.

The external weights buffer is used by QNN to store all the weights for the given context, including shared and graph weights. Each context has to have its own external weights buffer, meaning that multiple contexts cannot share the same external weights buffer.

The external spill-fill buffer is shared between all graphs of a given context. Accordingly, the required size for this buffer is the largest of the spill-fill buffer sizes of all the graphs in that context. External spill-fill buffers can also be shared between graphs of multiple contexts by registering the same external spill-fill buffer with multiple contexts.

Users have to make sure that the following requirements are met for external weights and spill-fill buffers:

- File descriptors registered for `QNN_HTP_MEM_WEIGHTS_BUFFER` or `QNN_HTP_MEM_SHARED_SPILLFILL_BUFFER` buffer types cannot be used for other buffer types (like `QNN_MEM_TYPE_ION` or `QNN_HTP_MEM_SHARED_BUFFER`).
- External weights buffers are unique for each context.
- Graphs sharing the same external spill-fill buffer cannot be executed in parallel.

### 5. Finalize context

In this step the context is deserialized and QNN copies its weights to the external weights buffer, if one is registered. Users have to make sure that the `binaryBuffer` and `config` pointers provided as arguments during context creation remain valid until the end of context finalization.

After this step, external weights and spill-fill buffers registered with the context cannot be deregistered because they are in use by the context. These buffers are automatically deregistered by QNN when the context is freed.

### Code example

FastRPC Use Case: HTP External weights and spill-fill buffer example

```cpp
// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Step 1. Create context using DEFER_GRAPH_INIT config option
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details on loading context
Qnn_ContextHandle_t context;
QnnContext_Config_t contextConfig = QNN_CONTEXT_CONFIG_INIT;
contextConfig.option = QNN_CONTEXT_CONFIG_OPTION_DEFER_GRAPH_INIT;
contextConfig.isGraphInitDeferred = 1;
const QnnContext_Config_t* pContextConfig[] = {&contextConfig, nullptr};
const Qnn_ErrorHandle_t contextCreateReturn =
    qnnInterface.contextCreateFromBinary(backendHandle,
                                         deviceHandle,
                                         pContextConfig,
                                         reinterpret_cast<const void*>(readBuffer),
                                         bufferSize,
                                         &context,
                                         profileBackendHandle);
if (contextCreateReturn != QNN_SUCCESS) {
  // handle error
}

// Step 2. Retrieve context properties
// Sizes start at 0 so that a failed query does not trigger an allocation in Step 3
uint64_t weightsBufferSize = 0, spillfillBufferSize = 0;
uint64_t bufferStartAlignmentBytes = 0;
QnnContext_Property_t contextProperty = QNN_CONTEXT_PROPERTY_INIT;
contextProperty.option = QNN_CONTEXT_PROPERTY_OPTION_CUSTOM;
QnnContext_Property_t* contextProperties[] = {&contextProperty, nullptr};

// Retrieve required external weights buffer size
QnnHtpContext_CustomProperty_t weightsCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
weightsCustomProperty.option = QNN_HTP_CONTEXT_GET_PROP_WEIGHTS_BUFFER_SIZE;
weightsCustomProperty.weightsBufferSize = -1;
contextProperty.customProperty = &weightsCustomProperty;

const Qnn_ErrorHandle_t weightsPropertyError =
    qnnInterface.contextGetProperty(context, contextProperties);
if (weightsPropertyError != QNN_SUCCESS) {
  // handle error
} else {
  weightsBufferSize = weightsCustomProperty.weightsBufferSize;
}

// Retrieve required external spill-fill buffer size
QnnHtpContext_CustomProperty_t spillfillCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
spillfillCustomProperty.option = QNN_HTP_CONTEXT_GET_PROP_MAX_SPILLFILL_BUFFER_SIZE;
spillfillCustomProperty.spillfillBufferSize = -1;
contextProperty.customProperty = &spillfillCustomProperty;

const Qnn_ErrorHandle_t spillfillPropertyError =
    qnnInterface.contextGetProperty(context, contextProperties);
if (spillfillPropertyError != QNN_SUCCESS) {
  // handle error
} else {
  spillfillBufferSize = spillfillCustomProperty.spillfillBufferSize;
}

// Retrieve alignment requirement for external buffers
QnnHtpContext_CustomProperty_t alignmentCustomProperty = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT;
alignmentCustomProperty.option = QNN_HTP_CONTEXT_GET_PROP_BUFFER_START_ALIGNMENT;
alignmentCustomProperty.bufferStartAlignment = -1;
contextProperty.customProperty = &alignmentCustomProperty;

const Qnn_ErrorHandle_t alignmentPropertyError =
    qnnInterface.contextGetProperty(context, contextProperties);
if (alignmentPropertyError != QNN_SUCCESS) {
  // handle error
} else {
  bufferStartAlignmentBytes = alignmentCustomProperty.bufferStartAlignment;
}

// Step 3. Allocate external buffers
// DMA buffers can be allocated using a custom allocator or rpcmem_alloc
// The start addresses, determined by the fd and the offset together,
// must be aligned to bufferStartAlignmentBytes
uint8_t* weightsBufferPointer = nullptr, *spillfillBufferPointer = nullptr;
int weightsBufferMemFd = -1, spillfillBufferMemFd = -1;
int weightsBufferOffset = 0, spillfillBufferOffset = 0;
if (weightsBufferSize > 0) {
  weightsBufferPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, weightsBufferSize);
  if (nullptr == weightsBufferPointer) {
    // handle errors
  }
  weightsBufferMemFd = rpcmem_to_fd(weightsBufferPointer);
  if (-1 == weightsBufferMemFd) {
    // handle errors
  }
}
if (spillfillBufferSize > 0) {
  spillfillBufferPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, spillfillBufferSize);
  if (nullptr == spillfillBufferPointer) {
    // handle errors
  }
  spillfillBufferMemFd = rpcmem_to_fd(spillfillBufferPointer);
  if (-1 == spillfillBufferMemFd) {
    // handle errors
  }
}

// Step 4. Register external buffers with QNN
// Since these buffers are empty, memShape and dataType are not applicable
Qnn_MemHandle_t weightsMemHandle, spillfillMemHandle;
Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
memDescriptor.memType = QNN_MEM_TYPE_CUSTOM;

QnnMemHtp_Descriptor_t weightsHtpMemDescriptor;
weightsHtpMemDescriptor.type = QNN_HTP_MEM_WEIGHTS_BUFFER;
weightsHtpMemDescriptor.size = weightsBufferSize;
QnnHtpMem_SharedBufferConfig_t weightsSharedBuffConfig = {weightsBufferMemFd, weightsBufferOffset};
weightsHtpMemDescriptor.weightsBufferConfig = weightsSharedBuffConfig;

memDescriptor.customInfo = &weightsHtpMemDescriptor;
const Qnn_ErrorHandle_t weightsBufferError =
    qnnInterface.memRegister(context, &memDescriptor, 1u, &weightsMemHandle);
if (weightsBufferError != QNN_SUCCESS) {
  // handle error
}

QnnMemHtp_Descriptor_t spillfillHtpMemDescriptor;
spillfillHtpMemDescriptor.type = QNN_HTP_MEM_SHARED_SPILLFILL_BUFFER;
spillfillHtpMemDescriptor.size = spillfillBufferSize;
QnnHtpMem_SharedBufferConfig_t spillfillSharedBuffConfig = {spillfillBufferMemFd, spillfillBufferOffset};
spillfillHtpMemDescriptor.spillfillBufferConfig = spillfillSharedBuffConfig;

memDescriptor.customInfo = &spillfillHtpMemDescriptor;
const Qnn_ErrorHandle_t spillfillBufferError =
    qnnInterface.memRegister(context, &memDescriptor, 1u, &spillfillMemHandle);
if (spillfillBufferError != QNN_SUCCESS) {
  // handle error
}

// Step 5. Finalize context
// pContextConfig and readBuffer arguments used to create the context still have to be valid at this point
const Qnn_ErrorHandle_t contextFinalizeError = qnnInterface.contextFinalize(context, profileBackendHandle);
if (contextFinalizeError != QNN_SUCCESS) {
  // handle error
}

// Obtain and save graph handles for each graph present in the context
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details
for (size_t graphIdx = 0; graphIdx < m_graphsCount; graphIdx++) {
  if (QNN_SUCCESS != qnnInterface.graphRetrieve(
                         context, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) {
    // handle error
  }
}

// Execute graphs ...
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Free context
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details
if (QNN_CONTEXT_NO_ERROR != qnnInterface.contextFree(context, profileBackendHandle)) {
  // handle error
}

// Free DMA buffers
rpcmem_free(weightsBufferPointer);
rpcmem_free(spillfillBufferPointer);
```

## Non-RPC Use Case: Using QNN\_HTP\_MEM\_SHARED\_SPILLFILL\_BUFFER and QNN\_HTP\_MEM\_SHARED\_VTCMBACKUP\_BUFFER with QNN API

This section describes how to use external spill-fill and VTCM backup buffers in Non-RPC use cases with concurrent buffer sharing. This feature enables multiple contexts to share the same external buffers while executing graphs in parallel, which reduces the total memory that has to be allocated.
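When one external buffer is shared by several contexts, it has to satisfy the size requirement of each of them. The sketch below illustrates that sizing rule, together with a helper for rounding an offset up to a start alignment. It is only an illustration under stated assumptions: `alignUp` and `sharedBufferSize` are hypothetical helpers rather than QNN APIs, and it assumes that each context only needs the buffer to be at least as large as the size it reports through its context properties.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Rounds value up to the next multiple of alignment (alignment must be non-zero).
uint64_t alignUp(uint64_t value, uint64_t alignment) {
  return ((value + alignment - 1) / alignment) * alignment;
}

// Smallest size that satisfies every context sharing one external buffer:
// the maximum of the sizes reported by the contexts. Returns 0 for an empty list.
uint64_t sharedBufferSize(const std::vector<uint64_t>& requiredSizes) {
  uint64_t size = 0;
  for (uint64_t required : requiredSizes) {
    size = std::max(size, required);
  }
  return size;
}
```

In practice, the entries of `requiredSizes` would be the values retrieved from each context's properties, and the result would be the minimum size to request from the allocator.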
**Note**

**Platform Support and Limitations:** The concurrent external spill-fill and VTCM backup buffer sharing feature has the following limitations:

- Supported on Android platforms in Non-RPC use cases
- Buffers must be allocated in DSP address space
- Only supported when the context is created using the `QnnContext_createFromBinary` API (not supported with `QnnContext_createFromBinaryListAsync`)
- Not supported with the `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_MULTI_CONTEXTS` context config option
- Not supported with the graph switching feature
- Not supported with the udma64 feature (can be disabled during the graph prepare phase)
- Not supported with the securepd model protection feature

**Note**

**Concurrent Buffer Sharing Requirements:** To enable concurrent external buffer sharing:

- Enable the `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING` config option in addition to `QNN_CONTEXT_CONFIG_OPTION_DEFER_GRAPH_INIT`
- All contexts sharing the same buffers must have the same priority level
- Both external spill-fill and VTCM backup buffers must be registered for each context

Steps to use concurrent external spill-fill and VTCM backup buffers in Non-RPC:

### 1. Create context with concurrent resource sharing

The context must be created using the `QnnContext_createFromBinary` API with both the `QNN_CONTEXT_CONFIG_OPTION_DEFER_GRAPH_INIT` and `QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING` config options enabled.

### 2. Retrieve context properties

Context properties can be passed to the `QnnContext_getProperty` API to retrieve the buffer sizes: `QNN_HTP_CONTEXT_GET_PROP_MAX_SPILLFILL_BUFFER_SIZE` for spill-fill buffers and `QNN_HTP_CONTEXT_GET_PROP_MAX_VTCMBACKUP_BUFFER_SIZE` for VTCM backup buffers. These sizes are calculated based on HTP internal optimizations, and it is recommended to use them directly for buffer allocation.

### 3. Allocate external buffers in DSP address space

Allocate buffers in the DSP address space with at least the required size determined from context properties.

**Buffer Requirements:**

- Buffers must be allocated in DSP address space
- Buffer size must be at least the value retrieved from context properties
- Buffer contents should only be modified by QNN (user modifications result in undefined behavior)
- Buffers must remain allocated while registered with QNN

### 4. Register external buffers with QNN

External spill-fill and VTCM backup buffers can be shared between graphs of multiple contexts by registering the same external buffer address using the `QnnMem_register` API. For concurrent execution, all contexts sharing the same buffers must have the same priority level, and both external spill-fill and VTCM backup buffers must be registered for each context.

### 5. Finalize context

After buffer registration, finalize the context to complete deserialization. Users have to make sure that the `binaryBuffer` and `config` pointers provided as arguments during context creation remain valid until the end of context finalization.

After this step, external spill-fill and VTCM backup buffers registered with the context cannot be deregistered because they are in use by the context. These buffers are automatically deregistered by QNN when the context is freed.

### Code example

Non-RPC Use Case: HTP concurrent external spill-fill and VTCM backup buffer sharing example

```cpp
// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Step 1. Create contexts with config options:
// 1) REGISTER_CONCURRENT_RESOURCE_SHARING: enable concurrent buffer sharing
// 2) DEFER_GRAPH_INIT: enable external buffer
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details on loading context
Qnn_ContextHandle_t context1, context2;

// Create the first context
QnnHtpContext_GroupRegistration_t concurrentGroupRegistrationCfg1;
// When DEFER_GRAPH_INIT is enabled, valid firstGroupHandle and maxSpillFillBuffer
// values are not required for external buffers
concurrentGroupRegistrationCfg1.firstGroupHandle = 0;
concurrentGroupRegistrationCfg1.maxSpillFillBuffer = 0;

QnnHtpContext_CustomConfig_t htpCustomConfig1;
htpCustomConfig1.option = QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING;
htpCustomConfig1.concurrentGroupRegistration = concurrentGroupRegistrationCfg1;

QnnContext_Config_t contextCustomConfig1 = QNN_CONTEXT_CONFIG_INIT;
contextCustomConfig1.option = QNN_CONTEXT_CONFIG_OPTION_CUSTOM;
contextCustomConfig1.customConfig = &htpCustomConfig1;

QnnContext_Config_t contextExtBufferCustomConfig = QNN_CONTEXT_CONFIG_INIT;
contextExtBufferCustomConfig.option = QNN_CONTEXT_CONFIG_OPTION_DEFER_GRAPH_INIT;
contextExtBufferCustomConfig.isGraphInitDeferred = true;
const QnnContext_Config_t* contextConfig1[] = {
    &contextCustomConfig1, &contextExtBufferCustomConfig, nullptr};

const Qnn_ErrorHandle_t contextCreateReturn1 =
    qnnInterface.contextCreateFromBinary(backendHandle,
                                         deviceHandle,
                                         contextConfig1,
                                         reinterpret_cast<const void*>(readBuffer),
                                         bufferSize,
                                         &context1,
                                         profileBackendHandle);
if (contextCreateReturn1 != QNN_SUCCESS) {
  // handle error
}

// Create the second context
QnnHtpContext_GroupRegistration_t concurrentGroupRegistrationCfg2;
// When DEFER_GRAPH_INIT is enabled, valid firstGroupHandle and maxSpillFillBuffer
// values are not required for external buffers
concurrentGroupRegistrationCfg2.firstGroupHandle = 0;
```
concurrentGroupRegistrationCfg2.maxSpillFillBuffer = 0; 51 52 QnnHtpContext_CustomConfig_t htpCustomConfig2; 53 htpCustomConfig2.option = QNN_HTP_CONTEXT_CONFIG_OPTION_REGISTER_CONCURRENT_RESOURCE_SHARING; 54 htpCustomConfig2.concurrentGroupRegistration = concurrentGroupRegistrationCfg2; 55 56 QnnContext_Config_t contextCustomConfig2; 57 contextCustomConfig2.option = QNN_CONTEXT_CONFIG_OPTION_CUSTOM; 58 contextCustomConfig2.customConfig = &htpCustomConfig2; 59 const QnnContext_Config_t* contextConfig2[] = { 60 &contextCustomConfig2, &contextExtBufferCustomConfig, nullptr}; 61 62 const Qnn_ErrorHandle_t contextCreateReturn2 = 63 qnnInterface.contextCreateFromBinary(backendHandle, 64 deviceHandle, 65 contextConfig2, 66 reinterpret_cast(readBuffer), 67 bufferSize, 68 &context2, 69 profileBackendHandle); 70 if(contextCreateReturn2 != QNN_SUCCESS) { 71 // handle error 72 } 73 74 // Step 2. Retrieve context properties for buffer sizes, this step should be done for 75 // each context individually 76 uint64_t requiredSpillFillBufferSize = 0, requiredVtcmBackupBufferSize = 0; 77 78 QnnContext_Property_t spillfillProperty = {}; 79 spillfillProperty.option = QNN_CONTEXT_PROPERTY_OPTION_CUSTOM; 80 QnnHtpContext_CustomProperty_t spillfillCustomProp = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT; 81 spillfillCustomProp.option = QNN_HTP_CONTEXT_GET_PROP_MAX_SPILLFILL_BUFFER_SIZE; 82 spillfillCustomProp.spillfillBufferSize = 0; 83 spillfillProperty.customProperty = &spillfillCustomProp; 84 85 QnnContext_Property_t vtcmProperty = {}; 86 vtcmProperty.option = QNN_CONTEXT_PROPERTY_OPTION_CUSTOM; 87 QnnHtpContext_CustomProperty_t vtcmCustomProp = QNN_HTP_CONTEXT_CUSTOM_PROPERTY_INIT; 88 vtcmCustomProp.option = QNN_HTP_CONTEXT_GET_PROP_MAX_VTCMBACKUP_BUFFER_SIZE; 89 vtcmCustomProp.vtcmBackupBufferSize = 0; 90 vtcmProperty.customProperty = &vtcmCustomProp; 91 92 QnnContext_Property_t* properties[] = {&spillfillProperty, &vtcmProperty, nullptr}; 93 94 const Qnn_ErrorHandle_t propertyError = 95 
qnnInterface.contextGetProperty(context1, properties); 96 if(propertyError != QNN_SUCCESS) { 97 // handle error 98 } else { 99 requiredSpillFillBufferSize = spillfillCustomProp.spillfillBufferSize; 100 requiredVtcmBackupBufferSize = vtcmCustomProp.vtcmBackupBufferSize; 101 } 102 103 // Step 3. Allocate external buffers in DSP address space 104 // Allocate spill-fill buffer for the context group 105 void* spillFillQurtVAPtr = malloc(requiredSpillFillBufferSize); 106 QnnHtpMem_QurtAddress_t spillFillQurtVA = reinterpret_cast(spillFillQurtVAPtr); 107 108 // Allocate VTCM backup buffer for the context group 109 void* vtcmBackupQurtVAPtr = malloc(requiredVtcmBackupBufferSize); 110 QnnHtpMem_QurtAddress_t vtcmBackupQurtVA = reinterpret_cast(vtcmBackupQurtVAPtr); 111 112 // Step 4. Register external buffers with QNN 113 // Setup spill-fill buffer descriptor 114 QnnMemHtp_Descriptor_t spillFillHtpDescriptor; 115 spillFillHtpDescriptor.type = QNN_HTP_MEM_SHARED_SPILLFILL_BUFFER; 116 spillFillHtpDescriptor.size = requiredSpillFillBufferSize; 117 spillFillHtpDescriptor.sharedSpillfillBufferQurtAddress = spillFillQurtVA; 118 119 Qnn_MemDescriptor_t spillFillMemDescriptor; 120 uint32_t dims = 1; 121 const uint32_t rank = 1; 122 const Qnn_MemShape_t qnnMemShapeInput = {rank, &dims, nullptr}; 123 spillFillMemDescriptor.memShape = qnnMemShapeInput; 124 spillFillMemDescriptor.dataType = QNN_DATATYPE_UINT_8; 125 spillFillMemDescriptor.memType = QNN_MEM_TYPE_CUSTOM; 126 spillFillMemDescriptor.customInfo = &spillFillHtpDescriptor; 127 Qnn_MemHandle_t spillFillMemHandle1 = nullptr; 128 Qnn_MemHandle_t spillFillMemHandle2 = nullptr; 129 130 // Setup VTCM backup buffer descriptor 131 QnnMemHtp_Descriptor_t vtcmBackupHtpDescriptor; 132 vtcmBackupHtpDescriptor.type = QNN_HTP_MEM_SHARED_VTCMBACKUP_BUFFER; 133 vtcmBackupHtpDescriptor.size = requiredVtcmBackupBufferSize; 134 vtcmBackupHtpDescriptor.sharedVTCMBackupBufferQurtAddress = vtcmBackupQurtVA; 135 136 Qnn_MemDescriptor_t 
vtcmBackupMemDescriptor; 137 vtcmBackupMemDescriptor.memShape = qnnMemShapeInput; 138 vtcmBackupMemDescriptor.dataType = QNN_DATATYPE_UINT_8; 139 vtcmBackupMemDescriptor.memType = QNN_MEM_TYPE_CUSTOM; 140 vtcmBackupMemDescriptor.customInfo = &vtcmBackupHtpDescriptor; 141 Qnn_MemHandle_t vtcmBackupMemHandle1 = nullptr; 142 Qnn_MemHandle_t vtcmBackupMemHandle2 = nullptr; 143 144 // Register buffers for context1 145 Qnn_MemDescriptor_t memDescriptors[2] = {spillFillMemDescriptor, vtcmBackupMemDescriptor}; 146 Qnn_MemHandle_t memHandles1[2] = {spillFillMemHandle1, vtcmBackupMemHandle1}; 147 Qnn_MemHandle_t memHandles2[2] = {spillFillMemHandle2, vtcmBackupMemHandle2}; 148 149 const Qnn_ErrorHandle_t memRegisterError1 = 150 qnnInterface.memRegister(context1, memDescriptors, 2, memHandles1); 151 if(memRegisterError1 != QNN_SUCCESS) { 152 // handle error 153 } 154 155 // Register same buffers for context2 (sharing with context1) 156 const Qnn_ErrorHandle_t memRegisterError2 = 157 qnnInterface.memRegister(context2, memDescriptors, 2, memHandles2); 158 if(memRegisterError2 != QNN_SUCCESS) { 159 // handle error 160 } 161 162 // Step 5. Finalize contexts 163 const Qnn_ErrorHandle_t contextFinalizeError1 = 164 qnnInterface.contextFinalize(context1, profileBackendHandle); 165 if(contextFinalizeError1 != QNN_SUCCESS) { 166 // handle error 167 } 168 169 const Qnn_ErrorHandle_t contextFinalizeError2 = 170 qnnInterface.contextFinalize(context2, profileBackendHandle); 171 if(contextFinalizeError2 != QNN_SUCCESS) { 172 // handle error 173 } 174 175 // Obtain and save graph handles for each graph present in the context1 176 // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details 177 for (size_t graphIdx = 0; graphIdx < m_graphsCount; graphIdx++) { 178 if (QNN_SUCCESS != m_qnnFunctionPointers.qnnInterface.graphRetrieve( 179 context1, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) { 180 // handle error 181 } 182 } 183 184 // Execute graphs ... 
185 // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code 186 187 // Free contexts 188 // See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code for details 189 if (QNN_CONTEXT_NO_ERROR != qnnInterface.contextFree(context1, profileBackendHandle)) { 190 // handle error 191 } 192 if (QNN_CONTEXT_NO_ERROR != qnnInterface.contextFree(context2, profileBackendHandle)) { 193 // handle error 194 } 195 196 // Free DSP buffers 197 free(spillFillQurtVAPtr); 198 free(vtcmBackupQurtVAPtr); Copy to clipboard Last Published: Jun 04, 2026 [Previous Topic Tutorial: Turning on various optimization on HTP and HTP MCP Backends](https://docs.qualcomm.com/bundle/publicresource/80-63442-10/topics/htp_auto_optimization.md) [Next Topic Tutorial: Executing a shallow model using custom op package](https://docs.qualcomm.com/bundle/publicresource/80-63442-10/topics/tutorial1.md)