# Develop AI applications using the QAIRT C++ API

The Qualcomm AI Runtime (QAIRT) SDK provides C++ APIs for application development and ships sample applications for both Qualcomm AI Engine Direct (QNN) and the Qualcomm Neural Processing Engine SDK (SNPE). These samples help you get started with your own application. This topic explains how to build and run the sample, walks through its source code, and shows the complete workflow for running a model with the QNN or SNPE APIs.

## Build and run the QNN sample application

`qnn-sample-app` is located in `${QNN_SDK_ROOT}/examples/QNN/SampleApp`, where `QNN_SDK_ROOT` is the path to the extracted QNN SDK.

### Set up the QAIRT SDK

To set up the toolchain for the QNN sample application, do the following:

1. [Download the Qualcomm AI Runtime SDK](https://softwarecenter.qualcomm.com/api/download/software/sdks/Qualcomm_AI_Runtime_Community/All/2.41.0.251128/v2.41.0.251128.zip).
2. Extract the SDK.

       unzip v2.41.0.251128.zip
       cd qairt/2.41.0.251128
       export QNN_SDK_ROOT=`pwd`

3. Install the eSDK. Follow the instructions in [Qualcomm IM SDK quick start](https://docs.qualcomm.com/doc/80-70022-51/topic/install-sdk.html#section-b5c-z3k-5bc) to install the eSDK, which contains the required cross-compiler toolchain.
   - For devices running Yocto Scarthgap, the libraries are compiled with GCC-11.2.
   - Set the `ESDK_PATH` environment variable to the eSDK installation path. The following steps compile against this path (`/path/to/extracted/toolchain`).

         export ESDK_PATH="/path/to/extracted/toolchain"

### Build the QNN sample application

Follow these steps to build the QNN sample application.

1. Go to the sample application directory.

       cd ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/

2. Set the environment variable for the GCC toolchain.

       export QNN_AARCH64_LINUX_OE_GCC_112=$ESDK_PATH

3. Build the application.

       make CXX="$ESDK_PATH/tmp/sysroots/x86_64/usr/bin/aarch64-qcom-linux/aarch64-qcom-linux-g++ --sysroot=$ESDK_PATH/tmp/sysroots/qcs6490-rb3gen2-vision-kit/" all_linux_oe_aarch64_gcc112

   This step creates two folders:
   - `bin`: Contains the `qnn-sample-app` binary for each platform, each in its own directory.
   - `obj`: Contains all object files used while building and linking the executable.

### Run the QNN sample application on (Yocto-based) Linux

The built `qnn-sample-app` executable can run a model with any QNN backend. Devices based on Yocto Scarthgap use the backends built for `aarch64-oe-linux-gcc11.2`.

1. Push the built binary to the target device.

       scp ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/bin/aarch64-oe-linux-gcc11.2/qnn-sample-app root@[ip-addr]:/etc/apps/qnn-sample-app

   Note: If the `/etc/apps/` directory does not exist on the device, create it first.
2. 
   On the host computer, export the model using [AI Hub](https://docs.qualcomm.com/doc/80-70023-15BT/topic/ai-hub.html). For example, to export the InceptionV3 QNN model, run the following commands:

       pip3 install qai-hub-models
       python -m qai_hub_models.models.inception_v3.export --quantize w8a8 --target-runtime=qnn_context_binary --chipset="qualcomm-qcs6490-proxy" --compile-options="--qairt_version 2.40" --profile-options "--qairt_version 2.40"

   Note: Generate a context binary that matches the SDK version used on the target device.
3. Push the exported InceptionV3 QNN model to the target device.

       scp build/inception_v3_w8a8/inception_v3_w8a8.bin root@[ip-addr]:/etc/apps/

   When prompted for a password, enter oelinux123.
4. From the host computer, log in to the target device over SSH.

       ssh root@[ip-addr]

5. Generate a dummy input file.

       cd /etc/apps
       python3

   1. Run the following statements in the Python interpreter.

          import numpy as np
          ((np.random.random((1,3,224,224)).astype(np.float32))).tofile("input.raw")

6. Create `input_list.txt`.

       echo "input.raw" > /etc/apps/input_list.txt

7. Run the application.

       chmod +x qnn-sample-app
       ./qnn-sample-app --retrieve_context inception_v3_w8a8.bin --backend libQnnHtp.so --input_list input_list.txt --system_library libQnnSystem.so

   Note: Update the model name and the input list to match the model you selected. To view the help text, run:

       ./qnn-sample-app --help

Command-line arguments

- **Required arguments**
  - `--model`: Path to the QNN network model. Mutually exclusive with `--retrieve_context`.
  - `--retrieve_context`: Path to the cached binary from which to load a saved context and execute graphs. Mutually exclusive with `--model`.
  - `--backend`: Path to the QNN backend used to run the model.
  - `--input_list`: Path to the network input list file. For multiple graphs, provide a comma-separated list of input files.
- **Optional arguments**
  - `--debug`: Saves the output of all network layers.
  - `--output_dir`: Directory in which to save the outputs (default: ./output).
  - `--output_data_type`: Output data type (float_only, native_only, or float_and_native).
  - `--input_data_type`: Input data type (float or native).
  - `--op_packages`: Comma-separated list of op packages and interface providers.
  - `--profiling_level`: Profiling level (basic or detailed).
  - `--save_context`: Saves the backend context and graph metadata to a binary file.
  - `--num_inferences`: Number of inferences to run.
  - `--log_level`: Maximum logging level (error, warn, info, verbose).
  - `--system_library`: Path to libQnnSystem.so, which provides the reflection API used while loading a context.
  - `--version`: Prints the QNN SDK version.
  - `--help`: Shows the help message.

## Workflow and API usage

Follow the recommended pattern below to develop a C++ application with the QNN APIs.

1. [Load the prerequisite shared libraries.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#loading-pre-requisite-shared-libraries)
2. [Use the QNN APIs.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#usage-of-qnn-apis)
   1. [Obtain function pointers through the QNN interface.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#use-qnn-interface-to-obtain-function-pointers)
   2. [Set up logging.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#set-up-logging)
   3. [Initialize the backend.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#initialize-backend)
   4. [Initialize profiling.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#initialize-profiling)
   5. [Create the device.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#create-device)
   6. [Register op packages.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#register-op-packages)
   7. [Create the context.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#create-context)
   8. [Prepare the graphs.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#prepare-graphs)
   9. [Finalize the graphs.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#finalize-graphs)
   10. [Save the context into a binary.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#save-context-into-a-binary)
   11. [Load the context from a cached binary.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#load-context-from-a-cached-binary)
   12. [Execute the graphs.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#execute-graphs)
   13. [Free the context.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#free-context)
   14. 
       [Terminate the backend.](https://docs.qualcomm.com/nav/home/sample_app.html?product=1601111740009302#terminate-backend)

### Load the prerequisite shared libraries

The QNN SDK provides several shared libraries for accessing backends, and an application must load the ones it needs to run its network. A network can be created in QNN in one of the following ways:

- Construct the network directly in your application with the QNN APIs.
- Use a QNN converter to produce a shared library for the QNN network.

`qnn-sample-app` uses the shared-library option. The network is produced by one of the QNN converters provided in the SDK and compiled into a shared library with `qnn-model-lib-generator`.

Note: Windows users should replace every `.so` file in the following instructions with the corresponding `.dll` file. For details, see the platform differences documentation.

#### Load the backend

The QNN SDK provides shared libraries for several backends, including CPU, GPU, HTP, and DSP. Each backend that implements the QNN API exposes all the required symbols, which can be accessed through dynamic loading. Taking a sample backend shared library named *libQnnSampleBackend.so* as an example, it can be loaded dynamically as follows:

    void* libBackendHandle = pal::dynamicloading::dlOpen(
        "libQnnSampleBackend.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);
    if (nullptr == libBackendHandle) {
      QNN_ERROR("Unable to load backend. pal::dynamicloading::dlError(): %s",
                pal::dynamicloading::dlError());
      return StatusCode::FAIL_LOAD_BACKEND;
    }

To load a model as a shared library, take a sample model shared library named *libQnnSampleModel.so* as an example. It can be loaded dynamically as follows:

    void* libModelHandle = pal::dynamicloading::dlOpen(
        "libQnnSampleModel.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);
    if (nullptr == libModelHandle) {
      QNN_ERROR("Unable to load model. pal::dynamicloading::dlError(): %s",
                pal::dynamicloading::dlError());
      return StatusCode::FAIL_LOAD_MODEL;
    }

If you choose to create the context from a cached binary and execute its graphs, the application can use the QnnSystem API to retrieve the metadata associated with the context. The QnnSystem API is accessed by loading the *libQnnSystem.so* shared library, as follows:

    void* systemLibraryHandle = pal::dynamicloading::dlOpen(
        "libQnnSystem.so", pal::dynamicloading::DL_NOW | pal::dynamicloading::DL_LOCAL);
    if (nullptr == systemLibraryHandle) {
      QNN_ERROR("Unable to load system library. pal::dynamicloading::dlError(): %s",
                pal::dynamicloading::dlError());
      return StatusCode::FAIL_LOAD_SYSTEM_LIB;
    }

#### Resolve symbols in the shared libraries

After the shared libraries are loaded, resolve all the symbols needed to access the QNN APIs. The following snippet shows a template for resolving symbols in a shared library:

    // A generic function to resolve symbols in a library
    template <class T>
    static inline T resolveSymbol(void* libHandle, const char* symName) {
      T ptr = (T)pal::dynamicloading::dlSym(libHandle, symName);
      if (ptr == nullptr) {
        QNN_ERROR("Unable to access symbol [%s]. pal::dynamicloading::dlError(): %s",
                  symName,
                  pal::dynamicloading::dlError());
      }
      return ptr;
    }

    // Template for resolving a function of type SampleFnHandleType_t
    typedef ReturnType_t (*SampleFnHandleType_t)(FunctionParameterTypes_t ...);
    SampleFnHandleType_t sampleFnHandle = nullptr;
    sampleFnHandle = resolveSymbol<SampleFnHandleType_t>(libBackendHandle, "QnnSample_API");
    if (nullptr == sampleFnHandle) {
      // Error code indicating failure in symbol resolution
      return StatusCode::FAIL_SYM_FUNCTION;
    }

The following snippet shows an example of resolving an actual QNN API:

    /* Resolve the symbol for
       Qnn_ErrorHandle_t QnnInterface_getProviders(const QnnInterface_t*** providerList,
                                                   uint32_t* numProviders) API */
    typedef Qnn_ErrorHandle_t (*QnnInterfaceGetProvidersFn_t)(const QnnInterface_t*** providerList,
                                                              uint32_t* numProviders);
    QnnInterfaceGetProvidersFn_t getInterfaceProviders{nullptr};
    getInterfaceProviders =
        resolveSymbol<QnnInterfaceGetProvidersFn_t>(libBackendHandle, "QnnInterface_getProviders");
    if (nullptr == getInterfaceProviders) {
      return StatusCode::FAIL_SYM_FUNCTION;
    }

In the *qnn-sample-app* source code, all required symbols are resolved and stored in a structure of type QnnFunctionPointers, shown below:

    typedef struct QnnFunctionPointers {
      // APIs from model output from converters
      // QnnModel_composeGraphs
      ComposeGraphsFnHandleType_t composeGraphsFnHandle;
      // QnnModel_freeGraphsInfo
      FreeGraphInfoFnHandleType_t freeGraphInfoFnHandle;
      // QNN Interface function table containing pointers to all necessary QNN APIs
      // in a backend
      QNN_INTERFACE_VER_TYPE qnnInterface;
      // QNN System Interface function table containing pointers to all QNN System APIs
      QNN_SYSTEM_INTERFACE_VER_TYPE qnnSystemInterface;
    } QnnFunctionPointers;

This structure is defined in ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/src/SampleApp.hpp. The rest of this tutorial assumes a variable named *m_qnnFunctionPointers* of type *QnnFunctionPointers* that holds valid function pointers.

### Use the QNN APIs

This section demonstrates how to use the QNN APIs in a client application.

#### Obtain function pointers through the QNN interface

The QNN interface mechanism returns a table of function pointers to the QNN APIs in a backend, so you do not have to resolve the symbol for each API manually. The QNN interface is used as follows:

    QnnInterface_t** interfaceProviders{nullptr};
    uint32_t numProviders{0};
    // Query for all available interfaces
    if (QNN_SUCCESS !=
        getInterfaceProviders((const QnnInterface_t***)&interfaceProviders, &numProviders)) {
      QNN_ERROR("Failed to get interface providers.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    // Check for validity of returned interfaces
    if (nullptr == interfaceProviders) {
      QNN_ERROR("Failed to get interface providers: null interface providers received.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    if (0 == numProviders) {
      QNN_ERROR("Failed to get interface providers: 0 interface providers.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    bool foundValidInterface{false};
    // Loop through all available interface providers and pick the one that suits the current API
    // version
    for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
      if (QNN_API_VERSION_MAJOR == interfaceProviders[pIdx]->apiVersion.coreApiVersion.major &&
          QNN_API_VERSION_MINOR <= interfaceProviders[pIdx]->apiVersion.coreApiVersion.minor) {
        foundValidInterface = true;
        m_qnnFunctionPointers.qnnInterface = interfaceProviders[pIdx]->QNN_INTERFACE_VER_NAME;
        break;
      }
    }
    if (!foundValidInterface) {
      QNN_ERROR("Unable to find a valid interface.");
      libBackendHandle = nullptr;
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }

The QNN system interface resolves all symbols related to the QNN System API, as follows:

    typedef Qnn_ErrorHandle_t (*QnnSystemInterfaceGetProvidersFn_t)(
        const QnnSystemInterface_t*** providerList, uint32_t* numProviders);
    QnnSystemInterfaceGetProvidersFn_t
        getSystemInterfaceProviders{nullptr};
    getSystemInterfaceProviders = resolveSymbol<QnnSystemInterfaceGetProvidersFn_t>(
        systemLibraryHandle, "QnnSystemInterface_getProviders");
    if (nullptr == getSystemInterfaceProviders) {
      return StatusCode::FAIL_SYM_FUNCTION;
    }
    QnnSystemInterface_t** systemInterfaceProviders{nullptr};
    uint32_t numProviders{0};
    if (QNN_SUCCESS != getSystemInterfaceProviders(
                           (const QnnSystemInterface_t***)&systemInterfaceProviders, &numProviders)) {
      QNN_ERROR("Failed to get system interface providers.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    if (nullptr == systemInterfaceProviders) {
      QNN_ERROR("Failed to get system interface providers: null interface providers received.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    if (0 == numProviders) {
      QNN_ERROR("Failed to get interface providers: 0 interface providers.");
      return StatusCode::FAIL_GET_INTERFACE_PROVIDERS;
    }
    bool foundValidSystemInterface{false};
    for (size_t pIdx = 0; pIdx < numProviders; pIdx++) {
      if (QNN_SYSTEM_API_VERSION_MAJOR == systemInterfaceProviders[pIdx]->systemApiVersion.major &&
          QNN_SYSTEM_API_VERSION_MINOR <= systemInterfaceProviders[pIdx]->systemApiVersion.minor) {
        foundValidSystemInterface = true;
        m_qnnFunctionPointers.qnnSystemInterface =
            systemInterfaceProviders[pIdx]->QNN_SYSTEM_INTERFACE_VER_NAME;
        break;
      }
    }

#### Set up logging

Logging can be set up before the backend is initialized, once the backend shared library has been dynamically loaded. To initialize logging, define a callback function of type *QnnLog_Callback_t*. An example definition follows:

    void logStdoutCallback(const char* fmt,
                           QnnLog_Level_t level,
                           uint64_t timestamp,
                           va_list argp) {
      const char* levelStr = "";
      switch (level) {
        case QNN_LOG_LEVEL_ERROR:
          levelStr = " ERROR ";
          break;
        case QNN_LOG_LEVEL_WARN:
          levelStr = "WARNING";
          break;
        case QNN_LOG_LEVEL_INFO:
          levelStr = " INFO  ";
          break;
        case QNN_LOG_LEVEL_DEBUG:
          levelStr = " DEBUG ";
          break;
        case QNN_LOG_LEVEL_VERBOSE:
          levelStr = "VERBOSE";
          break;
        case QNN_LOG_LEVEL_MAX:
          levelStr = "UNKNOWN";
          break;
      }
      // Convert the timestamp to milliseconds for printing
      // (assumes the backend reports it in nanoseconds)
      double ms = (double)timestamp / 1000000.0;
      fprintf(stdout, "%8.1fms [%-7s] ", ms, levelStr);
      vfprintf(stdout, fmt, argp);
      fprintf(stdout, "\n");
    }

This callback is registered with the backend together with a maximum log level. The following example initializes logging with a maximum log level of QNN_LOG_LEVEL_INFO:

    Qnn_LogHandle_t logHandle;
    if (QNN_SUCCESS != m_qnnFunctionPointers.qnnInterface.logCreate(
                           logStdoutCallback, QNN_LOG_LEVEL_INFO, &logHandle)) {
      QNN_ERROR("Unable to initialize logging in the backend.");
      return StatusCode::FAILURE;
    }

#### Initialize the backend

After logging is initialized, initialize the backend as follows:

    Qnn_BackendHandle_t backendHandle;
    const QnnBackend_Config_t* backendConfigs;
    /* Set up any necessary backend configurations */
    if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendCreate(logHandle,
                                                                                 &backendConfigs,
                                                                                 &backendHandle)) {
      QNN_ERROR("Could not initialize backend");
      return StatusCode::FAILURE;
    }

#### Initialize profiling

If profiling is needed, set up a profile handle after the backend is initialized. The handle can then be passed to any API that supports profiling. A profile handle with the basic profiling level is created in the backend as follows:

    Qnn_ProfileHandle_t profileHandle;
    if (QNN_PROFILE_NO_ERROR != m_qnnFunctionPointers.qnnInterface.profileCreate(
                                    backendHandle, QNN_PROFILE_LEVEL_BASIC, &profileHandle)) {
      QNN_WARN("Unable to create profile handle in the backend.");
      return StatusCode::FAILURE;
    }

#### Create the device

Create the device as follows:

    Qnn_DeviceHandle_t deviceHandle{nullptr};
    const QnnDevice_Config_t* devConfigArray[] = {&devConfig, nullptr};
    Qnn_ErrorHandle_t ret =
        m_qnnFunctionPointers.qnnInterface.deviceCreate(logHandle, devConfigArray, &deviceHandle);
    if (QNN_SUCCESS != ret) {
      QNN_ERROR("Failed to create device: %u", (unsigned int)ret);
      return StatusCode::FAILURE;
    }

Set up `devConfig` as defined by the QNN HTP backend API.

#### Register op packages

An op package is a way to supply a library of operations to the backend. Op packages are registered as follows:

    uint32_t opPackageCount;
    char* opPackagePath[opPackageCount];
    char* opPackageInterfaceProvider[opPackageCount];
    /* Set up required op package paths and interface providers as necessary */
    for (uint32_t idx = 0; idx < opPackageCount; idx++) {
      if (QNN_BACKEND_NO_ERROR !=
          m_qnnFunctionPointers.qnnInterface.backendRegisterOpPackage(backendHandle,
                                                                      opPackagePath[idx],
                                                                      opPackageInterfaceProvider[idx])) {
        QNN_ERROR("Could not register Op Package: %s and interface provider: %s",
                  opPackagePath[idx],
                  opPackageInterfaceProvider[idx]);
        return StatusCode::FAILURE;
      }
    }

#### Create the context

Create a context in the backend as follows:

    Qnn_ContextHandle_t context;
    Qnn_DeviceHandle_t deviceHandle{nullptr};
    const QnnContext_Config_t* contextConfigs;
    /* Set up any context configs that are necessary */
    if (QNN_CONTEXT_NO_ERROR !=
        m_qnnFunctionPointers.qnnInterface.contextCreate(backendHandle,
                                                         deviceHandle,
                                                         &contextConfigs,
                                                         &context)) {
      QNN_ERROR("Could not create context");
      return StatusCode::FAILURE;
    }

#### Prepare the graphs

*qnn-sample-app* relies on the output of one of the converters to create a QNN network in the backend. *composeGraphsFnHandle* maps to the *QnnModel_composeGraphs* API in the model shared library, which takes a `qnn_wrapper_api::GraphInfo_t***` as one of its parameters. *composeGraphsFnHandle* makes the backend calls needed to create the network and writes everything required to execute the graphs (such as the input and output tensor information for each graph) into the *graphsInfo* structure, as shown in the following code block:

    /* Structure to retrieve information about graphs, like graph name,
       details about input and output tensors present in libQnnSampleModel.so */
    qnn_wrapper_api::GraphInfo_t** graphsInfo;
    // No. of graphs present in libQnnSampleModel.so
    uint32_t graphsCount;
    // true to enable intermediate outputs, false for network outputs only
    bool debug;
    if (qnn_wrapper_api::ModelError_t::MODEL_NO_ERROR !=
        m_qnnFunctionPointers.composeGraphsFnHandle(backendHandle,
                                                    m_qnnFunctionPointers.qnnInterface,
                                                    context,
                                                    &graphsInfo,
                                                    &graphsCount,
                                                    debug)) {
      QNN_ERROR("Failed in composeGraphs()");
      return StatusCode::FAILURE;
    }

At this point, the context contains all the graphs present in *libQnnSampleModel.so*.

#### Finalize the graphs

Finalize the graphs added in the previous step as follows:

    // information about graphs obtained in the previous step
    qnn_wrapper_api::GraphInfo_t** graphsInfo;
    // No. of graphs obtained in the previous step
    uint32_t graphsCount;
    /* A valid profile handle if profiling is desired,
       nullptr if profiling is not needed */
    Qnn_ProfileHandle_t profileHandle;

    for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
      if (QNN_GRAPH_NO_ERROR !=
          m_qnnFunctionPointers.qnnInterface.graphFinalize(
              (*graphsInfo)[graphIdx].graph, profileHandle, nullptr)) {
        return StatusCode::FAILURE;
      }
      /* Extract profiling information if desired and if a valid handle was supplied to finalize
         graphs API */
    }

#### Save the context into a binary

After all graphs in the context are finalized, the application can optionally save the context to a binary file for later use. A saved context can be loaded and its graphs executed directly, without finalizing them again, which shortens initialization time when the network is run. Save the context as follows:

    // Get the expected size of the buffer from the backend in which the context can be saved
    if (QNN_CONTEXT_NO_ERROR !=
        m_qnnFunctionPointers.qnnInterface.contextGetBinarySize(context, &requiredBufferSize)) {
      QNN_ERROR("Could not get the required binary size.");
      return StatusCode::FAILURE;
    }

    // Allocate a buffer of the required size
    saveBuffer = (uint8_t*)malloc(requiredBufferSize * sizeof(uint8_t));
    if (nullptr == saveBuffer) {
      QNN_ERROR("Could not allocate buffer to save binary.");
      return StatusCode::FAILURE;
    }

    auto status = StatusCode::SUCCESS;
    uint32_t writtenBufferSize{0};
    // Pass the allocated buffer and obtain a copy of the context binary written into the buffer
    if (QNN_CONTEXT_NO_ERROR !=
        m_qnnFunctionPointers.qnnInterface.contextGetBinary(context,
                                                            reinterpret_cast<void*>(saveBuffer),
                                                            requiredBufferSize,
                                                            &writtenBufferSize)) {
      QNN_ERROR("Could not get binary.");
      status = StatusCode::FAILURE;
    }

    // Check if the supplied buffer size is at least as big as the amount of data written by the backend
    if (requiredBufferSize < writtenBufferSize) {
      QNN_ERROR(
          "Illegal written buffer size [%d] bytes. Cannot exceed allocated memory of [%d] bytes",
          writtenBufferSize,
          requiredBufferSize);
      status = StatusCode::FAILURE;
    }

    // Use caching utility to save metadata along with the binary buffer from the backend
    if (status == StatusCode::SUCCESS &&
        tools::datautil::StatusCode::SUCCESS != tools::datautil::writeBinaryToFile(outputPath,
                                                                                  saveBinaryName + ".bin",
                                                                                  (uint8_t*)saveBuffer,
                                                                                  writtenBufferSize)) {
      QNN_ERROR("Could not serialize to file.");
      status = StatusCode::FAILURE;
    }

#### Load the context from a cached binary

As an alternative to creating the context again, you can load the context that was saved to a binary file in the previous step. The following snippet shows this step:

    auto returnStatus = StatusCode::SUCCESS;
    std::shared_ptr<uint8_t> buffer{nullptr};
    uint32_t graphsCount{0};
    buffer = std::shared_ptr<uint8_t>(new uint8_t[bufferSize], std::default_delete<uint8_t[]>());
    if (!buffer) {
      QNN_ERROR("Failed to allocate memory.");
      return StatusCode::FAILURE;
    }

    if (tools::datautil::StatusCode::SUCCESS !=
        tools::datautil::readBinaryFromFile(
            cachedBinaryPath, reinterpret_cast<uint8_t*>(buffer.get()), bufferSize)) {
      QNN_ERROR("Failed to read binary file.");
      returnStatus = StatusCode::FAILURE;
    }

    /* Create a QnnSystemContext handle to access system context APIs. */
    QnnSystemContext_Handle_t sysCtxHandle{nullptr};
    if (QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextCreate(&sysCtxHandle)) {
      QNN_ERROR("Could not create system handle.");
      returnStatus = StatusCode::FAILURE;
    }

    /* Retrieve metadata from the context binary through QNN System Context API. */
    QnnSystemContext_BinaryInfo_t* binaryInfo{nullptr};
    uint32_t binaryInfoSize{0};
    if (StatusCode::SUCCESS == returnStatus &&
        QNN_SUCCESS != m_qnnFunctionPointers.qnnSystemInterface.systemContextGetBinaryInfo(
                           sysCtxHandle,
                           static_cast<void*>(buffer.get()),
                           bufferSize,
                           &binaryInfo,
                           &binaryInfoSize)) {
      QNN_ERROR("Failed to get context binary info");
      returnStatus = StatusCode::FAILURE;
    }

    qnn_wrapper_api::GraphInfo_t** graphsInfo;
    /* Make a copy of the metadata. */
    if (StatusCode::SUCCESS == returnStatus &&
        !copyMetadataToGraphsInfo(binaryInfo, graphsInfo, graphsCount)) {
      QNN_ERROR("Failed to copy metadata.");
      returnStatus = StatusCode::FAILURE;
    }

    /* Release resources associated with previously created QnnSystemContext handle. */
    m_qnnFunctionPointers.qnnSystemInterface.systemContextFree(sysCtxHandle);
    sysCtxHandle = nullptr;

    /* buffer contains the binary data that was previously obtained from a backend. Pass this
       cached binary data to the backend to recreate the same context. */
    if (StatusCode::SUCCESS == returnStatus &&
        QNN_SUCCESS != m_qnnFunctionPointers.qnnInterface.contextCreateFromBinary(
                           backendHandle,
                           deviceHandle,
                           (const QnnContext_Config_t**)&contextConfig,
                           static_cast<void*>(buffer.get()),
                           bufferSize,
                           &context,
                           profileBackendHandle)) {
      QNN_ERROR("Could not create context from binary.");
      returnStatus = StatusCode::FAILURE;
    }

    // Optionally, extract profiling numbers if desired
    if (ProfilingLevel::OFF != m_profilingLevel) {
      extractBackendProfilingInfo(profileBackendHandle);
    }

    /* Obtain and save graph handles for each graph present in the context based on the saved graph
       names in the metadata */
    if (StatusCode::SUCCESS == returnStatus) {
      for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
        if (QNN_SUCCESS !=
            m_qnnFunctionPointers.qnnInterface.graphRetrieve(
                context, (*graphsInfo)[graphIdx].graphName, &((*graphsInfo)[graphIdx].graph))) {
          QNN_ERROR("Unable to retrieve graph handle for graph Idx: %zu", graphIdx);
          returnStatus = StatusCode::FAILURE;
        }
      }
    }

#### Execute the graphs

After the context is created and its graphs are added and finalized, or after the context is loaded from a binary file, one or more graphs in the context can be executed. Executing a graph involves the following steps:

1. Set up the input and output tensors.
2. Populate the input tensors with input data.
3. Call the backend's execute method.
4. 
   Obtain the outputs and save them.

The following snippet shows these operations:

    // Iterate over the graphs in this context; the break statements below exit this loop
    for (size_t graphIdx = 0; graphIdx < graphsCount; graphIdx++) {
      QNN_DEBUG("Starting execution for graphIdx: %zu", graphIdx);
      Qnn_Tensor_t* inputs  = nullptr;
      Qnn_Tensor_t* outputs = nullptr;
      // IOTensor utility is used to set up input and output tensor structures
      if (iotensor::StatusCode::SUCCESS !=
          ioTensor.setupInputAndOutputTensors(&inputs, &outputs, (*graphsInfo)[graphIdx])) {
        QNN_ERROR("Error in setting up Input and output Tensors for graphIdx: %zu", graphIdx);
        returnStatus = StatusCode::FAILURE;
        break;
      }

      // Grab input raw file paths to read input data
      auto inputFileList = inputFileLists[graphIdx];
      auto graphInfo     = (*graphsInfo)[graphIdx];
      if (!inputFileList.empty()) {
        /* qnn-sample-app reads data based on the batch size until the whole buffer is filled.
           If there isn't sufficient data, it pads the rest with zeroes. */
        size_t totalCount = inputFileList[0].size();
        while (!inputFileList[0].empty()) {
          size_t startIdx = (totalCount - inputFileList[0].size());

          // IOTensor utility is used to populate input tensors with input data
          if (iotensor::StatusCode::SUCCESS !=
              ioTensor.populateInputTensors(
                  graphIdx, inputFileList, inputs, graphInfo, inputDataType)) {
            returnStatus = StatusCode::FAILURE;
          }

          if (StatusCode::SUCCESS == returnStatus) {
            // Execute the graph in the backend with optional profile handle
            QNN_DEBUG("Successfully populated input tensors for graphIdx: %zu", graphIdx);
            Qnn_ErrorHandle_t executeStatus = QNN_GRAPH_NO_ERROR;
            executeStatus = m_qnnFunctionPointers.qnnInterface.graphExecute(graphInfo.graph,
                                                                            inputs,
                                                                            graphInfo.numInputTensors,
                                                                            outputs,
                                                                            graphInfo.numOutputTensors,
                                                                            profileBackendHandle,
                                                                            nullptr);
            if (QNN_GRAPH_NO_ERROR != executeStatus) {
              returnStatus = StatusCode::FAILURE;
            }
            if (StatusCode::SUCCESS == returnStatus) {
              QNN_DEBUG("Successfully executed graphIdx: %zu", graphIdx);
              // IOTensor utility is used to write output tensors to raw files
              if (iotensor::StatusCode::SUCCESS !=
                  ioTensor.writeOutputTensors(graphIdx,
                                              startIdx,
                                              graphInfo.graphName,
                                              outputs,
                                              graphInfo.outputTensors,
                                              graphInfo.numOutputTensors,
                                              outputDataType,
                                              graphsCount,
                                              outputPath)) {
                returnStatus = StatusCode::FAILURE;
              }
            }
          }
          if (StatusCode::SUCCESS != returnStatus) {
            QNN_ERROR("Execution of Graph: %zu failed!", graphIdx);
            break;
          }
        }
      }

      // Clean up all the tensors after execution is completed
      ioTensor.tearDownInputAndOutputTensors(
          inputs, outputs, graphInfo.numInputTensors, graphInfo.numOutputTensors);
      inputs  = nullptr;
      outputs = nullptr;
      if (StatusCode::SUCCESS != returnStatus) {
        break;
      }
    }

IOTensor is a utility provided in the source code at ${QNN_SDK_ROOT}/examples/QNN/SampleApp/SampleApp/src/Utils/IOTensor.cpp. It provides several methods that help with graph execution, which are used in the preceding snippet:

1. *setupInputAndOutputTensors*: Sets up the structures associated with the input and output tensors.
2. *populateInputTensors*: Copies input data into the input tensor structures.
3. 
   *tearDownInputAndOutputTensors*: Cleans up the resources associated with the input and output tensors.

For more details about these APIs, see the IOTensor source code.

#### Free the context

After all execution is complete, free the context as follows:

    if (QNN_CONTEXT_NO_ERROR !=
        m_qnnFunctionPointers.qnnInterface.contextFree(context, profileBackendHandle)) {
      QNN_ERROR("Could not free context");
      return StatusCode::FAILURE;
    }

#### Terminate the backend

Terminate the backend as follows:

    if (QNN_BACKEND_NO_ERROR != m_qnnFunctionPointers.qnnInterface.backendFree(backendHandle)) {
      QNN_ERROR("Could not free backend");
      return StatusCode::FAILURE;
    }

## SNPE sample application

For the C++ API and instructions on running the sample programs with SNPE, see the [Qualcomm AI Runtime SDK documentation](https://docs.qualcomm.com/nav/home/usergroup8.html?product=1601111740009302).

Last Published: Feb 24, 2026
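
As a supplement to the interface-provider lookup step of the QNN workflow, the version-matching rule it applies (equal major version, provider minor version at least the required minor) can be tried out in isolation. The sketch below is only an illustration under stated assumptions: `ApiVersion`, `Provider`, `tableId`, and `selectProvider` are hypothetical stand-ins invented for this example and are not types or functions from the QNN SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the version fields a provider reports.
// This is NOT an SDK type.
struct ApiVersion {
  uint32_t majorVer;
  uint32_t minorVer;
};

// Hypothetical stand-in for an interface provider entry.
struct Provider {
  ApiVersion apiVersion;  // API version implemented by this provider
  int tableId;            // placeholder for the function-pointer table
};

// Returns the first provider whose major version equals the required major
// version and whose minor version is at least the required minor version.
// Returns nullptr when no provider is compatible.
const Provider* selectProvider(const std::vector<Provider>& providers,
                               ApiVersion required) {
  for (const Provider& p : providers) {
    if (p.apiVersion.majorVer == required.majorVer &&
        p.apiVersion.minorVer >= required.minorVer) {
      return &p;
    }
  }
  return nullptr;
}
```

With providers reporting versions 1.9, 2.3, and 2.7, a requirement of 2.5 would select the 2.7 entry, a requirement of 2.1 would select the 2.3 entry (the first compatible one), and a requirement of 3.0 would find nothing. Returning the first match mirrors the `break` in the lookup loop; an application that prefers the newest compatible minor version would need a different policy.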