# PSNPE C Tutorial

## Prerequisites

- The Qualcomm® Neural Processing SDK has been set up following the [Qualcomm® Neural Processing SDK setup](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_setup.html) instructions.
- The [Tutorials Setup](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_setup.html) has been completed.
- The APIs used on this page are described in the [C Tutorial - Build the Sample](https://docs.qualcomm.com/doc/80-63442-10/topic/c_tutorial.html) chapter.

## Introduction

This tutorial demonstrates how to use the PSNPE C APIs to build a C sample application
that executes neural network models with multiple runtimes on the target device.
Although the sample code does not perform any error checking, users are strongly advised
to check for errors when using the PSNPE APIs. In addition, because the sample code is
based on the C API, all relevant handles must be freed explicitly at the end.
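As an illustration of the two recommendations above, the following minimal sketch shows one way to check every returned status and to make sure a handle is freed on every exit path. It deliberately does not call the SDK: `Demo_Create`, `Demo_Delete`, `Demo_Configure`, and `DemoStatus` are hypothetical stand-ins for a create/delete style C API, and `ScopedDemoHandle`, `checkStatus`, and `configureOnce` are illustrative helper names.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for a create/delete style C API; NOT part of the PSNPE API.
using DemoHandle = void*;
enum DemoStatus { DEMO_SUCCESS = 0, DEMO_FAILURE = 1 };

static int g_liveHandles = 0;  // handles created but not yet freed

DemoHandle Demo_Create() {
    ++g_liveHandles;
    return new int(0);
}

DemoStatus Demo_Delete(DemoHandle h) {
    if (h == nullptr) return DEMO_FAILURE;
    delete static_cast<int*>(h);
    --g_liveHandles;
    return DEMO_SUCCESS;
}

DemoStatus Demo_Configure(DemoHandle h, int value) {
    if (h == nullptr || value < 0) return DEMO_FAILURE;
    *static_cast<int*>(h) = value;
    return DEMO_SUCCESS;
}

// Deleter that lets std::unique_ptr free the handle automatically.
struct DemoHandleDeleter {
    void operator()(void* h) const { Demo_Delete(h); }
};
using ScopedDemoHandle = std::unique_ptr<void, DemoHandleDeleter>;

// Turns a non-success status into an exception that names the failing call.
void checkStatus(DemoStatus status, const std::string& what) {
    if (status != DEMO_SUCCESS) throw std::runtime_error(what + " failed");
}

// Returns true when configuration succeeds; the handle is freed either way.
bool configureOnce(int value) {
    ScopedDemoHandle handle(Demo_Create());
    try {
        checkStatus(Demo_Configure(handle.get(), value), "Demo_Configure");
    } catch (const std::runtime_error&) {
        return false;  // ScopedDemoHandle still releases the handle here
    }
    return true;
}

void usageExample() {
    assert(configureOnce(5));    // valid value: configuration succeeds
    assert(!configureOnce(-1));  // invalid value: the failure is reported
    assert(g_liveHandles == 0);  // both handles were freed
}
```

The same pattern could be applied to real handles by substituting the SDK's own create and delete calls, after confirming their exact signatures in the API reference.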

Taking sync mode as an example, a PSNPE-integrated application follows this pattern when using a neural network:

1. [Get Configuration of Available Runtimes](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#get-configuration-of-available-runtimes)
2. [Get Builder Configuration](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#get-builder-configuration)
3. [Build PSNPE Instance](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#build-psnpe-instance)
4. [Load Network Inputs with User Buffer List](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#load-network-inputs-with-user-buffer-list)
5. [Execute the Network & Process Output for Sync Mode](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#execute-the-network-process-output-for-sync-mode)
6. [C Application Example](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#c-application-example)

```cpp
auto runtimeConfigListHandle = Snpe_RuntimeConfigList_Create();
auto bcHandle = Snpe_BuildConfig_Create();
buildStatus = (Snpe_PSNPE_Build(psnpeHandle, bcHandle) == SNPE_SUCCESS);
exeStatus = SNPE_SUCCESS == Snpe_PSNPE_Execute(psnpeHandle, inputMapList2, outputMapList2);
```

PSNPE uses sync mode by default. To use an async mode instead, refer to
[BuildConfig for Async Mode](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#buildconfig-for-async-mode).
In output async mode, loading input data and executing PSNPE work much as they do in sync mode,
but output data is retrieved through an outputCallback function, as described in
[Callback for OutputAsync Mode](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#callback-for-outputasync-mode).
In input/output async mode, both loading input data and retrieving output data require
callback functions, as described in
[Execution and Callback for InputOutputAsync Mode](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_c_tutorial.html#execution-and-callback-for-inputoutputasync-mode).
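The relationship between the three modes and the callbacks they need can be summarized in a few lines of code. This is only a sketch of the rule stated above: `TransmissionMode`, `RequiredCallbacks`, and `requiredCallbacks` are hypothetical names introduced for illustration and are not SDK types or functions.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical enum mirroring the three modes named in the text; not the SDK's type.
enum class TransmissionMode { Sync, OutputAsync, InputOutputAsync };

// Which callbacks the application has to provide for a given mode.
struct RequiredCallbacks {
    bool inputCallback;
    bool outputCallback;
};

RequiredCallbacks requiredCallbacks(TransmissionMode mode) {
    switch (mode) {
        case TransmissionMode::Sync:             return {false, false};
        case TransmissionMode::OutputAsync:      return {false, true};
        case TransmissionMode::InputOutputAsync: return {true, true};
    }
    throw std::logic_error("unknown transmission mode");
}

void usageExample() {
    // Sync mode needs no callbacks at all.
    assert(!requiredCallbacks(TransmissionMode::Sync).outputCallback);
    // Output async mode only needs a callback on the output side.
    assert(!requiredCallbacks(TransmissionMode::OutputAsync).inputCallback);
    assert(requiredCallbacks(TransmissionMode::OutputAsync).outputCallback);
    // Input/output async mode needs callbacks on both sides.
    assert(requiredCallbacks(TransmissionMode::InputOutputAsync).inputCallback);
}
```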

The sections below describe how to implement each step described above.

## Get Configuration of Available Runtimes

The code excerpt below illustrates how to set the config for each available runtime with the given parameters.
Multiple instances of the same runtime can be created by adding multiple runtime config handles to the
runtime config list. Multiple instances, even of the same runtime, create multiple worker threads that queue
work for execution, which improves throughput.

```cpp
auto runtimeConfigListHandle = Snpe_RuntimeConfigList_Create();
for (size_t j = 0; j < numRequestedInstances; j++)
{
    auto runtimeConfigHandle = Snpe_RuntimeConfig_Create();
    Snpe_RuntimeConfig_SetRuntimeList(runtimeConfigHandle, RuntimesListVector[j]);
    Snpe_RuntimeConfig_SetPerformanceProfile(runtimeConfigHandle, PerfProfile[j]);
    Snpe_RuntimeConfigList_PushBack(runtimeConfigListHandle, runtimeConfigHandle);
    Snpe_RuntimeConfig_Delete(runtimeConfigHandle);
    numCreatedInstances++;
}
```
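The loop above indexes the per-instance settings with `j`, so each settings container needs at least `numRequestedInstances` entries. The sketch below shows one way such a precondition could be checked before any handle is created. It uses plain data instead of SDK handles: `InstanceConfig` and `makeInstanceConfigs` are hypothetical names, and the string values are placeholders rather than SDK enumerators.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical plain-data stand-in for one runtime config; not an SDK type.
struct InstanceConfig {
    std::string runtime;      // placeholder value such as "dsp"
    std::string perfProfile;  // placeholder value such as "burst"
};

// Pairs up the j-th runtime with the j-th performance profile, refusing to
// read past the end of either input.
std::vector<InstanceConfig> makeInstanceConfigs(
        const std::vector<std::string>& runtimes,
        const std::vector<std::string>& perfProfiles,
        std::size_t numRequestedInstances) {
    if (runtimes.size() < numRequestedInstances ||
        perfProfiles.size() < numRequestedInstances) {
        throw std::invalid_argument(
            "fewer per-instance settings than requested instances");
    }
    std::vector<InstanceConfig> configs;
    configs.reserve(numRequestedInstances);
    for (std::size_t j = 0; j < numRequestedInstances; ++j) {
        configs.push_back({runtimes[j], perfProfiles[j]});
    }
    return configs;
}

void usageExample() {
    // Two instances of the same runtime, as described in the text.
    const auto configs = makeInstanceConfigs({"dsp", "dsp"}, {"burst", "burst"}, 2);
    assert(configs.size() == 2);
    assert(configs[1].runtime == "dsp");
    assert(configs[1].perfProfile == "burst");
}
```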

## Get Builder Configuration

The code excerpt below illustrates how to configure the PSNPE builder with the given parameters, including the DLC, the runtime config list, the output layers, and the transmission mode.

```cpp
auto containerHandle = Snpe_DlContainer_Open(ContainerPath.c_str());
auto bcHandle = Snpe_BuildConfig_Create();
Snpe_BuildConfig_SetContainer(bcHandle, containerHandle);
Snpe_BuildConfig_SetRuntimeConfigList(bcHandle, runtimeConfigListHandle);
Snpe_BuildConfig_SetOutputBufferNames(bcHandle, outputLayers);
Snpe_BuildConfig_SetInputOutputTransmissionMode(bcHandle, static_cast<Snpe_PSNPE_InputOutputTransmissionMode_t>(inputOutputTransmissionMode));
Snpe_BuildConfig_SetEncode(bcHandle, input_encode[0], input_encode[1]);
Snpe_BuildConfig_SetEnableInitCache(bcHandle, usingInitCache);
Snpe_BuildConfig_SetProfilingLevel(bcHandle, profilingLevel);
Snpe_BuildConfig_SetPlatformOptions(bcHandle, platformOptions.c_str());
Snpe_BuildConfig_SetOutputTensors(bcHandle, outputTensors);
```

## Build PSNPE Instance

The following code demonstrates how to build the PSNPE instance that will be used to execute the network.

```cpp
buildStatus = (Snpe_PSNPE_Build(psnpeHandle, bcHandle) == SNPE_SUCCESS);
```

## Load Network Inputs with User Buffer List

This input loading method is used in synchronous mode and output asynchronous mode.
It is similar to the method that the Qualcomm® Neural Processing SDK uses to create inputs
and outputs from user-backed buffers.

```cpp
std::vector<std::unordered_map <std::string, std::vector<uint8_t>>> outputBuffersVec(nums);
std::vector<std::unordered_map <std::string, std::vector<uint8_t>>> inputBuffersVec(nums);
std::vector<Snpe_IUserBuffer_Handle_t> snpeUserBackedInputBuffers;
std::vector<Snpe_IUserBuffer_Handle_t> snpeUserBackedOutputBuffers;
Snpe_UserBufferList_Handle_t inputMapList  = Snpe_UserBufferList_CreateSize(BufferNum);
Snpe_UserBufferList_Handle_t outputMapList = Snpe_UserBufferList_CreateSize(BufferNum);
if(inputOutputTransmissionMode != zdl::PSNPE::InputOutputTransmissionMode::inputOutputAsync)
{
   for (size_t i = 0; i < inputs.size(); ++i) {
      for (size_t j = 0; j < Snpe_StringList_Size(inputTensorNamesList[0]); ++j) {
         const char* name = Snpe_StringList_At(inputTensorNamesList[0], j);
         uint8_t bufferBitWidth = bitWidthMap[bufferDataTypeMap[name]];
         uint8_t nativeBitWidth = usingNativeInputDataType ? bitWidthMap[nativeDataTypeMap[name]]: 32;
         std::string nativeDataType = usingNativeInputDataType ? nativeDataTypeMap[name] : "float32";
         if(bufferDataTypeMap[name] == "float16" || bufferDataTypeMap[name] == "float32"){
            if(!LoadInputBufferMapsFloatN(inputs[i][j], name, {psnpeHandle, true},
                                        Snpe_UserBufferList_At_Ref(inputMapList, i),
                                        snpeUserBackedInputBuffers, inputBuffersVec[i],numFilesCopied, batchSize, dynamicQuantization,
                                        bufferBitWidth,10, rpcMemAllocFnHandle, false, ionBufferMapHandle,
                                        usingNativeInputDataType, nativeDataType, nativeBitWidth))
            {
               return EXIT_FAILURE;
            }
         }
      }
      Snpe_StringList_Handle_t outputBufferNamesHandle = Snpe_PSNPE_GetOutputTensorNames(psnpeHandle);
      for (size_t j = 0; j < Snpe_StringList_Size(outputBufferNamesHandle); ++j) {
         const char* name = Snpe_StringList_At(outputBufferNamesHandle, j);
         if(bufferDataTypeMap.find(name) == bufferDataTypeMap.end()){
            std::cerr << "DataType not specified for buffer " << name << std::endl;
         }
         uint8_t bitWidth = bitWidthMap[bufferDataTypeMap[name]];
         if(bufferDataTypeMap[name] == "float16" || bufferDataTypeMap[name] == "float32"){
            PopulateOutputBufferMapsFloatN({psnpeHandle, true}, name,
                                          Snpe_UserBufferList_At_Ref(outputMapList, i),
                                          snpeUserBackedOutputBuffers, outputBuffersVec[i], bitWidth, 10,
                                          rpcMemAllocFnHandle, usingIonBuffer, ionBufferMapHandle);
         }
      }
   }
}
```

## Execute the Network & Process Output for Sync Mode

The following code uses the native API to execute the network in synchronous mode. For the saveOutput function, refer to the [PSNPE C++ Tutorial](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_psnpe_cplus_plus_tutorial.html).

```cpp
exeStatus = SNPE_SUCCESS == Snpe_PSNPE_Execute(psnpeHandle, inputMapList, outputMapList);
for (size_t i = 0; i < inputs.size(); i++) {
   saveOutput(Snpe_UserBufferList_At_Ref(outputMapList, i), outputBuffersVec[i], ionBufferMapReg, OutputDir, i * batchSize,  batchSize, false);
}
```

## BuildConfig for Async Mode

To run in outputAsync mode or inputOutputAsync mode, you need to set the callback functions in the build config.

```cpp
if (inputOutputTransmissionMode == SNPE_PSNPE_INPUTOUTPUTTRANSMISSIONMODE_OUTPUTASYNC) {
   Snpe_BuildConfig_SetOutputThreadNumbers(bcHandle, outputNum);
   Snpe_BuildConfig_SetOutputCallback(bcHandle, OCallback);
}
if (inputOutputTransmissionMode == SNPE_PSNPE_INPUTOUTPUTTRANSMISSIONMODE_INPUTOUTPUTASYNC) {
   Snpe_BuildConfig_SetInputThreadNumbers(bcHandle, inputNum);
   Snpe_BuildConfig_SetOutputThreadNumbers(bcHandle, outputNum);
   Snpe_BuildConfig_SetInputOutputCallback(bcHandle, IOCallback);
   Snpe_BuildConfig_SetInputOutputInputCallback(bcHandle, inputCallback);
}
```

## Callback for OutputAsync Mode

Output asynchronous mode provides real-time output by invoking a callback function.

```cpp
void OCallback(Snpe_PSNPE_OutputAsyncCallbackParam_Handle_t oacpHandle) {
   // Report the data index of any execution that did not succeed.
   if(!Snpe_PSNPE_OutputAsyncCallbackParam_GetExecuteStatus(oacpHandle)) {
      std::cerr << "execute failed, index: " << Snpe_PSNPE_OutputAsyncCallbackParam_GetDataIdx(oacpHandle) << std::endl;
   }
}
```
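An application will often want to remember which executions failed rather than only print them. The sketch below shows one possible approach, assuming the callback may be invoked from more than one output thread (the build config accepts an output thread count, but the threading guarantees are not stated here, so this is an assumption). `FailureLog` and `onOutputReady` are hypothetical names, and the status and index are passed in as plain values instead of being read from the callback parameter handle.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Collects the indices of failed executions; safe to call from several threads.
class FailureLog {
public:
    void record(std::size_t dataIdx) {
        std::lock_guard<std::mutex> lock(mutex_);
        failed_.push_back(dataIdx);
    }

    // Returns a copy so the caller can inspect it without holding the lock.
    std::vector<std::size_t> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return failed_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::size_t> failed_;
};

// Hypothetical callback body. In a real callback, executeStatus and dataIdx
// would be obtained from the callback parameter handle.
void onOutputReady(FailureLog& log, bool executeStatus, std::size_t dataIdx) {
    if (!executeStatus) {
        log.record(dataIdx);
    }
}

void usageExample() {
    FailureLog log;
    onOutputReady(log, true, 0);   // success: nothing recorded
    onOutputReady(log, false, 3);  // failure: index 3 recorded
    onOutputReady(log, false, 7);  // failure: index 7 recorded
    const std::vector<std::size_t> expected = {3, 7};
    assert(log.snapshot() == expected);
}
```

Because the callback shown above takes only the parameter handle, a real callback would have to reach the shared log some other way, for example through a global object.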

## Execution and Callback for InputOutputAsync Mode

Asynchronous execution can provide output results in real time, whereas synchronous mode provides the outputs only after execution has finished.

```cpp
for (size_t i = 0; i < inputs.size(); ++i) {
    std::vector<std::string> filePaths;
    std::vector<std::queue<std::string>> temp = inputs[i];
    for (size_t j = 0; j < temp.size(); j++) {
        while (temp[j].size() != 0) {
            filePaths.push_back(temp[j].front());
            temp[j].pop();
        }
        numLines++;
        Snpe_StringList_Handle_t filePathsHandle = toStringList(filePaths);
        exeStatus = SNPE_SUCCESS == Snpe_PSNPE_ExecuteInputOutputAsync(psnpeHandle, filePathsHandle, i, usingTf8UserBuffer, usingTf8UserBuffer);
    }
}

// In input/output asynchronous mode, input data is loaded through a callback function with a TF8 vector.
Snpe_ApplicationBufferMap_Handle_t inputCallback(Snpe_StringList_Handle_t inputs, Snpe_StringList_Handle_t inputNames) {
  Snpe_ApplicationBufferMap_Handle_t inputMap = Snpe_ApplicationBufferMap_Create();
  for (size_t j = 0; j < Snpe_StringList_Size(inputNames); j++) {
    std::vector<uint8_t> loadVector;
    ...  //load input data
    Snpe_ApplicationBufferMap_Add(inputMap, Snpe_StringList_At(inputNames, j), loadVector.data(), loadVector.size());
  }
  return inputMap;
}

// In input/output asynchronous mode, the index and data of the output can be obtained through a callback function.
void IOCallback(Snpe_PSNPE_InputOutputAsyncCallbackParam_Handle_t ioacpHandle)
{
   Snpe_StringList_Handle_t names = Snpe_PSNPE_InputOutputAsyncCallbackParam_GetUserBufferNames(ioacpHandle);
   std::vector<std::pair<const char*, Snpe_UserBufferData_t>> vec;
   const auto end = Snpe_StringList_End(names);
   for(auto it = Snpe_StringList_Begin(names); it != end; ++it){
      vec.emplace_back(*it, Snpe_PSNPE_InputOutputAsyncCallbackParam_GetUserBuffer(ioacpHandle, *it));
   }
   saveOutput(vec, OutputDir, Snpe_PSNPE_InputOutputAsyncCallbackParam_GetDataIdx(ioacpHandle));
}

// The code below shows part of the saveOutput function.
void saveOutput(const std::vector<std::pair<const char*, Snpe_UserBufferData_t>>& applicationOutputBuffers, const std::string& outputDir, int num){
  std::for_each(applicationOutputBuffers.begin(),
                applicationOutputBuffers.end(),
                [&](std::pair<std::string, Snpe_UserBufferData_t> a) {
                  std::ostringstream path;
                  path << outputDir << "/"
                       << "Result_" << num << "/" << pal::FileOp::toLegalFilename(a.first) << ".raw";
                  std::string outputPath = path.str();
                  std::string::size_type pos = outputPath.find(":");
                  if (pos != std::string::npos) outputPath = outputPath.replace(pos, 1, "_");
                  SaveUserBuffer(outputPath, a.second.data, a.second.size);
                });
}
```
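The output path layout used by saveOutput, `<outputDir>/Result_<num>/<buffer name>.raw`, can be tried out in isolation. The following is a simplified sketch rather than the sample's own logic: `makeResultPath` is a hypothetical helper, it replaces every `:` in the buffer name (the excerpt above replaces only the first `:` found in the whole path), and it does not reproduce `pal::FileOp::toLegalFilename`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Builds "<outputDir>/Result_<num>/<bufferName>.raw", replacing every ':' in
// the buffer name with '_' so the name is usable as a file name.
std::string makeResultPath(const std::string& outputDir, int num, std::string bufferName) {
    std::replace(bufferName.begin(), bufferName.end(), ':', '_');
    return outputDir + "/Result_" + std::to_string(num) + "/" + bufferName + ".raw";
}

void usageExample() {
    // A buffer name containing ':' is mapped to a name with '_' instead.
    assert(makeResultPath("output", 3, "softmax:0") == "output/Result_3/softmax_0.raw");
    // Names without ':' pass through unchanged.
    assert(makeResultPath("output", 0, "logits") == "output/Result_0/logits.raw");
}
```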

## C Application Example

The C application integrated with PSNPE in this tutorial is called [snpe-parallel-run](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#snpe-parallel-run).
It is a command line executable that executes a DLC model using the Qualcomm® Neural Processing SDK APIs.
Its usage on an Android target is the same as that of the snpe-net-run example from [Running the Inception v3 Model](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_inceptionv3.html):

1. Push model data to Android target.
2. Select target architecture.
3. Push binaries to target.
4. Set up environment variables.

```shell
adb shell
export ADSP_LIBRARY_PATH="/data/local/tmp/snpeexample/dsp/lib;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/snpeexample/aarch64-android/lib
export PATH=$PATH:/data/local/tmp/snpeexample/aarch64-android/bin/
cd /data/local/tmp/inception_v3
snpe-parallel-run --container inception_v3_quantized.dlc --input_list target_raw_list.txt --use_dsp --perf_profile burst --cpu_fallback false --use_dsp --perf_profile burst --cpu_fallback false --runtime_mode output_async
exit
```

Last Published: Jun 04, 2026
