# Tutorial: GPU Backend on SA-series — LV (GVM or PVM)

## Execution on LV (GVM or PVM)

First, set up the toolchain path:

    $ export QNN_AARCH64_OE_LINUX_GCC_93=<PATH-TO-AARCH64-OE-LINUX-GCC-93-Toolchain>
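
The commands that follow rely on both `QNN_SDK_ROOT` and the toolchain variable, so it may help to confirm they are set before building. The following is a minimal sketch; `check_qnn_env` is a hypothetical helper name and is not part of the QNN SDK:

```shell
# Hypothetical helper (not part of the QNN SDK): report unset variables.
check_qnn_env() {
    status=0
    for name in QNN_SDK_ROOT QNN_AARCH64_OE_LINUX_GCC_93; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_qnn_env || exit 1` at the top of a build script stops it before any path expands to an empty string.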

When building your model with `qnn-model-lib-generator`, pass the additional argument `-t aarch64-oe-linux-gcc9.3`:

    # The -o output directory can be any path.
    $ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator \
        -c ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.cpp \
        -b ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.bin \
        -o ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs \
        -t aarch64-oe-linux-gcc9.3

This produces the following artifact:

    ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-oe-linux-gcc9.3/libInception_v3.so
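
The artifact path combines the `-o` directory, the `-t` target, and the model source name. The sketch below derives that path so a script can check that the build succeeded. `expected_model_lib` is a hypothetical helper, and the `lib<model>.so` naming is an assumption inferred from the single example in this tutorial:

```shell
# Hypothetical helper: derive the library path from the -o, -t, and -c arguments.
# Assumes the lib<model>.so naming seen in the InceptionV3 example.
expected_model_lib() {
    out_dir=$1
    target=$2
    model_cpp=$3
    model_name=$(basename "$model_cpp" .cpp)
    printf '%s/%s/lib%s.so\n' "$out_dir" "$target" "$model_name"
}

# Example use after running qnn-model-lib-generator:
#   lib=$(expected_model_lib "${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs" \
#         aarch64-oe-linux-gcc9.3 Inception_v3.cpp)
#   [ -f "$lib" ] || echo "model library not found: $lib" >&2
```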

Next, run *adb root && adb remount* for your target, then *adb shell* into the target terminal and create the directories:

    $ mkdir -p /data/local/tmp/lib
    $ mkdir -p /data/local/tmp/bin

From the host, push the necessary libraries and binaries to the device:

    $ adb push ${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc9.3/libQnnGpu.so /data/local/tmp/lib
    $ adb push ${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc9.3/libQnnGpuNetRunExtensions.so /data/local/tmp/lib
    $ adb push ${QNN_SDK_ROOT}/bin/aarch64-oe-linux-gcc9.3/qnn-net-run /data/local/tmp/bin
    $ adb push ${QNN_SDK_ROOT}/bin/aarch64-oe-linux-gcc9.3/qnn-throughput-net-run /data/local/tmp/bin
    $ adb push ${QNN_SDK_ROOT}/bin/aarch64-oe-linux-gcc9.3/qnn-profile-viewer /data/local/tmp/bin
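
The five push commands follow one pattern, so they can be scripted as a loop. The following is a sketch only: `push_qnn_gpu_files` and the `ADB` override variable are hypothetical names introduced here for illustration, while the file names and device paths are the ones listed in this tutorial.

```shell
# Sketch: push the GPU backend libraries and QNN tools in a loop.
# ADB is a hypothetical override (defaults to adb) that allows a dry run.
push_qnn_gpu_files() {
    sdk_root=$1
    target=$2
    for lib in libQnnGpu.so libQnnGpuNetRunExtensions.so; do
        "${ADB:-adb}" push "$sdk_root/lib/$target/$lib" /data/local/tmp/lib || return 1
    done
    for tool in qnn-net-run qnn-throughput-net-run qnn-profile-viewer; do
        "${ADB:-adb}" push "$sdk_root/bin/$target/$tool" /data/local/tmp/bin || return 1
    done
}
```

Running `ADB=echo push_qnn_gpu_files "${QNN_SDK_ROOT}" aarch64-oe-linux-gcc9.3` prints the commands without touching the device; dropping the `ADB=echo` prefix performs the actual pushes.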

In the target terminal, enable permissions for QNN binaries:

    $ chmod 777 /data/local/tmp/bin/qnn-net-run
    $ chmod 777 /data/local/tmp/bin/qnn-throughput-net-run
    $ chmod 777 /data/local/tmp/bin/qnn-profile-viewer

Now, from the host, push the input data, input list, and model library to the device:

    $ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/bin
    $ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/bin
    $ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model_libs/aarch64-oe-linux-gcc9.3/libInception_v3.so /data/local/tmp/bin

Finally, reset the target with the following command in the target terminal:

    $ reset

qnn-net-run and qnn-profile-viewer can now be executed with the GPU backend on LV GVM or LV PVM. *adb shell* into the target terminal and set up the execution environment:

    $ cd /data/local/tmp/bin
    $ export PATH=/data/local/tmp/bin:$PATH
    $ export VENDOR_LIB=/data/local/tmp/lib
    $ export LD_LIBRARY_PATH=$VENDOR_LIB:$LD_LIBRARY_PATH

Then we can execute qnn-net-run with the following:

    $ ./qnn-net-run \
        --backend libQnnGpu.so \
        --input_list target_raw_list.txt \
        --model libInception_v3.so

We can also execute qnn-net-run with profiling:

    $ ./qnn-net-run \
        --backend libQnnGpu.so \
        --input_list target_raw_list.txt \
        --model libInception_v3.so \
        --profiling_level basic

    # OR

    $ ./qnn-net-run \
        --backend libQnnGpu.so \
        --input_list target_raw_list.txt \
        --model libInception_v3.so \
        --profiling_level detailed

Profiling data can be viewed using qnn-profile-viewer by running the following command:

    $ ./qnn-profile-viewer \
        --input_log output/qnn-profiling-data.log

By default, qnn-net-run executes the GPU backend in USER\_PROVIDED mode (i.e., no precision override). To explicitly set the precision mode to FP32, FP16, or HYBRID, pass a JSON configuration with the `--config_file` option. Example JSON files:

> 1. Example `gpu_config.json` file:
>
>        {
>            "backend_extensions": {
>                "shared_library_path": "libQnnGpuNetRunExtensions.so",
>                "config_file_path": "./gpu_settings.json"
>            }
>        }
>
> 2. Example `gpu_settings.json` file to set FP16 mode:
>
>        {
>            "graph_names": ["<Model graph name>"],
>            "precision_mode": "fp16"
>        }
>
> 3. Example `gpu_settings.json` file to set FP32 mode:
>
>        {
>            "graph_names": ["<Model graph name>"],
>            "precision_mode": "fp32"
>        }
>
> 4. Example `gpu_settings.json` file to set Hybrid mode:
>
>        {
>            "graph_names": ["<Model graph name>"],
>            "precision_mode": "hybrid"
>        }
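
The three `gpu_settings.json` variants differ only in the `precision_mode` value, so the file can be generated rather than edited by hand. The following is a minimal sketch; `write_gpu_settings` is a hypothetical helper, and it accepts only the three mode strings shown in the examples above:

```shell
# Hypothetical helper: write gpu_settings.json for one graph and precision mode.
# Only the three modes shown in the examples above are accepted.
write_gpu_settings() {
    graph_name=$1
    mode=$2
    out_file=${3:-gpu_settings.json}
    case "$mode" in
        fp16|fp32|hybrid) ;;
        *) echo "error: unsupported precision mode: $mode" >&2; return 1 ;;
    esac
    cat > "$out_file" <<EOF
{
    "graph_names": ["$graph_name"],
    "precision_mode": "$mode"
}
EOF
}
```

For example, `write_gpu_settings "<Model graph name>" fp16` writes `./gpu_settings.json`, the path that the `config_file_path` entry in `gpu_config.json` points to. Replace the placeholder with your model's graph name.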

Then we can execute qnn-net-run taking a JSON configuration file with the following:

    $ ./qnn-net-run \
        --backend libQnnGpu.so \
        --input_list target_raw_list.txt \
        --model libInception_v3.so \
        --config_file gpu_config.json

We can execute qnn-throughput-net-run with the following:

    $ ./qnn-throughput-net-run \
        --config throughput_config.json \
        --output throughput_result.json

Below is a sample `throughput_config.json` file:

>     {
>         "backends": [
>             {
>                 "backendName": "gpu_backend",
>                 "backendPath": "libQnnGpu.so",
>                 "profilingLevel": "BASIC",
>                 "backendExtensions": "libQnnGpuNetRunExtensions.so",
>                 "perfProfile": "high_performance"
>             }
>         ],
>         "models": [
>             {
>                 "modelName": "Inception_v3",
>                 "modelPath": "libInception_v3.so",
>                 "inputPath": "target_raw_list.txt",
>                 "inputDataType": "FLOAT"
>             }
>         ],
>         "contexts": [
>             {
>                 "contextName": "gpu_context_1"
>             }
>         ],
>         "testCase": {
>             "iteration": 1,
>             "logLevel": "error",
>             "threads": [
>                 {
>                     "threadName": "gpu_thread_1",
>                     "backend": "gpu_backend",
>                     "context": "gpu_context_1",
>                     "model": "Inception_v3",
>                     "interval": 0,
>                     "loopUnit": "count",
>                     "loop": 10
>                 }
>             ]
>         }
>     }
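
A stray comma or missing brace in a hand-written configuration is easy to miss, so a syntax check on the host before the run may save a round trip to the device. The following is a sketch that assumes `python3` is available on the host; `validate_json` is a hypothetical helper name. It checks JSON syntax only and does not verify the field names that qnn-throughput-net-run expects.

```shell
# Sketch: fail early if a JSON configuration file is not well-formed.
# Assumes python3 is on the host; json.tool ships with its standard library.
validate_json() {
    python3 -m json.tool "$1" > /dev/null
}
```

For example, `validate_json throughput_config.json || exit 1` stops a script when the file does not parse. The same check applies to `gpu_config.json` and `gpu_settings.json`.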

Outputs from the run are written to the default ./output directory. Exit the target shell, pull the outputs to the host, and view the results:

    $ exit
    $ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
    $ adb pull /data/local/tmp/bin/output output_lv
    $ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
        -i data/cropped/raw_list.txt \
        -o output_lv/ \
        -l data/imagenet_slim_labels.txt

Last Published: Jun 04, 2026
