# UDO Tutorial With Weights

Overview

This tutorial describes the steps needed to create a UDO
package with weights and execute the VGG model using the package.
The Convolution operation has been chosen in this tutorial to
demonstrate the implementation of a UDO with weights.

The Qualcomm® Neural Processing SDK provides the resources for this example under

- $SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D

Information on UDO in general is available at [UDO Overview](https://docs.qualcomm.com/doc/80-63442-10/topic/udo_overview.html).
Information on running the VGG network without UDO is
available at [VGG Tutorial](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx.html).
Information on creating a UDO package and executing the model
using the package is available at [UDO
Tutorial](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_inceptionv3_udo.html).

Prerequisites

This tutorial assumes that the general [Qualcomm® Neural Processing SDK
setup](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_setup.html) has been followed to set up the
SDK environment, the ONNX environment, and the desired platform
dependencies. An extracted Qualcomm® AI Direct SDK is also required for
generating the skeleton code and building the libraries; no Qualcomm® AI Direct SDK
setup is needed. For details, refer to the Qualcomm® AI Direct SDK
documentation at `$QNN_SDK_ROOT/docs/QNN/index.html`, where
`QNN_SDK_ROOT` is the location of the Qualcomm® AI Direct SDK installation.
Set `QNN_SDK_ROOT` to the unzipped Qualcomm® AI Direct SDK location after
running the envsetup.sh script described in [SNPE Setup](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_setup.html#environment-setup). The
steps listed in this tutorial use the ONNX model
vgg16.onnx. For details on
acquiring the VGG model, visit [Tutorials
Setup](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_setup.html#getting-vgg).
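
Before starting, it can help to confirm that both SDK variables are set and point to real directories. The following is a minimal sketch of such a check; `check_dir_var` and `check_sdk_env` are hypothetical helper names, not part of either SDK.

```shell
# Hypothetical helpers (not shipped with either SDK).
check_dir_var() {
    # $1 = variable name (used in messages only), $2 = its current value
    if [ -z "$2" ]; then
        echo "$1 is not set" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "$1 does not point to a directory: $2" >&2
        return 1
    fi
    return 0
}

check_sdk_env() {
    # Check both variables so that every problem is reported in one pass.
    sdk_env_status=0
    check_dir_var SNPE_ROOT "${SNPE_ROOT:-}" || sdk_env_status=1
    check_dir_var QNN_SDK_ROOT "${QNN_SDK_ROOT:-}" || sdk_env_status=1
    return "$sdk_env_status"
}
```

Calling `check_sdk_env` after sourcing envsetup.sh and exporting `QNN_SDK_ROOT` returns a non-zero status and prints one message per missing or invalid variable.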

Introduction

Here are the steps to develop and run a UDO:

1. [Package Generation](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx_udo_weights.html#step-1-package-generation)
2. [Framework Model Conversion to a DLC](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx_udo_weights.html#step-2-framework-model-conversion-to-a-dlc)
3. [Package Implementation](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx_udo_weights.html#step-3-package-implementations)
4. [Package Compilation](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx_udo_weights.html#step-4-package-compilation)
5. [Model Execution](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx_udo_weights.html#model-execution)

Steps 1-4 are run offline on the x86 host and are necessary for
execution in step 5. Step 5 provides information on execution
using the Qualcomm® Neural Processing SDK command-line executable **snpe-net-run**.

Step 1: Package Generation

Generating the Conv2DPackage requires the
**snpe-udo-package-generator** tool and one of the provided UDO
plugins, Conv2D.json, Conv2DQuant.json, or Conv2D\_Htp.json, depending on your
runtime requirement. Conv2D.json and Conv2DQuant.json give
you skeleton code for the CPU (float) and DSP (uint8) implementations
respectively; Conv2D\_Htp.json is used for DSP V68 and later. The plugins are located under
$SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config. More
information about creating a UDO plugin can be found
[here](https://docs.qualcomm.com/doc/80-63442-10/topic/udo_operator_definition.html#the-udo-configuration-specification).

Generate the Conv2DPackage UDO package using the following:

    export SNPE_UDO_ROOT=$SNPE_ROOT/share/SNPE/SnpeUdo
    export QNN_SDK_ROOT=<path to Qualcomm® AI Direct SDK>
    mkdir $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu
    snpe-udo-package-generator -p $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config/Conv2D.json -o $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu

or, for DSP versions earlier than V68:

    export SNPE_UDO_ROOT=$SNPE_ROOT/share/SNPE/SnpeUdo
    export QNN_SDK_ROOT=<path to Qualcomm® AI Direct SDK>
    mkdir $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp
    snpe-udo-package-generator -p $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config/Conv2DQuant.json -o $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp

or, for DSP V68 and later:

    export SNPE_UDO_ROOT=$SNPE_ROOT/share/SNPE/SnpeUdo
    export QNN_SDK_ROOT=<path to Qualcomm® AI Direct SDK>
    mkdir $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp
    snpe-udo-package-generator -p $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config/Conv2D_Htp.json -o $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp

Each of these commands creates the Convolution-based package, at
$SNPE\_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage for CPU or at
$SNPE\_ROOT/examples/Models/VGG/ConvUdoDsp/Conv2DPackage for DSP.

For more information on the snpe-udo-package-generator tool
visit [here](https://docs.qualcomm.com/doc/80-63442-10/topic/creating_udo_package.html).
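
The three variants above differ only in the plugin file and the output directory. As a rough sketch, that choice can be written as a lookup. The runtime labels `cpu`, `dsp`, and `htp` and the helper names are assumptions made for this illustration; the file names, directories, and the `-p` and `-o` options are the ones used in this step. The last helper only prints the command and does not run it.

```shell
# Hypothetical helpers; the labels cpu, dsp and htp are this sketch's own.
udo_config_for() {
    case "$1" in
        cpu) echo "Conv2D.json" ;;        # CPU, float
        dsp) echo "Conv2DQuant.json" ;;   # DSP earlier than V68, uint8
        htp) echo "Conv2D_Htp.json" ;;    # DSP V68 and later
        *)   echo "unknown runtime: $1" >&2; return 1 ;;
    esac
}

udo_outdir_for() {
    case "$1" in
        cpu)     echo "ConvUdoCpu" ;;
        dsp|htp) echo "ConvUdoDsp" ;;
        *)       echo "unknown runtime: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the generator command for a runtime label.
print_generator_command() {
    config=$(udo_config_for "$1") || return 1
    outdir=$(udo_outdir_for "$1") || return 1
    echo "snpe-udo-package-generator -p \$SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config/$config -o \$SNPE_ROOT/examples/Models/VGG/$outdir"
}
```

For example, `print_generator_command htp` prints the V68-and-later command with `$SNPE_ROOT` left unexpanded, so it can be reviewed before it is pasted into a shell.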

Step 2: Framework Model Conversion to a DLC

Converting the ONNX VGG model to DLC requires
the [snpe-onnx-to-dlc](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#snpe-onnx-to-dlc) tool.
The snpe-onnx-to-dlc
tool consumes the same Conv2D.json used in package generation
via the --udo command line option. In this step,
<VGG\_PATH> refers to the directory containing the vgg16.onnx
file. For example, after running the setup\_vgg.py
script, <VGG\_PATH> is
$SNPE\_ROOT/examples/Models/VGG/onnx.

Convert VGG with the following:

    snpe-onnx-to-dlc --input_network <VGG_PATH>/vgg16.onnx --output_path $SNPE_ROOT/examples/Models/VGG/dlc/vgg16_udo.dlc --udo $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/config/Conv2D.json

This generates a DLC named vgg16\_udo.dlc, containing the
Convolution as a UDO, at $SNPE\_ROOT/examples/Models/VGG/dlc.
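
A missing model or plugin file is an easy mistake to make at this step. The sketch below checks both inputs and then prints the conversion command without running it. `convert_preflight` is a hypothetical helper name; the `--input_network`, `--output_path`, and `--udo` options are the ones shown above.

```shell
# Hypothetical helper: verify the converter inputs, then print the command.
convert_preflight() {
    # $1 = ONNX model file, $2 = UDO plugin JSON, $3 = output DLC path
    if [ ! -f "$1" ]; then
        echo "model not found: $1" >&2
        return 1
    fi
    if [ ! -f "$2" ]; then
        echo "UDO plugin not found: $2" >&2
        return 1
    fi
    echo "snpe-onnx-to-dlc --input_network $1 --output_path $3 --udo $2"
}
```

The printed line can be inspected and then executed once the paths look right.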

Step 3: Package Implementations

The generated package creates the skeleton of the operation
implementation, which must be filled by the user to create a
functional UDO. The rest of the code scaffolding for
compatibility with Qualcomm® Neural Processing SDK is provided by the
**snpe-udo-package-generator**. The UDO implementations for this tutorial are provided under
$SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src.

**CPU Implementations (Android and x86)**

The file in the package that needs to be implemented for CPU is

- ConvUdoCpu/Conv2DPackage/jni/src/CPU/src/ops/Conv.cpp

The provided example implementation is present at the location

- $SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/CPU/Conv.cpp

Copy the provided implementation to the package:

    cp -f $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/CPU/Conv.cpp $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage/jni/src/CPU/src/ops/

**DSP Implementations (Android) for V65 and V66**

Note that only C files are supported for UDO on DSP V65
and V66 runtimes. Refer to [Implementing a UDO for DSP V65 and
V66](https://docs.qualcomm.com/doc/80-63442-10/topic/compiling_udo_package.html#implementing-a-udo-for-dsp-v65-and-v66)
for more information on implementing a UDO for these
runtimes. The example here executes a float implementation on the DSP
runtime. Refer to the [UDO DSP for Quantized
DLC](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_inceptionv3_udo_dsp.html) tutorial for
executing a quantized implementation on the DSP runtime.

The files in the package that need to be implemented for DSP V65
and V66 are

- ConvUdoDsp/Conv2DPackage/jni/src/DSP/ConvolutionImplLibDsp.c
- ConvUdoDsp/Conv2DPackage/include/ConvolutionImplLibDsp.h

The provided example implementations are present at the
locations

- $SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8Impl/ConvolutionImplLibDsp.c
- $SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8Impl/ConvolutionImplLibDsp.h

Copy the provided implementations to the package:

    cp -f $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8Impl/ConvolutionImplLibDsp.c $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp/Conv2DPackage/jni/src/DSP/
    cp -f $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/DSP/Conv2DInt8Impl/ConvolutionImplLibDsp.h $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp/Conv2DPackage/include/

Optionally, the user can provide their own implementations in
the package.

**DSP Implementations for V68 and later**

Note that only C++ files are supported for UDO on DSP
V68 and later runtimes. Refer to [Implementing a UDO for DSP V68
or
later](https://docs.qualcomm.com/doc/80-63442-10/topic/compiling_udo_package.html#implementing-a-udo-for-dsp-v68-or-later)
for more information on implementing a UDO for these
runtimes. The directory paths and locations in this example are
written for DSP V68. For runtimes later than DSP V68,
replace **DSP\_V68** with the corresponding DSP architecture.

The file in the package that needs to be implemented for DSP
V68 and later is

- ConvUdoDsp/Conv2DPackage/jni/src/DSP\_V68/ConvImplLibDsp.cpp

The provided example implementation is present at the location

- $SNPE\_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/HTP/ConvImplLibDsp.cpp

Copy the provided implementation to the package:

    cp -f $SNPE_ROOT/examples/SNPE/NativeCpp/UdoExample/Conv2D/src/HTP/ConvImplLibDsp.cpp $SNPE_ROOT/examples/Models/VGG/ConvUdoDsp/Conv2DPackage/jni/src/DSP_V68/

Optionally, the user can provide their own implementations in
the package.
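
The copy commands in this step assume that Step 1 has already created the package directories. The following sketch wraps `cp -f` with two checks and confirms the copy afterwards; `install_impl` is a hypothetical helper name, not part of the SDK.

```shell
# Hypothetical helper: copy one implementation file into a generated package.
install_impl() {
    # $1 = example source file, $2 = destination directory inside the package
    if [ ! -f "$1" ]; then
        echo "source not found: $1" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "package directory not found (was Step 1 run?): $2" >&2
        return 1
    fi
    # Copy, then confirm that the destination is byte-identical to the source.
    cp -f "$1" "$2/" && cmp -s "$1" "$2/$(basename "$1")"
}
```

It takes the same two arguments as the `cp -f` commands above, for example the provided Conv.cpp and the package's `jni/src/CPU/src/ops` directory.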

Step 4: Package Compilation

**x86 Host Compilation**

Compiling on x86 host uses the make build system. Compile the
CPU implementations with the following:

    cd $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage
    make cpu_x86

The expected artifacts after compiling for CPU on x86 host are

- ConvUdoCpu/Conv2DPackage/libs/x86-64\_linux\_clang/libUdoConv2DPackageImplCpu.so
- ConvUdoCpu/Conv2DPackage/libs/x86-64\_linux\_clang/libUdoConv2DPackageReg.so

**Android CPU Runtime Compilation**

Compilation for the CPU runtime on Android uses Android NDK.
The ANDROID\_NDK\_ROOT environment variable must be set to the
directory containing ndk-build in order to compile the package.

    export ANDROID_NDK_ROOT=<path_to_android_ndk>

It is suggested to add ANDROID\_NDK\_ROOT to the PATH environment
variable to access ndk-build.

    export PATH=$ANDROID_NDK_ROOT:$PATH

Once the ANDROID\_NDK\_ROOT is part of PATH, compile the package
for Android CPU target:

    cd $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage
    make cpu_android

The expected artifacts after compiling for Android CPU are

- ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageImplCpu.so
- ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageReg.so
- ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libc++\_shared.so
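
The expected CPU artifacts listed for the x86 host and for Android can be checked with a short script after each build. This is a sketch only: the target labels `x86` and `android` and the helper names are assumptions of the illustration, while the library paths are the ones listed in this step.

```shell
# Hypothetical helpers; the labels x86 and android are this sketch's own.
expected_udo_libs() {
    case "$1" in
        x86)
            echo "libs/x86-64_linux_clang/libUdoConv2DPackageImplCpu.so"
            echo "libs/x86-64_linux_clang/libUdoConv2DPackageReg.so"
            ;;
        android)
            echo "libs/arm64-v8a/libUdoConv2DPackageImplCpu.so"
            echo "libs/arm64-v8a/libUdoConv2DPackageReg.so"
            echo "libs/arm64-v8a/libc++_shared.so"
            ;;
        *)
            echo "unknown target: $1" >&2
            return 1
            ;;
    esac
}

check_udo_libs() {
    # $1 = package directory (for example ConvUdoCpu/Conv2DPackage), $2 = target label
    udo_libs=$(expected_udo_libs "$2") || return 1
    udo_libs_missing=0
    for rel in $udo_libs; do          # the paths contain no spaces
        if [ ! -f "$1/$rel" ]; then
            echo "missing: $1/$rel" >&2
            udo_libs_missing=1
        fi
    done
    return "$udo_libs_missing"
}
```

Running `check_udo_libs $SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage android` after `make cpu_android` reports any library that the build did not produce.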

**Hexagon DSP Runtime Compilation**

Compilation for the DSP runtime makes use of the make system.
In order to build the implementation libraries for DSP V65 and
V66 runtimes, Hexagon-SDK needs to be installed and set up. For
details, follow the setup instructions on
`$HEXAGON_SDK_ROOT/docs/readme.html` page, where
`HEXAGON_SDK_ROOT` is the location of your Hexagon-SDK
installation. Information for compiling a UDO for DSP is
available at [Compiling UDO for
DSP](https://docs.qualcomm.com/doc/80-63442-10/topic/compiling_udo_package.html#compiling-a-udo-for-dsp-v65-and-v66-on-device).

Model Execution

**Execution using snpe-net-run**

Executing VGG with UDO is largely the same as use of
[snpe-net-run](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_onnx.html#overview)
without UDO.

The Qualcomm® Neural Processing SDK provides Linux and Android binaries of
**snpe-net-run** under

- $SNPE\_ROOT/bin/x86\_64-linux-clang
- $SNPE\_ROOT/bin/aarch64-android
- $SNPE\_ROOT/bin/aarch64-oe-linux-gcc8.2
- $SNPE\_ROOT/bin/aarch64-oe-linux-gcc9.3

For UDO, snpe-net-run consumes the registration library through
the --udo\_package\_path option. LD\_LIBRARY\_PATH must also be
updated to include the runtime-specific artifacts generated
from package compilation.
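
A common cause of load failures is a UDO library directory that never made it onto LD\_LIBRARY\_PATH. The sketch below tests whether a directory appears in a colon-separated list; `path_list_contains` is a hypothetical helper name, and the comparison is purely textual, so both sides should be written the same way (for example, both as absolute paths).

```shell
# Hypothetical helper: is directory $2 an entry of the colon-separated list $1?
path_list_contains() {
    wanted=${2%/}                     # ignore a single trailing slash
    if [ -z "$wanted" ]; then
        return 1                      # empty (or "/") is not handled by this sketch
    fi
    case ":$1:" in
        *":$wanted:"* | *":$wanted/:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `path_list_contains "$LD_LIBRARY_PATH" "$SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage/libs/x86-64_linux_clang"` succeeds only after the corresponding export has been run in the current shell.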

**x86 Host Execution**

To execute the network on x86 host, run:

    cd $SNPE_ROOT/examples/Models/VGG
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SNPE_ROOT/examples/Models/VGG/ConvUdoCpu/Conv2DPackage/libs/x86-64_linux_clang/
    snpe-net-run --container dlc/vgg16_udo.dlc --input_list data/cropped/raw_list.txt --udo_package_path ConvUdoCpu/Conv2DPackage/libs/x86-64_linux_clang/libUdoConv2DPackageReg.so

**Android Target Execution**

The tutorial for execution on Android targets will use the
arm64-v8a architecture. This portion of the tutorial is generic
to all runtimes (CPU, DSP). Set SNPE\_TARGET\_DSPARCH
to the DSP architecture of the target Android device.

    # architecture: arm64-v8a - compiler: clang - STL: libc++
    export SNPE_TARGET_ARCH=aarch64-android
    export SNPE_TARGET_DSPARCH=hexagon-v68

Then, push Qualcomm® Neural Processing SDK binaries and libraries to the target device:

    adb shell "mkdir -p /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin"
    adb shell "mkdir -p /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib"

    adb push $SNPE_ROOT/lib/$SNPE_TARGET_ARCH/*.so \
          /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
    adb push $SNPE_ROOT/bin/$SNPE_TARGET_ARCH/snpe-net-run \
          /data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin

Next, update environment variables on the target device to
include the Qualcomm® Neural Processing SDK libraries and binaries:

    adb shell
    export SNPE_TARGET_ARCH=aarch64-android
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
    export PATH=$PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin

Lastly, push the VGG UDO model and input data to the device:

    cd $SNPE_ROOT/examples/Models/VGG
    mkdir data/rawfiles && cp data/cropped/*.raw data/rawfiles/
    adb shell "mkdir -p /data/local/tmp/vgg16_udo"
    adb push data/rawfiles /data/local/tmp/vgg16_udo/cropped
    adb push data/raw_list.txt /data/local/tmp/vgg16_udo
    adb push dlc/vgg16_udo.dlc /data/local/tmp/vgg16_udo
    rm -rf data/rawfiles

**Android CPU Execution**

Once the model and data have been placed on the device, place
the UDO libraries on the device:

    cd $SNPE_ROOT/examples/Models/VGG
    adb shell "mkdir -p /data/local/tmp/vgg16_udo/cpu"
    adb push ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageImplCpu.so /data/local/tmp/vgg16_udo/cpu
    adb push ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageReg.so /data/local/tmp/vgg16_udo/cpu
    adb push ConvUdoCpu/Conv2DPackage/libs/arm64-v8a/libc++_shared.so /data/local/tmp/vgg16_udo/cpu

Now set required environment variables and run snpe-net-run on device:

    adb shell
    cd /data/local/tmp/vgg16_udo/
    export SNPE_TARGET_ARCH=aarch64-android
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
    export PATH=$PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin
    export LD_LIBRARY_PATH=/data/local/tmp/vgg16_udo/cpu/:$LD_LIBRARY_PATH
    snpe-net-run --container vgg16_udo.dlc --input_list raw_list.txt --udo_package_path cpu/libUdoConv2DPackageReg.so

**Hexagon DSP Execution**

The procedure for execution on device for the DSP is largely the
same as for CPU and GPU. However, the DSP runtime requires
quantized network parameters. It accepts unquantized
DLCs, but quantizing the DLC beforehand is generally recommended for improved
performance. This tutorial uses a quantized DLC as an
illustrative example. Quantizing the DLC requires the
**snpe-dlc-quantize** tool.

To quantize the DLC for use on DSP:

    cd $SNPE_ROOT/examples/Models/VGG/
    snpe-dlc-quantize --input_dlc dlc/vgg16_udo.dlc --input_list data/cropped/raw_list.txt --udo_package_path ConvUdoCpu/Conv2DPackage/libs/x86-64_linux_clang/libUdoConv2DPackageReg.so --output_dlc dlc/vgg16_udo_quantized.dlc

For more information on **snpe-dlc-quantize** visit
[quantization](https://docs.qualcomm.com/doc/80-63442-10/topic/quantized_models.html#overview). For
information on UDO-specific quantization visit [Quantizing a
DLC with UDO](https://docs.qualcomm.com/doc/80-63442-10/topic/preparing_model_with_udo.html#quantizing-a-dlc-with-udo).
For information on DSP runtime visit [DSP Runtime](https://docs.qualcomm.com/doc/80-63442-10/topic/dsp_runtime.html).

Now push the quantized model to device:

    adb push dlc/vgg16_udo_quantized.dlc /data/local/tmp/vgg16_udo

**Note:** Refer to the [UDO DSP tutorial for Quantized
DLC](https://docs.qualcomm.com/doc/80-63442-10/topic/tutorial_inceptionv3_udo_dsp.html) for executing on the
DSP runtime using a quantized DLC.

Before executing on the DSP, push the Qualcomm® Neural Processing SDK libraries for DSP to
device:

    adb shell "mkdir -p /data/local/tmp/snpeexample/dsp/lib"
    adb push $SNPE_ROOT/lib/$SNPE_TARGET_DSPARCH/unsigned/*.so /data/local/tmp/snpeexample/dsp/lib

Now push the DSP-specific UDO libraries to the device. Depending on the DSP
architecture specified in the config, the **dsp\_v68** directory may instead
be **dsp\_v60** or **dsp** (with an older Qualcomm® Neural Processing SDK).

    cd $SNPE_ROOT/examples/Models/VGG
    adb shell "mkdir -p /data/local/tmp/vgg16_udo/dsp"
    adb push ConvUdoDsp/Conv2DPackage/libs/dsp_v68/*.so /data/local/tmp/vgg16_udo/dsp # For DSP V68 or later
    adb push ConvUdoDsp/Conv2DPackage/libs/dsp_v60/*.so /data/local/tmp/vgg16_udo/dsp # For DSP versions less than v68
    adb push ConvUdoDsp/Conv2DPackage/libs/arm64-v8a/libUdoConv2DPackageReg.so /data/local/tmp/vgg16_udo/dsp # Pushes reg lib
    adb push ConvUdoDsp/Conv2DPackage/libs/arm64-v8a/libc++_shared.so /data/local/tmp/vgg16_udo/dsp

Then set required environment variables and run snpe-net-run on
device. Note that **Conv2DInt8Impl** should be used for quantized DLCs:

    adb shell
    cd /data/local/tmp/vgg16_udo/
    export SNPE_TARGET_ARCH=aarch64-android
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/lib
    export PATH=$PATH:/data/local/tmp/snpeexample/$SNPE_TARGET_ARCH/bin
    export LD_LIBRARY_PATH=/data/local/tmp/vgg16_udo/dsp/:$LD_LIBRARY_PATH
    export ADSP_LIBRARY_PATH="/data/local/tmp/vgg16_udo/dsp/;/data/local/tmp/snpeexample/dsp/lib;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp"
    snpe-net-run --container vgg16_udo_quantized.dlc --input_list raw_list.txt --udo_package_path dsp/libUdoConv2DPackageReg.so --use_dsp
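
Unlike LD\_LIBRARY\_PATH, the ADSP\_LIBRARY\_PATH value above separates its entries with semicolons, which is easy to mistype in a long string. As a small sketch, the value can be assembled from individual directories; `join_adsp_path` is a hypothetical helper name, and the directories are the ones used in the command above.

```shell
# Hypothetical helper: join directories with ';' as ADSP_LIBRARY_PATH expects.
join_adsp_path() {
    adsp_joined=""
    for dir in "$@"; do
        if [ -z "$adsp_joined" ]; then
            adsp_joined=$dir
        else
            adsp_joined="$adsp_joined;$dir"
        fi
    done
    printf '%s\n' "$adsp_joined"
}

ADSP_LIBRARY_PATH=$(join_adsp_path \
    /data/local/tmp/vgg16_udo/dsp/ \
    /data/local/tmp/snpeexample/dsp/lib \
    /system/lib/rfsa/adsp \
    /system/vendor/lib/rfsa/adsp \
    /dsp)
export ADSP_LIBRARY_PATH
```

The helper uses only POSIX shell features, but whether the on-device shell accepts it has not been verified here; building the string on the host and pasting the result is an equally valid option.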

To verify the classification results, run the following on your host machine:

    cd $SNPE_ROOT/examples/Models/VGG
    adb pull /data/local/tmp/vgg16_udo/output .
    python3 $SNPE_ROOT/examples/Models/VGG/scripts/show_vgg_classifications.py \
        -i data/cropped/raw_list.txt \
        -o output/ \
        -l data/synset.txt

The output should look like the following, showing
classification results for all the images.

    Classification results
    probability=0.351832 ; class=n02123045 tabby, tabby cat
    probability=0.315168 ; class=n02123159 tiger cat
    probability=0.313084 ; class=n02124075 Egyptian cat
    probability=0.012995 ; class=n02127052 lynx, catamount
    probability=0.003528 ; class=n02129604 tiger, Panthera tigris
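
When comparing runs (for example CPU against DSP), it can be convenient to reduce a result block like the one above to its top class. The sketch below assumes the exact `probability=... ; class=...` line format shown above and a single result block on standard input; `top_class` is a hypothetical helper name.

```shell
# Hypothetical helper: print the class ID with the highest probability.
# Assumes lines of the form "probability=<p> ; class=<id> <label...>".
top_class() {
    awk '
        /probability=/ {
            p = $0; sub(/^.*probability=/, "", p); sub(/[ ;].*$/, "", p)
            c = $0; sub(/^.*class=/, "", c);       sub(/ .*$/, "", c)
            if (best == "" || p + 0 > best + 0) { best = p; id = c }
        }
        END { if (id != "") print id }
    '
}
```

Piping the sample output above into `top_class` prints `n02123045`.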

Last Published: Jun 04, 2026
