# MobileNetV2 Inference on HTP

This tutorial shows how to convert and execute a MobileNetV2 model on the QAIRT HTP Backend. It includes a
step-by-step breakdown of the process. You can copy each snippet into a single script to run the tutorial end to end.

Note

If you would like to skip the breakdown, a simplified version of the tutorial is available in the QAIRT SDK at
the following path:

- `examples/QAIRT/python/basic_tutorial.py`

Ensure the `target_device` environment variable is set to None to perform execution on your Windows on Snapdragon
device.

The parameters for this tutorial are as follows:

- Framework: PyTorch
- Model: [MobileNetV2](https://pytorch.org/hub/pytorch_vision_mobilenet_v2)
- Configurations:
    - Host OS: Windows on Snapdragon (WoS)
    - Target Devices: Snapdragon X Elite Device
    - Processor: Qualcomm NPU
    - Backend: HTP

Tip

This tutorial creates some temporary files as part of the workflow. To customize the temporary file
location, set the environment variable `QAIRT_TMP_DIR` to a location of your choosing.
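As an illustration, the variable can also be set from Python before the workflow starts. This is a minimal sketch, not part of the SDK: the helper name `use_tmp_dir` and the `qairt_tmp` directory name are hypothetical, and it assumes that QAIRT reads `QAIRT_TMP_DIR` when it creates temporary files, so the variable is set before `qairt` is imported.

```python
import os
import tempfile
from pathlib import Path


def use_tmp_dir(path: Path) -> Path:
    """Create `path` if needed and point QAIRT_TMP_DIR at it (hypothetical helper)."""
    tmp_dir = Path(path).resolve()
    tmp_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: QAIRT consults this variable when it creates temporary files,
    # so it should be set before `import qairt`.
    os.environ["QAIRT_TMP_DIR"] = str(tmp_dir)
    return tmp_dir


# The directory name is only an example; any writable location works.
tmp_dir = use_tmp_dir(Path(tempfile.gettempdir()) / "qairt_tmp")
print(f"Temporary files will be written to {tmp_dir}")
```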

## Step 1: Setup

First, import the necessary libraries that will be used in this tutorial. Ensure you have the QAIRT SDK
installed and that the **qairt** package is available. If you see any import errors, follow the setup instructions
to install the QAIRT SDK here: [Setup](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/setup.html)

    import json
    import os
    import platform
    from pathlib import Path

    import numpy as np
    import requests
    import torch
    import torchvision.transforms as transforms
    from PIL import Image

    import qairt
    from qairt import CompileConfig, Device, DevicePlatformType, ExecutionResult, RemoteDeviceIdentifier
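Before running the imports, it can help to check which packages are actually installed. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and the package list is simply the set of top-level import names used in this tutorial.

```python
from importlib.util import find_spec

# Top-level import names used in this tutorial
REQUIRED_PACKAGES = ["numpy", "requests", "torch", "torchvision", "PIL", "qairt"]


def missing_packages(names: list[str]) -> list[str]:
    """Return the subset of `names` that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are importable.")
```

If `qairt` is reported as missing, the setup instructions linked above are the place to start.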

## Step 2: Get a MobileNetV2 model

Download the pretrained MobileNetV2 model from PyTorch Hub, then export it to ONNX so that QAIRT can convert it.

    pytorch_model = torch.hub.load("pytorch/vision:v0.10.0", "mobilenet_v2", pretrained=True)
    pytorch_model.eval()

    # Create a directory for artifacts
    artifacts_dir = Path("./mobilenetv2_artifacts").resolve()
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    onnx_model_path = str(artifacts_dir / "mobilenet_v2.onnx")

    # Export the PyTorch model as an ONNX model
    dummy_input = torch.rand((1, 3, 224, 224), dtype=torch.float32)

    torch.onnx.export(
        pytorch_model,
        (dummy_input,),
        onnx_model_path,
        input_names=["input"],
        output_names=["output"],
        opset_version=11,
    )

## Step 3: Convert the model

Once the model is exported, we can proceed to convert it using QAIRT.

    converted_model: qairt.Model = qairt.convert(onnx_model_path)

The convert API produces a [`qairt.Model`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-core-api.html#qairt.Model) object which can be executed and saved
to disk. You can pass in extra options to the convert API as keyword arguments.

## Step 4: Compile the model

The next step after conversion and/or quantization is to compile the model ahead of time.
While this step is optional, it is recommended to avoid preparation time costs on the target device.

Converted models can be compiled using:

    compiled_model: qairt.CompiledModel = qairt.compile(converted_model, backend="HTP")

If you know the target device ahead of time, you can customize the compilation process using a `qairt.api.config.CompileConfig` object.

For example, to compile for a Snapdragon 8cx Gen 4 (SC8380XP) device, you can use the following config:

    config = CompileConfig(backend="HTP", soc_details="chipset:SC8380XP")

If your target device is not SC8380XP, see the list of supported [Chipsets](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/overview.html#supported-snapdragon-devices)
to find the chipset for your device.
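When switching between devices, the `soc_details` string can be built from the chipset name. This is a small sketch: `soc_details_for` is a hypothetical helper, and it only covers the `chipset:<name>` form shown in this tutorial; any other keys that `soc_details` may accept are not covered here.

```python
def soc_details_for(chipset: str) -> str:
    """Build a soc_details string in the `chipset:<name>` form shown in this tutorial."""
    name = chipset.strip()
    if not name:
        raise ValueError("chipset must be a non-empty string")
    return f"chipset:{name}"


print(soc_details_for("SC8380XP"))  # prints chipset:SC8380XP

# Usage with the tutorial's config (requires the QAIRT SDK):
# config = CompileConfig(backend="HTP", soc_details=soc_details_for("SC8380XP"))
```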

Next, we can compile the model using the config object:

    compiled_model: qairt.CompiledModel = qairt.compile(converted_model, config=config)

    # Print the compiled model's information
    print(json.dumps(compiled_model.module.info.as_dict(), indent=4))

The compilation process produces a [`qairt.api.compiled_model.CompiledModel`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-core-api.html#qairt.CompiledModel) object. Printing its
information should give output similar to the following:

    {
        "name": "mobilenet_v2",
        "graphs": [
            {
                "name": "mobilenet_v2",
                "inputs": [
                    {
                        "name": "input",
                        "dimensions": [
                            1,
                            3,
                            224,
                            224
                        ],
                        "data_type": "QNN_DATATYPE_FLOAT_32"
                    }
                ],
                "outputs": [
                    {
                        "name": "output",
                        "dimensions": [
                            1,
                            1000
                        ],
                        "data_type": "QNN_DATATYPE_FLOAT_32"
                    }
                ]
            }
        ],
        "soc_name": "60",
        "backend": "HTP",
        "backend_info": {
            "arch": 73,
            "vtcm_size": 4,
            "optimization_level": 0
        }
    }
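The information dictionary is handy for checking input and output shapes before execution. The sketch below works on a trimmed copy of the sample output above; `summarize_io` is a hypothetical helper, and it assumes the dictionary layout matches that sample.

```python
# Trimmed copy of the sample output shown above
SAMPLE_INFO = {
    "name": "mobilenet_v2",
    "graphs": [
        {
            "name": "mobilenet_v2",
            "inputs": [
                {"name": "input", "dimensions": [1, 3, 224, 224], "data_type": "QNN_DATATYPE_FLOAT_32"}
            ],
            "outputs": [
                {"name": "output", "dimensions": [1, 1000], "data_type": "QNN_DATATYPE_FLOAT_32"}
            ],
        }
    ],
    "backend": "HTP",
}


def summarize_io(info: dict) -> list[str]:
    """Return one description line per input and output tensor of every graph."""
    lines = []
    for graph in info.get("graphs", []):
        for kind in ("inputs", "outputs"):
            for tensor in graph.get(kind, []):
                shape = "x".join(str(dim) for dim in tensor["dimensions"])
                # kind[:-1] turns "inputs"/"outputs" into "input"/"output"
                lines.append(f"{graph['name']} {kind[:-1]} '{tensor['name']}': {shape} ({tensor['data_type']})")
    return lines


for line in summarize_io(SAMPLE_INFO):
    print(line)
```

With a real compiled model, the same function could be called on `compiled_model.module.info.as_dict()`.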

## Step 5: Set up an X Elite device

No additional device setup is required to execute locally on an X Elite target.

The API detects the platform processor and triggers execution using WoS as both host and target.
To confirm that this tutorial is running on a compatible device, you can use the following check:

    if platform.system() == "Windows" and "ARMv8" in platform.processor():
        print("INFO: X Elite detected. Enabling tutorial for X Elite")
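The check above prints nothing on other hosts. A variant that also reports the mismatch might look like the sketch below; `is_windows_on_snapdragon` is a hypothetical helper that applies the same condition as the tutorial, with the platform strings passed in so that it can be exercised on any machine.

```python
import platform


def is_windows_on_snapdragon(system: str, processor: str) -> bool:
    """Apply the tutorial's condition: a Windows host whose processor string contains ARMv8."""
    return system == "Windows" and "ARMv8" in processor


system, processor = platform.system(), platform.processor()
if is_windows_on_snapdragon(system, processor):
    print("INFO: X Elite detected. Enabling tutorial for X Elite")
else:
    print(f"WARNING: host is {system} ({processor or 'unknown processor'}), not Windows on Snapdragon")
```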

## Step 6: Executing on device

The QAIRT SDK includes a set of sample images of shape (3 x H x W), where H and W are each at least 224.

We will use these images to execute the compiled model on the device.

    image_location = os.path.join(os.environ["QAIRT_SDK_ROOT"], "examples", "QAIRT", "python", "images")
    IMAGE_DATASET = {
        "african elephant": os.path.join(image_location, "african_elephant.jpg"),
        "samoyed": os.path.join(image_location, "samoyed.jpg"),
        "sea lion": os.path.join(image_location, "sea_lion.jpg"),
    }
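Because the paths depend on `QAIRT_SDK_ROOT`, it can be worth confirming that the files exist before running inference. The following is a sketch only: `missing_images` is a hypothetical helper, and falling back to an empty root when the variable is unset is a choice made here so that the snippet reports a problem instead of raising `KeyError`.

```python
import os


def missing_images(dataset: dict[str, str]) -> list[str]:
    """Return the labels whose path does not point to an existing file."""
    return [label for label, path in dataset.items() if not os.path.isfile(path)]


# Assumption: an unset QAIRT_SDK_ROOT is treated as an empty root for this check
sdk_root = os.environ.get("QAIRT_SDK_ROOT", "")
image_location = os.path.join(sdk_root, "examples", "QAIRT", "python", "images")
dataset = {
    "african elephant": os.path.join(image_location, "african_elephant.jpg"),
    "samoyed": os.path.join(image_location, "samoyed.jpg"),
    "sea lion": os.path.join(image_location, "sea_lion.jpg"),
}

missing = missing_images(dataset)
if missing:
    print(f"Images not found for: {', '.join(missing)}. Check QAIRT_SDK_ROOT.")
else:
    print("All sample images found.")
```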

Each image is resized and center-cropped to 224 x 224, scaled to the range [0, 1], and then normalized using mean [0.485, 0.456, 0.406] and standard deviation [0.229, 0.224, 0.225].

    # To make things simpler, we can define a simple function to preprocess each image.
    def preprocess_input(image: str) -> np.ndarray:
        # Convert to RGB so grayscale or RGBA images still yield three channels
        image_obj = Image.open(image).convert("RGB")

        preprocess = transforms.Compose(
            [
                transforms.Resize(224),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ]
        )
        # Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
        tensor = preprocess(image_obj).unsqueeze(0)
        return tensor.numpy()
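To make the normalization arithmetic concrete, the sketch below applies the same two steps (scale to [0, 1], then subtract the mean and divide by the standard deviation per channel) to a single 8-bit RGB pixel. It is a dependency-free illustration; `normalize_pixel` is a hypothetical name, not part of torchvision or QAIRT.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb: tuple[int, int, int]) -> tuple[float, ...]:
    """Scale 8-bit RGB values to [0, 1], then apply (x - mean) / std per channel."""
    return tuple((value / 255.0 - mean) / std for value, mean, std in zip(rgb, MEAN, STD))


r, g, b = normalize_pixel((255, 0, 128))
print(f"R={r:.4f} G={g:.4f} B={b:.4f}")
```

A fully saturated red channel maps to about 2.25 and an empty green channel to about -2.04, so the model sees values roughly centered on zero.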

We can now execute the model using the images from the dataset. Data may also be passed in as a dictionary
that maps input names to NumPy arrays.

    outputs = []

    for label, image_path in IMAGE_DATASET.items():
        image: np.ndarray = preprocess_input(image_path)

        # Run inference on the device
        result: ExecutionResult = compiled_model(image)

        _, output_tensors = compiled_model.output_tensors[0]

        # Keep the output array together with the expected label
        outputs.append((result[output_tensors[0].name], label))
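For the dictionary form mentioned above, the key would be the input name chosen during the ONNX export (`"input"`). The sketch below is an assumption-laden illustration: `build_inputs` is a hypothetical helper that checks the array shape against the export shape before wrapping it, and the commented usage assumes that `compiled_model` accepts such a dictionary as described in this step.

```python
from typing import Any

EXPECTED_SHAPE = (1, 3, 224, 224)  # shape of the dummy input used for the ONNX export


def build_inputs(array: Any, name: str = "input") -> dict[str, Any]:
    """Wrap `array` in a name-to-array dictionary after checking its shape."""
    shape = tuple(array.shape)
    if shape != EXPECTED_SHAPE:
        raise ValueError(f"expected shape {EXPECTED_SHAPE} for '{name}', got {shape}")
    return {name: array}


# Usage with the tutorial's objects (requires the QAIRT SDK and a device):
# inputs = build_inputs(preprocess_input(IMAGE_DATASET["samoyed"]))
# result = compiled_model(inputs)
```

Checking the shape on the host gives a clearer error message than a failure later during execution.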

## Step 7: Post-Processing

For post-processing, we will use the ImageNet labels obtained from [Qualcomm AI Hub](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/apidoc/imagenet_classes.txt).

Here is a small snippet of code that computes softmax probabilities from the output and prints the top-5 predictions using the class
labels.

    def postprocess_results(output_ndarray: np.ndarray) -> str:
        # Softmax over the class axis. Subtracting the per-row maximum avoids overflow
        # in np.exp, and keepdims=True keeps the division correct for any batch size.
        shifted = output_ndarray - np.max(output_ndarray, axis=1, keepdims=True)
        exp_scores = np.exp(shifted)
        on_device_probabilities = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

        # Read the ImageNet class labels
        sample_classes = "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/apidoc/imagenet_classes.txt"
        response = requests.get(sample_classes, stream=True, timeout=5)
        response.raw.decode_content = True
        categories = [s.strip().decode("utf-8") for s in response.raw]

        # Print the top five predictions
        print("Top-5 predictions:")
        top5_classes = np.argsort(on_device_probabilities[0], axis=0)[-5:]
        prediction = categories[top5_classes[-1]]
        for c in reversed(top5_classes):
            print(f"{c} {categories[c]:20s} {on_device_probabilities[0][c]:>6.1%}")
        print()

        return prediction
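To see what the softmax and top-k steps compute, here is a dependency-free sketch on a tiny logit vector. The function names are hypothetical. It subtracts the largest logit before exponentiating, a common way to avoid overflow; without that shift, `math.exp(1000.0)` would raise `OverflowError`.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax: shift by the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities: list[float], k: int) -> list[int]:
    """Return the indices of the k largest probabilities, highest first."""
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
    return order[:k]


probabilities = softmax([1000.0, 1001.0, 1002.0])
print([round(p, 4) for p in probabilities])  # prints [0.09, 0.2447, 0.6652]
print(top_k(probabilities, 2))               # prints [2, 1]
```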

The code below prints predictions for each image in the dataset.

    for arr, label in outputs:
        # Postprocess the results
        prediction = postprocess_results(arr)

        prediction = prediction.lower()
        label = label.lower()
        if prediction == label:
            print(f"Successful prediction: {prediction}\n")
        else:
            print(f"Failed prediction: {prediction}. Expected {label}\n")

The output should be similar to the following:

    Top-5 predictions:
    386 African elephant      84.8%
    385 Indian elephant        6.6%
    101 tusker                 4.7%
    346 water buffalo          2.6%
    343 warthog                0.5%
    
    Successful prediction: african elephant
    
    Top-5 predictions:
    258 Samoyed               81.3%
    259 Pomeranian             7.6%
    261 keeshond               2.0%
    279 Arctic fox             1.8%
    257 Great Pyrenees         1.4%
    
    Successful prediction: samoyed
    
    Top-5 predictions:
    150 sea lion              99.9%
    147 grey whale             0.0%
    360 otter                  0.0%
    460 breakwater             0.0%
    146 albatross              0.0%
    
    Successful prediction: sea lion

We can see that the model is able to correctly predict the labels for the images. This shows that the
model is executing correctly on the target device.

## Step 8: Save the compiled model

We can also save the compiled model for future use or deployment.

    compiled_model.save(artifacts_dir / "mobilenet_v2.bin")

You should see a binary file named `mobilenet_v2.bin` in the `mobilenetv2_artifacts` directory. The same file
can be reloaded and executed on the target device without compiling the model again.

    compiled_model = qairt.load(artifacts_dir / "mobilenet_v2.bin")
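Saving and loading can be combined into a simple cache so that compilation only happens when no binary exists yet. This is a sketch of one possible pattern: `load_or_compile` is a hypothetical helper, and the load and compile steps are passed in as callables so that the logic does not depend on the SDK being installed.

```python
from pathlib import Path
from typing import Any, Callable


def load_or_compile(path: Path, load_fn: Callable[[Path], Any], compile_fn: Callable[[], Any]) -> Any:
    """Reuse the binary at `path` if it exists; otherwise compile, save, and return the model."""
    if path.is_file():
        return load_fn(path)
    model = compile_fn()
    model.save(path)  # mirrors compiled_model.save(...) from this step
    return model


# Usage with the tutorial's objects (requires the QAIRT SDK):
# compiled_model = load_or_compile(
#     artifacts_dir / "mobilenet_v2.bin",
#     qairt.load,
#     lambda: qairt.compile(converted_model, config=config),
# )
```

Note that a cached binary is tied to the configuration it was compiled with, so delete it after changing the model or the compile config.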

## Next Steps

- [LLM Inference on HTP](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_builder.html) – Build and deploy LLMs on Snapdragon devices.
- [Profiling Models with QAIRT Visualizer](https://docs.qualcomm.com/doc/80-87189-2/topic/profiling_models_with_visualizer.html#profiling-models-android-visualizer) – Profile model performance with the Visualizer.
- [API Documentation](https://docs.qualcomm.com/doc/80-87189-2/topic/api.html) – Full API reference.

Last Published: Jul 08, 2026
