# Working with Machine Learning Models in the Qualcomm Neural Processing SDK for AI

Source: [https://docs.qualcomm.com/doc/80-63442-4/topic/working-with-machine-learning-models.html](https://docs.qualcomm.com/doc/80-63442-4/topic/working-with-machine-learning-models.html)

Tips for developers working on machine learning apps on Android

## 1. Model training and conversion

Machine learning frameworks have specific formats for storing neural network models. The Qualcomm® Neural Processing SDK includes tools for converting pre-trained models to the Deep Learning Container (DLC) format. The Qualcomm® Neural Processing Engine (NPE) runtime then uses the .dlc file to execute the neural network.

Network details such as the input layer name, output layer name, and input shape are required before converting a model. The SDK includes tools for retrieving those details and for getting the network running in the application.

The SDK also includes tools for converting models from the TensorFlow and ONNX frameworks to .dlc format:

- [TensorFlow Model Conversion](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/model_conv_tensorflow.html)
- [ONNX Model Conversion](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/model_conv_onnx.html)

Once the model is converted, the next step is to analyze the network's input shape and number of outputs so that it can be integrated with an Android application.

## 2. Quantizing a model

By default, the conversion tools in the Qualcomm Neural Processing SDK convert non-quantized models into non-quantized .dlc files. All network parameters keep the 32-bit floating-point representation of the original model.

For converted models (.dlc files) that are too large, the SDK includes a [quantization tool](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tools.html#snpe-dlc-quantize), snpe-dlc-quantize, that converts the model to an 8-bit fixed-point representation, reducing its size while aiming to preserve accuracy.
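
To make the idea of an 8-bit fixed-point representation concrete, here is a minimal sketch of affine quantization in plain Java. It is an illustration under stated assumptions, not the algorithm that snpe-dlc-quantize implements: the class name, the method names, and the `min`/`max` range are all hypothetical, and the real tool determines its own encoding ranges.

```java
/** Illustrative 8-bit affine quantization; NOT the SDK's actual implementation. */
final class QuantizationSketch {

    /** Maps a float in the assumed range [min, max] to an integer code in [0, 255]. */
    static int quantize(float value, float min, float max) {
        float scale = (max - min) / 255.0f;           // width of one quantization step
        int code = Math.round((value - min) / scale);
        return Math.max(0, Math.min(255, code));      // clamp values outside the range
    }

    /** Recovers an approximate float from an 8-bit code. */
    static float dequantize(int code, float min, float max) {
        float scale = (max - min) / 255.0f;
        return min + code * scale;
    }

    public static void main(String[] args) {
        float min = -1.0f;   // hypothetical parameter range
        float max = 1.0f;
        float weight = 0.3f;
        int code = quantize(weight, min, max);
        float restored = dequantize(code, min, max);
        System.out.println("code=" + code + ", restored=" + restored);
    }
}
```

Each parameter is stored in one byte instead of four, and the round trip introduces an error of at most half a quantization step for values inside the range.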

For information on when to use a quantized model, see [Quantized vs Non-Quantized Models](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/quantized_models.html) in the
                SDK documentation.

## 3. Setting up the runtime environment

It is necessary to set up the runtime hardware (core) on which the converted model
                will run.

The Snapdragon SoC consists of the CPU, the Qualcomm® Adreno™ GPU and the Qualcomm®
                Hexagon™ DSP. The variety of cores allows for faster processing and, therefore,
                faster prediction. Adreno GPU and Hexagon DSP are designed to optimize inference on
                the device, but depending on the neural network, [some layers](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/network_layers.html) may not support processing in
                those runtime environments.

By examining the layers and design of the neural network, the Neural Processing API
                    [determines which runtime environments (CPU, GPU,
                    DSP) are supported](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/network_layers.html).
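
The check described above can be pictured as a set intersection: a runtime is usable for the whole network only if every layer is supported on it. The sketch below is a plain-Java illustration of that idea and does not use the Neural Processing API. The `RuntimeTarget` enum, the layer names, and the support table are hypothetical placeholders, not real SDK support data.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: intersects per-layer runtime support (hypothetical data). */
final class RuntimeSupportSketch {

    enum RuntimeTarget { CPU, GPU, DSP }

    /** Returns the runtimes that support every layer in the network. */
    static Set<RuntimeTarget> supportedRuntimes(List<String> layers,
                                                Map<String, Set<RuntimeTarget>> supportTable) {
        Set<RuntimeTarget> result = EnumSet.allOf(RuntimeTarget.class);
        for (String layer : layers) {
            // A layer missing from the table is treated as unsupported everywhere.
            result.retainAll(supportTable.getOrDefault(layer, EnumSet.noneOf(RuntimeTarget.class)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical support table; real support is listed in the SDK documentation.
        Map<String, Set<RuntimeTarget>> table = new HashMap<>();
        table.put("layerA", EnumSet.allOf(RuntimeTarget.class));
        table.put("layerB", EnumSet.of(RuntimeTarget.CPU, RuntimeTarget.GPU));
        System.out.println(supportedRuntimes(Arrays.asList("layerA", "layerB"), table));
    }
}
```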

## 4. Loading the model

Once the runtime environment is set, the next step is to load the model (converted
                into a .dlc file) into it.

The following code sets up the environment and loads the model through the API:

    final SNPE.NeuralNetworkBuilder builder = new SNPE.NeuralNetworkBuilder(mApplicationContext)
        // Allows selecting a runtime order for the network.
        // In the example below use DSP and fall back, in order, to GPU then CPU
        // depending on whether any of the runtimes are available.
        .setRuntimeOrder(DSP, GPU, CPU)
        // Loads a model from DLC file
        .setModel(new File("<model-path>"));
    // Build the network
    network = builder.build();
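
The fallback behavior described in the comments (try DSP, then GPU, then CPU) amounts to picking the first available entry in a preference list. The following plain-Java sketch shows that selection logic on its own. It does not call the SDK; the `Target` enum and the `pickRuntime` helper are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Illustrative only: first-available selection over a preference order. */
final class RuntimeOrderSketch {

    enum Target { CPU, GPU, DSP }

    /** Returns the first target in the preference order that is available, if any. */
    static Optional<Target> pickRuntime(List<Target> order, Set<Target> available) {
        return order.stream().filter(available::contains).findFirst();
    }

    public static void main(String[] args) {
        List<Target> order = Arrays.asList(Target.DSP, Target.GPU, Target.CPU);
        // Assume, for the sake of the example, a device without a usable DSP.
        Set<Target> available = EnumSet.of(Target.CPU, Target.GPU);
        System.out.println(pickRuntime(order, available));
    }
}
```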

## 5. Processing input frames for real-time prediction

In the following example of a mobile app, the device processes camera frames
                continuously using the Camera2 API:

    private class CameraSession extends android.hardware.camera2.CameraCaptureSession.CaptureCallback {
        @Override
        public void onCaptureCompleted(@NonNull CameraCaptureSession session,
                                       @NonNull CaptureRequest request,
                                       @NonNull TotalCaptureResult result) {
            super.onCaptureCompleted(session, request, result);
            // Get the current frame as a 299x299 bitmap
            Bitmap mBitmap = mTextureView.getBitmap(299, 299);
            // Compress it to JPEG format
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            mBitmap.compress(Bitmap.CompressFormat.JPEG, 50, stream);
            // Convert the compressed image into a byte array and decode it back into a bitmap
            byte[] byteArray = stream.toByteArray();
            Bitmap compressedBitmap = BitmapFactory.decodeByteArray(byteArray, 0, byteArray.length);
        }
    }

In an image-based, deep learning mobile application, getting the camera frames is not
                the only task. The most important task is to convert the frames into a proper input
                shape. For example, the Inception\_v3 model requires an input with the shape of [1,
                299, 299, 3].
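
As a quick check of what that shape implies, the sketch below multiplies out the dimensions. The shape is read here as batch, height, width, channels; the class and method names are hypothetical. For [1, 299, 299, 3] the input holds 268,203 values, or 1,072,812 bytes when each value is a 32-bit float.

```java
/** Illustrative only: element count and buffer size for a tensor shape. */
final class TensorShapeSketch {

    /** Number of elements in a tensor with the given dimensions. */
    static long elementCount(int... shape) {
        long count = 1;
        for (int dim : shape) {
            count *= dim;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] inceptionV3 = {1, 299, 299, 3};        // batch, height, width, channels
        long elements = elementCount(inceptionV3);
        long bytes = elements * Float.BYTES;          // 4 bytes per 32-bit float
        System.out.println(elements + " floats, " + bytes + " bytes");
    }
}
```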

## 6. Image classification using input frames and model object

Before the bitmap is handed off for prediction, it must undergo basic image processing, such as conversion to RGB or grayscale. The processing required depends on the input shape that the model expects. Examples:

- Inception network input image size: 299x299x3 (3 channel image input)
- MobileNet input image size: 224x224x3 (3 channel image input)
- VGG16/19 input image size: 224x224x3 (3 channel image input)
- fer2013 network input: 48x48x1 (1 channel image input)

Next, it is necessary to convert the processed image into a tensor. The prediction API requires a tensor format with type Float.
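
A minimal sketch of that conversion is shown below in plain Java, without Android or SDK classes. It assumes the pixels are packed ARGB integers, which is the layout that `Bitmap.getPixels` fills, and it scales each channel to the range [0, 1]. That scaling is an assumption for illustration: the normalization a given model expects (for example, mean subtraction) may differ. The class and method names are hypothetical.

```java
/** Illustrative only: packed ARGB pixels to float arrays for 3-channel and 1-channel inputs. */
final class PixelTensorSketch {

    /** Converts packed ARGB pixels to floats in height-width-channel order (R, G, B). */
    static float[] toRgbTensor(int[] argbPixels) {
        float[] tensor = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            tensor[3 * i]     = ((p >> 16) & 0xFF) / 255.0f;  // red
            tensor[3 * i + 1] = ((p >> 8) & 0xFF) / 255.0f;   // green
            tensor[3 * i + 2] = (p & 0xFF) / 255.0f;          // blue
        }
        return tensor;
    }

    /** Converts packed ARGB pixels to one grayscale channel using Rec. 601 luma weights. */
    static float[] toGrayTensor(int[] argbPixels) {
        float[] tensor = new float[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            tensor[i] = (0.299f * r + 0.587f * g + 0.114f * b) / 255.0f;
        }
        return tensor;
    }
}
```

For a 299x299 bitmap, `toRgbTensor` yields 299 × 299 × 3 floats, matching a 3-channel input such as Inception's; `toGrayTensor` yields one float per pixel, matching a 1-channel input such as the 48x48x1 case.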

After all the processing is complete, the tensor goes to the neural network API for prediction, as shown in the following code:

    // mNeuralNetwork is an instance of the NeuralNetwork class.
    final Map<String, FloatTensor> outputs = mNeuralNetwork.execute(inputs);

where mNeuralNetwork is an instance of the NeuralNetwork class, and inputs is a map from input name to tensor.
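
For image classification, the output is typically a vector of per-class scores, and the predicted class is the index of the highest score. The sketch below shows that last step in plain Java, assuming the values of the output tensor have already been copied into a `float[]`. The class and method names are hypothetical and not part of the SDK.

```java
/** Illustrative only: picks the highest-scoring class from an output score array. */
final class OutputSketch {

    /** Returns the index of the largest score, or -1 for an empty array. */
    static int argMax(float[] scores) {
        int best = -1;
        for (int i = 0; i < scores.length; i++) {
            if (best < 0 || scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        float[] scores = {0.1f, 0.7f, 0.2f};   // made-up scores for three classes
        System.out.println("predicted class index: " + argMax(scores));
    }
}
```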

## References

In the snpe-sdk/examples/android/ directory of the Qualcomm Neural Processing SDK,
                refer to the demo Android application for image classification based on static
                images.

- [SDK Setup](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/network_layers.html)
- [Supported Layers](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/network_layers.html)
- [Model Conversion and Quantization](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/usergroup2.html)
- [Image Formatting](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/image_input.html)
- [Tutorials and Example](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/usergroup5.html)
- [Tools Usage](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tools.html)
- [Docs for Java](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/android_tutorial.html)


Last Published: Jun 24, 2024
