# Qairt Quantizer

The [qairt-converter](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#qairt-converter) tool converts a non-quantized model into either a non-quantized or a quantized
DLC file, depending on the overrides provided during the conversion step.
The [qairt-quantizer](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#qairt-quantizer) tool quantizes the model to one of the supported fixed-point formats.
It can either quantize the tensors that are still missing encodings after the `qairt-converter` step (fill in the gaps), or calibrate the provided encodings using a list of images.

For example, the following command will convert an Inception v3 DLC file into a quantized Inception v3 DLC file.

    $ qairt-quantizer --input_dlc inception_v3.dlc \
                      --input_list image_file_list.txt \
                      --output_dlc inception_v3_quantized.dlc

To calculate the ranges for the quantization parameters properly, a representative set of input data must be passed to
[qairt-quantizer](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#qairt-quantizer) through the `--input_list` parameter.
The `--input_list` file specifies the paths to the raw image files used for calibration during quantization.
For the supported input formats, refer to the `--input_list` argument of [qnn-net-run](https://docs.qualcomm.com/doc/80-63442-10/topic/SNPE_general_tools.html#qnn-net-run).
To calculate output activation encoding information for all layers, **do not** include the line
that specifies desired outputs.
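
As an illustration of preparing the calibration list, the sketch below collects raw files from a directory and writes them to an input list file. It assumes a single-input model where each line holds one file path; the helper name `write_input_list` and the `.raw` suffix are hypothetical choices, not part of the SDK. Paths containing spaces are not handled, since spaces separate entries in the list format.

```python
from pathlib import Path


def write_input_list(raw_dir, list_path, suffix=".raw"):
    """Write one raw-file path per line and return the lines written.

    Assumption: the model has a single input, so one path per line is enough.
    """
    raw_files = sorted(
        p for p in Path(raw_dir).iterdir() if p.is_file() and p.suffix == suffix
    )
    if not raw_files:
        raise FileNotFoundError(f"no {suffix} files found in {raw_dir}")
    # Absolute paths keep the list valid regardless of the working directory.
    lines = [str(p.resolve()) for p in raw_files]
    # No '#' or '%' output-specification line is written, so activation
    # encodings can be calculated for all layers.
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

The resulting file can then be passed to `qairt-quantizer` through `--input_list`.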

The tool requires the batch dimension of the input DLC file to be set to 1 during model conversion. The batch dimension
can be changed to a different value for inference by [resizing](https://docs.qualcomm.com/doc/80-63442-10/topic/network_resize.html) the network during initialization.

## Additional details

- `qairt-quantizer` is largely similar to `snpe-dlc-quant`, with the following differences:

    - `qairt-quantizer` can generate encodings from the calibration dataset provided via the `--input_list` flag
      in the following scenarios:

        - Fill in the gaps: some tensors are still missing encodings after the `qairt-converter` step, i.e. the tensors for which
          no encoding is specified in `--quantization_overrides` or in the source model encodings (QAT).
        - If encodings are not specified for all the tensors via overrides or QAT encodings.
    - HTP is the default backend in the QAIRT quantizer, which may enable certain HTP-specific behaviors that
      are not triggered by default in legacy quantizers, where the backend is left empty. This difference can affect
      how some backend-dependent features behave during conversion and quantization.

        - For example, the `IntBiasUpdates` optimization is applied to the FullyConnected op during quantization only if
          the backend is set to `HTP` in SNPE, whereas it is always applied in QAIRT.
    - External overrides and source model encodings (QAT) are now applied during the `qairt-converter` stage by default,
      so the quantizer options for ignoring them, `--ignore_encodings` (legacy) and `--ignore_quantization_overrides`, are now no-ops.
    - An alternative is to use the `--export_format=DLC_STRIP_QUANT` flag of `qairt-converter`. When it is specified, the converter ignores or removes all the encodings in
      the source model and outputs a float model, which can then be recalibrated using `qairt-quantizer` with the `--input_list` flag.
    - Another alternative is to pass the `qairt-quantizer` options `--input_list` and `--ignore_quantization_overrides` in combination,
      which signals the quantizer to ignore all the encodings applied during conversion and to generate encodings from the calibration dataset provided via `--input_list`.
    - The float fallback feature, controlled via the command-line option `--enable_float_fallback` (`--float_fallback` in legacy quantizers),
      is also a no-op for `qairt-quantizer` and can be skipped. Float fallback was added to produce a fully quantized or mixed-precision graph by applying encoding overrides
      or source model encodings, propagating encodings across data-invariant ops, and falling back the tensors that are still missing encodings to the float datatype.
      To simplify the steps, this is now handled by `qairt-converter`: it applies the overrides and encodings, and the tensors that are missing
      encodings fall back to the default float datatype.
    - To summarize, the `qairt-quantizer` command-line arguments `--ignore_quantization_overrides` and `--enable_float_fallback` are now no-ops,
      because overrides, source model encodings, and float fallback are applied by default during the `qairt-converter` step itself.

**Note:** `--enable_float_fallback` and `--input_list` are mutually exclusive options. Exactly one of them
must be passed to the quantizer.

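
The constraint on these two options can be captured in a small wrapper that assembles the command line before it is run. This is a minimal sketch: `build_quantizer_cmd` is a hypothetical helper, it only uses flags named in this section, and it assumes `--enable_float_fallback` is a bare switch that takes no value.

```python
import shlex


def build_quantizer_cmd(input_dlc, output_dlc, input_list=None,
                        enable_float_fallback=False):
    """Return the qairt-quantizer argument list, enforcing the rule that
    exactly one of --input_list / --enable_float_fallback is given."""
    has_list = input_list is not None
    if has_list == bool(enable_float_fallback):
        raise ValueError(
            "pass exactly one of input_list or enable_float_fallback"
        )
    cmd = ["qairt-quantizer", "--input_dlc", input_dlc,
           "--output_dlc", output_dlc]
    if has_list:
        cmd += ["--input_list", input_list]
    else:
        # Assumption: the option is a switch without an argument.
        cmd.append("--enable_float_fallback")
    return cmd


if __name__ == "__main__":
    # Prints the command as a shell-quoted string; nothing is executed.
    print(shlex.join(build_quantizer_cmd(
        "inception_v3.dlc", "inception_v3_quantized.dlc",
        input_list="image_file_list.txt")))
```

The returned list can be handed to `subprocess.run` in an environment where the SDK tools are on the `PATH`.
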
- Outputs can be specified for `qairt-quantizer` by modifying the `--input_list` file in the following ways:

        #<output_layer_name>[<space><output_layer_name>]
        %<output_tensor_name>[<space><output_tensor_name>]
        <input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]

    **Note:** Output tensors and layers can be specified individually, but when both are specified, they must
    appear in the order shown.
- `qairt-quantizer` also supports quantization using AIMET, in place of the default quantizer,
  when the `--use_aimet_quantizer` command-line option is provided. To use the AIMET quantizer,
  first run the setup script to create the AIMET-specific environment by executing the following command:

        $ source {SNPE_ROOT}/bin/aimet_env_setup.sh --env_path <path where AIMET venv needs to be created> \
                                                    --aimet_sdk_tar <AIMET Torch SDK tarball>
- The advanced AIMET algorithms AdaRound and AMP are also supported in `qairt-quantizer`. Provide a YAML
  config file through the command-line option `--config` and specify the algorithm, `adaround` or `amp`, through `--apply_algorithms`,
  along with the `--use_aimet_quantizer` flag.
- The template of the YAML file for AMP is shown below:

        aimet_quantizer:
            datasets:
                <dataset_name>:
                    dataloader_callback: '<path/to/labeled/dataloader/callback/function>'
                    dataloader_kwargs: {arg1: val, arg2: val2}

            amp:
                dataset: <dataset_name>
                candidates: [[[8, 'int'], [16, 'int']], [[16, 'float'], [16, 'float']]]
                allowed_accuracy_drop: 0.02
                eval_callback_for_phase2: '<path/to/evaluator/callback/function>'

> - *dataloader\_callback*: path to a callback function that returns a labeled dataloader of type `torch.utils.data.DataLoader`. The data should be in the source network's input format.
> - *dataloader\_kwargs*: optional dictionary of keyword arguments for the callback function above.
> - *dataset*: name of a dataset defined under *datasets*.
> - *candidates*: list of candidates, each a pair of `[bitwidth, datatype]` entries, covering all the bitwidth values to consider for activations and parameters.
> - *allowed\_accuracy\_drop*: maximum allowed drop in accuracy from the FP32 baseline. The Pareto front curve is plotted only up to the point where the allowed accuracy drop is met.
> - *eval\_callback\_for\_phase2*: path to the evaluator function, which takes the predicted-value batch as its first argument and the ground-truth batch as its second argument, and returns the calculated metric as a float.
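
To make the evaluator contract concrete, here is a minimal sketch of a function with the shape described for *eval\_callback\_for\_phase2*: predictions first, ground truth second, a float back. It assumes a classification model whose predictions are per-class scores and whose labels are class indices; the name `top1_accuracy` is hypothetical, and how the quantizer aggregates the returned values across batches is not covered here. Inputs are converted with `tolist()` when available, so tensors, arrays, and plain lists are all accepted.

```python
def _to_list(batch):
    """Convert a tensor/array-like (anything with .tolist()) to nested lists."""
    return batch.tolist() if hasattr(batch, "tolist") else list(batch)


def top1_accuracy(predicted_batch, ground_truth_batch):
    """Return the fraction of samples whose highest-scoring class matches
    the ground-truth label.

    Assumptions: predicted_batch has shape [batch, num_classes] and
    ground_truth_batch holds one integer class index per sample.
    """
    scores = _to_list(predicted_batch)
    labels = _to_list(ground_truth_batch)
    if len(scores) != len(labels):
        raise ValueError("prediction and ground-truth batch sizes differ")
    if not labels:
        return 0.0
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row.
        predicted_class = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted_class == int(label))
    return correct / len(labels)
```

A metric returned this way is on the same 0 to 1 scale as the `allowed_accuracy_drop: 0.02` value in the template, which reads naturally as a two-point drop in accuracy under that assumption.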

- The template of the YAML file for AdaRound is shown below:

        aimet_quantizer:
            datasets:
                <dataset_name>:
                    dataloader_callback: '<path/to/unlabeled/dataloader/callback/function>'
                    dataloader_kwargs: {arg1: val, arg2: val2}

            adaround:
                dataset: <dataset_name>
                num_batches: 1

> - *dataloader\_callback*: path to a callback function that returns an unlabeled dataloader of type `torch.utils.data.DataLoader`. The data should be in the source network's input format.
> - *dataloader\_kwargs*: optional dictionary of keyword arguments for the callback function above.
> - *dataset*: name of a dataset defined under *datasets*.
> - *num\_batches*: number of batches to use for AdaRound iterations.
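
The flags involved in selecting an AIMET algorithm can be summarized in a short helper. This is a sketch under assumptions: `build_aimet_flags` is a hypothetical name, the algorithm is assumed to be passed as the value of `--apply_algorithms`, and AMP is treated as always requiring a config file, which is a reading of this section rather than a documented rule.

```python
def build_aimet_flags(algorithm, config=None):
    """Return the extra qairt-quantizer flags for an AIMET algorithm.

    algorithm: "adaround" or "amp", as named in this section.
    config:    path to the YAML config file, or None.
    """
    if algorithm not in ("adaround", "amp"):
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    # Assumption: only AdaRound has a config-free default mode.
    if algorithm == "amp" and config is None:
        raise ValueError("amp needs a YAML config passed via --config")
    flags = ["--use_aimet_quantizer", "--apply_algorithms", algorithm]
    if config is not None:
        flags += ["--config", config]
    return flags
```

These flags would be appended to a regular `qairt-quantizer` invocation that already carries `--input_dlc`, `--output_dlc`, and `--input_list`.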

- AdaRound can also run in default mode, without a config file, by passing `adaround`
  to the command-line option `--apply_algorithms` along with the `--use_aimet_quantizer` flag. This flow uses the data provided
  through the `--input_list` option to make rounding decisions.

- **Note:**
    1. The AIMET Torch tarball name should follow the convention
       `aimetpro-release-<VERSION (optionally with build ID)>.torch-<cpu/gpu>-.*.tar.gz`,
       for example `aimetpro-release-x.xx.x.torch-xxx-release.tar.gz`.
    2. Once the setup script has run, ensure that the `AIMET_ENV_PYTHON` environment variable is set to
       `<AIMET virtual environment path>/bin/python`.
    3. The minimum supported AIMET version is **AIMET-1.33.0**.
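
The three points in the note can be checked before running the setup script. The sketch below is illustrative only: the regular expression is transcribed from the naming convention above, the assumption that `<VERSION>` starts with three dot-separated numbers is mine, and the helper names and example file names are hypothetical.

```python
import os
import re

MIN_AIMET_VERSION = (1, 33, 0)

# Assumption: <VERSION> begins with MAJOR.MINOR.PATCH, optionally followed
# by a build ID, and the variant is literally "cpu" or "gpu".
_TARBALL_RE = re.compile(
    r"^aimetpro-release-(?P<version>\d+\.\d+\.\d+)(?P<build>.*?)"
    r"\.torch-(?P<variant>cpu|gpu)-.*\.tar\.gz$"
)


def check_aimet_tarball(path):
    """Return (version_tuple, variant) if the tarball name follows the
    convention and meets the minimum version; raise ValueError otherwise."""
    name = os.path.basename(path)
    match = _TARBALL_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected AIMET tarball name: {name}")
    version = tuple(int(part) for part in match.group("version").split("."))
    if version < MIN_AIMET_VERSION:
        raise ValueError(f"AIMET {match.group('version')} is older than 1.33.0")
    return version, match.group("variant")


def aimet_python_is_set(env_path, environ=None):
    """Check that AIMET_ENV_PYTHON points at <env_path>/bin/python."""
    environ = os.environ if environ is None else environ
    expected = os.path.join(env_path, "bin", "python")
    return environ.get("AIMET_ENV_PYTHON") == expected
```

For example, `check_aimet_tarball("aimetpro-release-1.33.0.torch-gpu-release.tar.gz")` returns the parsed version and variant, while a name with an older version or a different prefix raises `ValueError`.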

Last Published: Jul 02, 2026
