# Preparation on Linux

The QNN Gen AI Transformer uses the [qnn-genai-transformer-composer](https://docs.qualcomm.com/doc/80-63442-10/topic/qnn-genai-transformer-composer.html#qnn-genai-transformer-composer) utility to
prepare models for inference.

## Preparation

Open a command shell on the Linux host and run:

    # Make sure the environment is set up as per the instructions, or cd into the bin folder on the Linux host
    cd ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/
    ./qnn-genai-transformer-composer --quantize Z4 \
                                     --outfile <output filename with complete path>.bin \
                                     --model <path-to-downloaded-LLama-model-directory>
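
Because the composer takes a model directory and a complete output path, it can help to check both before starting a long-running conversion. The following is a minimal sketch, not part of the SDK: `check_composer_inputs` is a hypothetical helper name, and the rules it applies (absolute path, `.bin` suffix, existing directories) are assumptions drawn from the placeholders in the command above.

```shell
# check_composer_inputs MODEL_DIR OUT_FILE
# Hypothetical helper (not part of the SDK): returns 0 when both arguments
# look usable for the composer command, 1 otherwise.
check_composer_inputs() {
    model_dir=$1
    out_file=$2
    if [ ! -d "$model_dir" ]; then
        echo "model directory not found: $model_dir" >&2
        return 1
    fi
    case $out_file in
        /*.bin) ;;  # complete (absolute) path ending in .bin
        *) echo "expected an absolute path ending in .bin: $out_file" >&2
           return 1 ;;
    esac
    if [ ! -d "$(dirname "$out_file")" ]; then
        echo "output directory not found: $(dirname "$out_file")" >&2
        return 1
    fi
}

# Example use before running the composer (paths are placeholders):
#   check_composer_inputs "$HOME/models/llama" "$HOME/out/llama.bin" &&
#       ./qnn-genai-transformer-composer --quantize Z4 \
#           --outfile "$HOME/out/llama.bin" --model "$HOME/models/llama"
```

The helper only inspects the file system. It does not validate the contents of the model directory or the quantization option.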

## Dialog JSON Configuration

See [Genie Dialog JSON configuration string](https://docs.qualcomm.com/doc/80-63442-10/topic/dialog_json.html#genie-dialog-json-config-string) for details on the fields and what
they mean. An example config can be found at
`${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-genaitransformer.json`. Update the tokenizer path and
model bin fields in this config to match the files produced by your preparation steps.
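
After editing the config, a quick sanity check can confirm that it mentions the paths you intend to use. The sketch below is an illustration only: `config_references` is a hypothetical helper, and it deliberately searches for the paths as literal strings rather than assuming specific JSON field names, so it does not verify that a path sits in the correct field.

```shell
# config_references CONFIG PATH...
# Hypothetical helper: succeeds only if CONFIG exists and contains every
# PATH argument as a literal string.
config_references() {
    config=$1
    shift
    if [ ! -f "$config" ]; then
        echo "config not found: $config" >&2
        return 1
    fi
    status=0
    for wanted in "$@"; do
        # -F: match a fixed string, -q: no output, --: end of options
        if ! grep -Fq -- "$wanted" "$config"; then
            echo "not referenced in $config: $wanted" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use (paths are placeholders):
#   config_references llama2-7b-genaitransformer.json \
#       /data/llama/tokenizer.json /data/llama/llama2-7b.bin
```

A literal string search is used here to avoid depending on a JSON parser being installed on the host. Refer to the linked configuration reference for the authoritative field layout.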

## Inference

Choose your target platform for inference:

- [Linux](https://docs.qualcomm.com/doc/80-63442-10/topic/no_lora_linux_inference.html)
- [Android](https://docs.qualcomm.com/doc/80-63442-10/topic/no_lora_android_inference.html)

Last Published: Jun 04, 2026
