# QNN GPU - Llama 2 7B - Android

The following tutorial demonstrates running a basic Llama 2 7B dialog on Android with the QNN GPU backend using
[genie-t2t-run](https://docs.qualcomm.com/doc/80-63442-10/topic/genie-t2t-run.html#genie-t2t-run).

Note: This section assumes that the QNN GPU context binaries have already been obtained via the QAIRT SDK workflow.

## Dialog JSON configuration

See [Genie Dialog JSON configuration string](https://docs.qualcomm.com/doc/80-63442-10/topic/dialog_json.html#genie-dialog-json-config-string) for details on the fields and what
they mean. An example JSON config for this tutorial can be found at
`${QNN_SDK_ROOT}/examples/Genie/configs/llama2-7b/llama2-7b-gpu.json`. Update the tokenizer path and
context binary fields in this file to match the files produced by your own preparation steps.
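
To see which file paths a config currently references before editing it, a quick text search can help. The following is a minimal sketch and not part of the SDK: `list_config_files` is a hypothetical helper that prints every quoted string ending in `.json` or `.bin`. It is a heuristic text match rather than a JSON parser, and it assumes nothing about the config's field names, so it may also pick up unrelated strings.

```shell
#!/bin/sh
# Hypothetical helper (not an SDK tool): print every quoted string in a
# JSON config that ends in .json or .bin, one per line.
# This is a plain text match, so review the output by hand.
list_config_files() {
    grep -oE '"[^"]*\.(json|bin)"' "$1" | tr -d '"'
}

# Example usage (adjust the path to your copy of the config):
# list_config_files llama2-7b-gpu.json
```

Each path printed should match a file you intend to push to the device.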

## Inference

To run on the QNN GPU backend, open a command shell on a host machine connected to the Android device and run the following.

    adb shell mkdir -p /data/local/tmp/
    adb push ${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run /data/local/tmp/
    adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so /data/local/tmp/
    adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGpu.so /data/local/tmp/
    adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so /data/local/tmp/
    adb push <path to llama2-7b-gpu.json> /data/local/tmp/
    adb push <path to tokenizer.json> /data/local/tmp/
    adb push <path to model bin file> /data/local/tmp/
    
    # open adb shell
    adb shell
    
    export LD_LIBRARY_PATH=/data/local/tmp/
    export PATH=/data/local/tmp/:$PATH
    
    cd /data/local/tmp/
    ./genie-t2t-run -c <path to llama2-7b-gpu.json> -p "Tell me about Qualcomm"
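
Because the push commands above take several host paths, it can be convenient to confirm that every file exists before starting. The sketch below is an assumption-laden convenience and not part of the SDK: `check_files` is a hypothetical helper that reports each missing path and returns a non-zero status if any file is absent.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not an SDK tool): print one
# "missing: <path>" line per absent file and return non-zero if any
# of the given paths is not a regular file.
check_files() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

# Example usage with the SDK files pushed in this tutorial:
# check_files \
#     "${QNN_SDK_ROOT}/bin/aarch64-android/genie-t2t-run" \
#     "${QNN_SDK_ROOT}/lib/aarch64-android/libGenie.so" \
#     "${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGpu.so" \
#     "${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so" \
#   && echo "all files present"
```

The config, tokenizer, and model binary paths can be appended to the same call once you know where your preparation steps placed them.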

Last Published: Jun 04, 2026
