# Genie Node JSON configuration string

The following sections describe the format of the JSON configuration string that is supplied
to [GenieNodeConfig\_createFromJson](https://docs.qualcomm.com/doc/80-63442-10/topic/function_GenieNode_8h_1a2c64ec143da2dbbd2241854833e86c41.html#exhale-function-genienode-8h-1a2c64ec143da2dbbd2241854833e86c41). The same JSON
configuration can also be supplied to the genie-app tool.

Note

Please refer to the example configs contained in the SDK at ${SDK\_ROOT}/examples/Genie/configs/pipeline.

## General configuration schema

The following is the schema of the JSON configuration that is provided to
[GenieNodeConfig\_createFromJson](https://docs.qualcomm.com/doc/80-63442-10/topic/function_GenieNode_8h_1a2c64ec143da2dbbd2241854833e86c41.html#exhale-function-genienode-8h-1a2c64ec143da2dbbd2241854833e86c41). Note that the JSON
configuration follows the dialog schema for text generation and the embedding schema for image and text encoding.

Text Generator schema:

```json
{
  "text-generator" : {
    "type": "object",
    "properties": {
      "version" : {"type": "integer"},
      "type" : {"type": "string", "enum":["basic"]},
      "context" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "size": {"type": "integer"},
          "n-vocab": {"type": "integer"},
          "bos-token": {"type": "integer"},
          "eos-token": {"type": "integer"}
        }
      },
      "sampler" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "seed" : {"type": "integer"},
          "temp" : {"type": "float"},
          "top-k" : {"type": "integer"},
          "top-p" : {"type": "float"},
          "greedy" : {"type": "boolean"}
        }
      },
      "tokenizer" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "path" : {"type": "string"}
        }
      },
      "engine" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "n-threads" : {"type": "integer"},
          "backend" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum" : ["QnnHtp", "QnnGenAiTransformer"]},
              "QnnHtp" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "spill-fill-bufsize" : {"type": "integer"},
                  "data-alignment-size" : {"type": "integer"},
                  "use-mmap" : {"type": "boolean"},
                  "mmap-budget" : {"type": "integer"},
                  "poll" : {"type": "boolean"},
                  "pos-id-dim" : {"type": "integer"},
                  "cpu-mask" : {"type": "string"},
                  "kv-dim" : {"type": "integer"},
                  "allow-async-init" : {"type": "boolean"},
                  "rope-theta" : {"type": "double"}
                }
              },
              "QnnGenAiTransformer" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "n-logits" : {"type": "integer"},
                  "n-layer" : {"type": "integer"},
                  "n-embd" : {"type": "integer"},
                  "n-heads" : {"type": "integer"},
                  "kv-quantization" : {"type": "boolean"}
                }
              },
              "extensions" : {"type": "string"}
            }
          },
          "model" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum":["binary", "library"]},
              "binary" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "ctx-bins" : {"type": "array", "items": {"type": "string"}}
                }
              },
              "library" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "model-bin" : {"type": "string"}
                }
              }
            }
          }
        }
      }
    }
  }
}
```

| Option | Applicability | Description |
| --- | --- | --- |
| text-generator::version | all backends | Version of the node object that is supported by the APIs. (1) |
| text-generator::type | all backends | Type of node supported by the APIs. (basic) |
| text-generator::stop-sequence | all backends | Stop the query when one of a set of sequences is detected in the response.<br>Passed in as an array of strings. |
| text-generator::max-num-tokens | all backends | Stop the query when the maximum number of tokens has been generated in the response. |
| context::version | all backends | Version of context object that is supported by APIs. (1) |
| context::size | all backends | Context length. Maximum number of tokens to store. |
| context::n-vocab | all backends | Model vocabulary size. |
| context::bos-token | all backends | Beginning of sentence token. |
| context::eos-token | all backends | End of sentence token.<br>Argument passed in as an integer or array of integers |
| context::eot-token | all backends | End of turn token. |
| sampler::version | all backends | Version of sampler object that is supported by APIs. (1) |
| sampler::type | all backends | Type of sampler to use. Supported options: basic, custom |
| sampler::callback-name | all backends | Name of the callback function to use for Sampling. |
| sampler::seed | all backends | Sampling random number generation seed. |
| sampler::temp | all backends | Sampling temperature. |
| sampler::top-k | all backends | Top-k number of samples. |
| sampler::top-p | all backends | Top-p sampling threshold. |
| sampler::greedy | all backends | Selects between random and greedy sampling.<br>A value of true specifies greedy sampling. |
| tokenizer::version | all backends | Version of tokenizer object that is supported by APIs. (1) |
| tokenizer::path | all backends | Path to tokenizer file. |
| engine::version | all backends | Version of engine object that is supported by APIs. (1) |
| engine::n-threads | all backends | Number of threads to use for KV-cache updates. |
| debug::path | all backends | File path to dump debug information. |
| debug::dump-tensors | all backends | Raw data dump of input and output tensors |
| debug::dump-specs | all backends | Dumps input and output tensor specifications such as bitwidth, scale,<br>offset, and dimensions. |
| debug::dump-outputs | all backends | Raw data dump of output tensor from engine |
| backend::version | all backends | Version of backend object that is supported by APIs. (1) |
| backend::type | all backends | Type of engine: “QnnHtp” for the QNN HTP backend,<br>“QnnGenAiTransformer” for the QNN GenAiTransformer backend, and<br>“QnnGpu” for the QNN GPU backend. |
| backend::extensions | QNN HTP | Path to backend extensions configuration file. |
| QnnHtp::version | QNN HTP | Version of QnnHtp object that is supported by APIs. (1) |
| QnnHtp::spill-fill-bufsize | QNN HTP | Buffer size to pre-allocate for the QNN HTP spill fill.<br>This field depends upon the HTP VTCM memory size. It should<br>be set greater than the spill-fill required by each context<br>binary in the model. Consult the QNN HTP backend<br>documentation in the QAIRT SDK for more details. |
| QnnHtp::data-alignment-size | QNN HTP | Data will be aligned by rounding up the size to the nearest<br>multiple of alignment number. Typically should be zero. |
| QnnHtp::use-mmap | QNN HTP | Memory map the context binary files. Typically should be<br>turned on. |
| QnnHtp::mmap-budget | QNN HTP | Memory map the context binary files in chunks of the given<br>size. Typically should be 25MB. |
| QnnHtp::poll | QNN HTP | Specify whether to busy-wait on threads. |
| QnnHtp::pos-id-dim | QNN HTP | Dimension of positional embeddings, usually (kv-dim) / 2. |
| QnnHtp::cpu-mask | QNN HTP | CPU affinity mask. |
| QnnHtp::kv-dim | QNN HTP | Dimension of the KV-cache embedding. |
| QnnHtp::allow-async-init | QNN HTP | Allow context binaries to be initialized asynchronously<br>if the backend supports it. |
| QnnHtp::rope-theta | QNN HTP | Used to calculate rotary positional encodings. |
| QnnHtp::enable-graph-switching | QNN HTP | Enables graph switching for graphs within each context<br>binary. |
| QnnGenAiTransformer::version | QNN GenAiTransformer | Version of QnnGenAiTransformer object that is supported<br>by APIs. (1) |
| QnnGenAiTransformer::n-logits | QNN GenAiTransformer | Number of logit vectors that the result will contain for sampling. |
| QnnGenAiTransformer::n-layer | QNN GenAiTransformer | Number of decoder layers in the model. |
| QnnGenAiTransformer::n-embd | QNN GenAiTransformer | Size of the embedding vector for each token. |
| QnnGenAiTransformer::n-heads | QNN GenAiTransformer | Number of heads in the model. |
| QnnGenAiTransformer::kv-quantization | QNN GenAiTransformer | Quantize KV Cache to Q8\_0\_32. |
| model::version | all backends | Version of model object that is supported by APIs. (1) |
| model::type | all backends | Type of model object: “binary” for QNN HTP and “library”<br>for QNN GenAiTransformer. |
| model::positional-encoding | all backends | Captures positional encoding parameters for a model. |
| positional-encoding::type | all backends | Type of positional encoding. Supported types are rope,<br>alibi and absolute |
| positional-encoding::rope-dim | all backends | Dimension of Rope positional embeddings, usually<br>(kv-dim) / 2. |
| positional-encoding::rope-theta | all backends | Used to calculate rotary position encodings for type rope |
| binary::version | QNN HTP | Version of binary object that is supported by APIs. (1) |
| binary::ctx-bins | QNN HTP | List of serialized model files. |
| library::version | QNN GenAiTransformer | Version of library object that is supported by APIs. (1) |
| library::model-bin | QNN GenAiTransformer | Path to model.bin file. |
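
To make the text generator schema concrete, the following is a minimal sketch of a configuration instance for the QNN HTP backend, built only from the fields listed in the schema above. Every value is a placeholder chosen for illustration and is not taken from a real model: the token IDs, context size, sampler settings, file paths, and in particular `spill-fill-bufsize`, which must be sized for the actual context binaries. `mmap-budget` is left out because the table does not state its unit. The example configs shipped in the SDK remain the authoritative reference.

```json
{
  "text-generator": {
    "version": 1,
    "type": "basic",
    "context": {
      "version": 1,
      "size": 4096,
      "n-vocab": 32000,
      "bos-token": 1,
      "eos-token": 2
    },
    "sampler": {
      "version": 1,
      "seed": 42,
      "temp": 0.8,
      "top-k": 40,
      "top-p": 0.95,
      "greedy": false
    },
    "tokenizer": {
      "version": 1,
      "path": "path/to/tokenizer.json"
    },
    "engine": {
      "version": 1,
      "n-threads": 3,
      "backend": {
        "version": 1,
        "type": "QnnHtp",
        "QnnHtp": {
          "version": 1,
          "spill-fill-bufsize": 320000000,
          "data-alignment-size": 0,
          "use-mmap": true,
          "poll": true,
          "pos-id-dim": 64,
          "cpu-mask": "0xe0",
          "kv-dim": 128,
          "allow-async-init": false,
          "rope-theta": 10000
        },
        "extensions": "path/to/htp_backend_ext_config.json"
      },
      "model": {
        "version": 1,
        "type": "binary",
        "binary": {
          "version": 1,
          "ctx-bins": [
            "path/to/model_part_1_of_2.bin",
            "path/to/model_part_2_of_2.bin"
          ]
        }
      }
    }
  }
}
```

The placeholder `pos-id-dim` of 64 follows the table's rule of thumb of (kv-dim) / 2 for a `kv-dim` of 128.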

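For the QNN GenAiTransformer backend, the `engine` object of the text generator would pair the `QnnGenAiTransformer` backend with a `library` model, as the table above indicates. The fragment below is a sketch of that `engine` value only, wrapped in braces so that it is valid JSON on its own. The layer count, embedding size, head count, thread count, and file path are hypothetical placeholders and must match the actual model.

```json
{
  "engine": {
    "version": 1,
    "n-threads": 4,
    "backend": {
      "version": 1,
      "type": "QnnGenAiTransformer",
      "QnnGenAiTransformer": {
        "version": 1,
        "n-logits": 1,
        "n-layer": 32,
        "n-embd": 4096,
        "n-heads": 32,
        "kv-quantization": false
      }
    },
    "model": {
      "version": 1,
      "type": "library",
      "library": {
        "version": 1,
        "model-bin": "path/to/model.bin"
      }
    }
  }
}
```
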
Text Encoder schema:

```json
{
  "text-encoder" : {
    "type": "object",
    "properties": {
      "version" : {"type": "integer"},
      "context" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "ctx-size": {"type": "integer"},
          "n-vocab": {"type": "integer"},
          "embed-size": {"type": "integer"},
          "pad-token": {"type": "integer"}
        }
      },
      "prompt" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "prompt-template" : {"type": "array", "items": {"type": "string"}}
        }
      },
      "tokenizer" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "path" : {"type": "string"}
        }
      },
      "truncate-input" : {"type" : "boolean"},
      "engine" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "n-threads" : {"type": "integer"},
          "backend" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum" : ["QnnHtp", "QnnGenAiTransformer"]},
              "QnnHtp" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "spill-fill-bufsize" : {"type": "integer"},
                  "data-alignment-size" : {"type": "integer"},
                  "use-mmap" : {"type": "boolean"},
                  "allow-async-init" : {"type": "boolean"},
                  "pooled-output" : {"type": "boolean"},
                  "disable-kv-cache" : {"type": "boolean"}
                }
              },
              "QnnGenAiTransformer" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "n-layer" : {"type": "integer"},
                  "n-embd" : {"type": "integer"},
                  "n-heads" : {"type": "integer"}
                }
              },
              "extensions" : {"type": "string"}
            }
          },
          "model" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum":["binary", "library"]},
              "binary" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "ctx-bins" : {"type": "array", "items": {"type": "string"}}
                }
              },
              "library" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "model-bin" : {"type": "string"}
                }
              }
            }
          }
        }
      }
    }
  }
}
```

| Option | Applicability | Description |
| --- | --- | --- |
| text-encoder::version | all backends | Version of the encoder object that is supported by the APIs. (1) |
| text-encoder::truncate-input | all backends | Allows the input to be truncated when it exceeds the context<br>length. |
| context::version | all backends | Version of context object that is supported by APIs. (1) |
| context::ctx-size | all backends | Context length. Maximum number of tokens to process. |
| context::n-vocab | all backends | Model vocabulary size. |
| context::embed-size | all backends | Embedding length. Embedding vector length for each token. |
| context::pad-token | all backends | Token id for pad token. |
| prompt::version | all backends | Version of prompt object that is supported by APIs. (1) |
| prompt::prompt-template | all backends | Prefix and suffix strings that are added to each<br>prompt. |
| tokenizer::version | all backends | Version of tokenizer object that is supported by APIs. (1) |
| tokenizer::path | all backends | Path to tokenizer file. |
| engine::version | all backends | Version of engine object that is supported by APIs. (1) |
| engine::n-threads | all backends | Number of threads to use for KV-cache updates. |
| backend::version | all backends | Version of backend object that is supported by APIs. (1) |
| backend::type | all backends | Type of engine: “QnnHtp” for the QNN HTP backend and<br>“QnnGenAiTransformer” for the QNN GenAiTransformer backend. |
| backend::extensions | QNN HTP | Path to backend extensions configuration file. |
| QnnHtp::version | QNN HTP | Version of QnnHtp object that is supported by APIs. (1) |
| QnnHtp::spill-fill-bufsize | QNN HTP | Buffer size to pre-allocate for the QNN HTP spill fill.<br>This field depends upon the HTP VTCM memory size. It should<br>be set greater than the spill-fill required by each context<br>binary in the model. Consult the QNN HTP backend<br>documentation in the QAIRT SDK for more details. |
| QnnHtp::use-mmap | QNN HTP | Memory map the context binary files. Typically should be<br>turned on. |
| QnnHtp::data-alignment-size | QNN HTP | Data will be aligned by rounding up the size to the nearest<br>multiple of alignment number. Typically should be zero. |
| QnnHtp::allow-async-init | QNN HTP | Allow context binaries to be initialized asynchronously<br>if the backend supports it. |
| QnnHtp::pooled-output | QNN HTP | Selects between a pooled embedding and per-token embeddings<br>as the generation result. |
| QnnHtp::disable-kv-cache | QNN HTP | Disables the KV cache manager, for models that do not have a KV<br>cache. |
| QnnGenAiTransformer::version | QNN GenAiTransformer | Version of QnnGenAiTransformer object that is supported<br>by APIs. (1) |
| QnnGenAiTransformer::n-layer | QNN GenAiTransformer | Number of decoder layers in the model. |
| QnnGenAiTransformer::n-embd | QNN GenAiTransformer | Size of the embedding vector for each token. |
| QnnGenAiTransformer::n-heads | QNN GenAiTransformer | Number of heads in the model. |
| model::version | all backends | Version of model object that is supported by APIs. (1) |
| model::type | all backends | Type of model object: “binary” for QNN HTP and “library”<br>for QNN GenAiTransformer. |
| binary::version | QNN HTP | Version of binary object that is supported by APIs. (1) |
| binary::ctx-bins | QNN HTP | List of serialized model files. |
| library::version | QNN GenAiTransformer | Version of library object that is supported by APIs. (1) |
| library::model-bin | QNN GenAiTransformer | Path to model.bin file. |
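
As an illustration of the text encoder schema, the following is a minimal sketch of a configuration instance for the QNN HTP backend with pooled output and the KV cache manager disabled. All values are hypothetical placeholders: the context and embedding sizes, the pad token ID, the file paths, and the `spill-fill-bufsize`. The sketch also assumes that `prompt-template` holds the prefix first and the suffix second, which the table does not state explicitly.

```json
{
  "text-encoder": {
    "version": 1,
    "context": {
      "version": 1,
      "ctx-size": 512,
      "n-vocab": 30522,
      "embed-size": 1024,
      "pad-token": 0
    },
    "prompt": {
      "version": 1,
      "prompt-template": ["query: ", ""]
    },
    "tokenizer": {
      "version": 1,
      "path": "path/to/tokenizer.json"
    },
    "truncate-input": true,
    "engine": {
      "version": 1,
      "n-threads": 3,
      "backend": {
        "version": 1,
        "type": "QnnHtp",
        "QnnHtp": {
          "version": 1,
          "spill-fill-bufsize": 320000000,
          "data-alignment-size": 0,
          "use-mmap": true,
          "allow-async-init": false,
          "pooled-output": true,
          "disable-kv-cache": true
        },
        "extensions": "path/to/htp_backend_ext_config.json"
      },
      "model": {
        "version": 1,
        "type": "binary",
        "binary": {
          "version": 1,
          "ctx-bins": ["path/to/text_encoder.bin"]
        }
      }
    }
  }
}
```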

Image Encoder schema:

```json
{
  "image-encoder" : {
    "type": "object",
    "properties": {
      "version" : {"type": "integer"},
      "engine" : {
        "type": "object",
        "properties": {
          "version" : {"type": "integer"},
          "n-threads" : {"type": "integer"},
          "backend" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum" : ["QnnHtp"]},
              "QnnHtp" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "spill-fill-bufsize" : {"type": "integer"},
                  "data-alignment-size" : {"type": "integer"},
                  "use-mmap" : {"type": "boolean"},
                  "allow-async-init" : {"type": "boolean"}
                }
              },
              "extensions" : {"type": "string"}
            }
          },
          "model" : {
            "type": "object",
            "properties": {
              "version" : {"type": "integer"},
              "type" : {"type": "string","enum":["binary"]},
              "binary" : {
                "type": "object",
                "properties": {
                  "version" : {"type": "integer"},
                  "ctx-bins" : {"type": "array", "items": {"type": "string"}}
                }
              }
            }
          }
        }
      }
    }
  }
}
```

| Option | Applicability | Description |
| --- | --- | --- |
| image-encoder::version | all backends | Version of the encoder object that is supported by the APIs. (1) |
| engine::version | all backends | Version of engine object that is supported by APIs. (1) |
| engine::n-threads | all backends | Number of threads to use for KV-cache updates. |
| backend::version | all backends | Version of backend object that is supported by APIs. (1) |
| backend::type | all backends | Type of engine: “QnnHtp” for the QNN HTP backend and<br>“QnnGenAiTransformer” for the QNN GenAiTransformer backend. |
| backend::extensions | QNN HTP | Path to backend extensions configuration file. |
| QnnHtp::version | QNN HTP | Version of QnnHtp object that is supported by APIs. (1) |
| QnnHtp::spill-fill-bufsize | QNN HTP | Buffer size to pre-allocate for the QNN HTP spill fill.<br>This field depends upon the HTP VTCM memory size. It should<br>be set greater than the spill-fill required by each context<br>binary in the model. Consult the QNN HTP backend<br>documentation in the QAIRT SDK for more details. |
| QnnHtp::use-mmap | QNN HTP | Memory map the context binary files. Typically should be<br>turned on. |
| QnnHtp::data-alignment-size | QNN HTP | Data will be aligned by rounding up the size to the nearest<br>multiple of alignment number. Typically should be zero. |
| QnnHtp::allow-async-init | QNN HTP | Allow context binaries to be initialized asynchronously<br>if the backend supports it. |
| QnnHtp::pooled-output | QNN HTP | Selects between a pooled embedding and per-token embeddings<br>as the generation result. |
| QnnHtp::disable-kv-cache | QNN HTP | Disables the KV cache manager, for models that do not have a KV<br>cache. |
| model::version | all backends | Version of model object that is supported by APIs. (1) |
| model::type | all backends | Type of model object: “binary” for QNN HTP and “library”<br>for QNN GenAiTransformer. |
| binary::version | QNN HTP | Version of binary object that is supported by APIs. (1) |
| binary::ctx-bins | QNN HTP | List of serialized model files. |
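
As an illustration of the image encoder schema, the following is a minimal sketch of a configuration instance that uses only the fields listed in the schema above. The thread count, `spill-fill-bufsize`, and file paths are hypothetical placeholders and must be replaced with values that match the actual model and device.

```json
{
  "image-encoder": {
    "version": 1,
    "engine": {
      "version": 1,
      "n-threads": 3,
      "backend": {
        "version": 1,
        "type": "QnnHtp",
        "QnnHtp": {
          "version": 1,
          "spill-fill-bufsize": 320000000,
          "data-alignment-size": 0,
          "use-mmap": true,
          "allow-async-init": false
        },
        "extensions": "path/to/htp_backend_ext_config.json"
      },
      "model": {
        "version": 1,
        "type": "binary",
        "binary": {
          "version": 1,
          "ctx-bins": ["path/to/image_encoder.bin"]
        }
      }
    }
  }
}
```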

Last Published: Jul 02, 2026
