# Migrating from Notebook Workflows

This guide is for users migrating from the QNN model preparation notebooks. It explains the key
behavioral differences between the notebook workflow and the Gen AI Builder, and provides a
complete variable-to-API mapping for translating your existing configuration.

For the full configuration reference, see Configuring the Gen AI Builder.

## Key Differences

Before consulting the mapping table, note these default behavioral differences between the
notebook workflow and the builder:

| Setting | Notebook Default | Builder Default |
| --- | --- | --- |
| Auto-regression numbers | `[32, 128]` (varies by notebook) | `[1, 128]` (when `weight_sharing=True`) |
| Split count | Hardcoded (e.g., 3 or 9) | Auto-calculated from model size |
| Target naming | `NspTargets.Android.GEN4` | `chipset:SM8750` |

These defaults exist because the builder is optimized for common deployment configurations. If
your notebook used different values, override them explicitly using transformation options or
`set_compilation_options()`.
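For example, the overrides can be sketched as plain option dicts passed to the builder. This is a minimal sketch, assuming a `builder` object exposing the `set_targets()` and `set_transformation_options()` methods from the mapping below; the SDK calls are left commented since they require the installed toolchain.

```python
# Carry notebook defaults over explicitly instead of relying on builder defaults.
notebook_arns = [32, 128]   # notebook ARNS; builder default is [1, 128] with weight_sharing
notebook_num_splits = 3     # notebook hardcoded split count; builder auto-calculates

transformation_options = {
    "arn": notebook_arns,
    "split.num_splits": notebook_num_splits,
}

# With the SDK installed, these dicts are handed to the builder:
# builder.set_targets(["chipset:SM8750"])  # replaces NspTargets.Android.GEN4
# builder.set_transformation_options(options=transformation_options)
```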

## Variable Mapping

The table below maps notebook environment variables to their Gen AI Builder equivalents.

| Notebook Variable | Builder Equivalent | Notes |
| --- | --- | --- |
| `TARGET_PLATFORM` + `PLATFORM_GEN` | `set_targets(["chipset:SM8750"])` | See [Supported Snapdragon Devices](https://docs.qualcomm.com/doc/80-63442-10/topic/QNN_general_overview.html#supported-snapdragon-devices) for chipset IDs |
| `ARNS = [32, 128]` | `set_transformation_options(options={"arn": [32, 128]})` | Builder default with `weight_sharing=True`: `[1, 128]` |
| `CL_LIST = [2048, 4096]` | `set_transformation_options(options={"context_length": [2048, 4096]})` | Or use `multi_graph=True` for defaults |
| `NUM_SPLITS = 3` | `set_transformation_options(options={"split.num_splits": 3})` | Builder auto-calculates if not set |
| `SPLIT_EMBEDDING = True` | `set_transformation_options(options={"split.split_embedding": True})` | Default: `True` |
| `SPLIT_LMHEAD = True` | `set_transformation_options(options={"split.split_lm_head": True})` | Default varies by model |
| `ENABLE_NATIVE_KV` | `builder.native_kv = True` | Also sets `permute_kv_cache_io` |
| `EMBEDDING_ON_CPU = True` | `builder.prepare_embedding_lut = True` | Prepares CPU embedding lookup table |
| `O: 3.0` | `set_compilation_options(options={"graphs.optimization_type": 3})` | Default from `set_targets()` |
| `vtcm_mb: 8` | `set_compilation_options(options={"graphs.vtcm_size_in_mb": 8})` |  |
| `hvx_threads: N` | `set_compilation_options(options={"graphs.hvx_threads": N})` |  |
| `perf_profile: "burst"` | `set_compilation_options(options={"devices.cores.perf_profile": "burst"})` | Default from `set_targets()` |
| `extended_udma: True` | `set_compilation_options(options={"context.extended_udma": True})` | HTP v81+ only |
| `fp16_relaxed_precision: 0` | `HtpGraphConfig(fp16_relaxed_precision=0)` via full `CompileConfig` | Not in convenience dict |
| `rpc_control_latency: 100` | `HtpDeviceCoreConfig(rpc_control_latency=100)` via full `CompileConfig` | Not in convenience dict |
| `pd_session: "unsigned"` | `HtpDeviceConfig(pd_session="unsigned")` via full `CompileConfig` | Not in convenience dict |
| `mem_type: "shared_buffer"` | `HtpMemoryConfig(mem_type="shared_buffer")` via full `CompileConfig` | Not in convenience dict |
| `share_resources: True` | `HtpGroupContextConfig(share_resources=True)` via full `CompileConfig` | Not in convenience dict |
| `ENABLE_LORA` | `builder.lora_config = LoraBuilderInputConfig(...)` | See Low-Rank Adaptation (LoRA) Tutorial |
| `SSD_PARAMS_FILE` | `builder.speculative_config = SsdBuilderConfig(...)` | See Speculative Decoding Tutorial |
| Explicit `qairt-quantizer` call | `builder.set_conversion_options(calibration_config=CalibrationConfig(...))` | Only needed for non-exhaustive AIMET v1 encodings |
| `PerfSetting` / `HtpConfigFile` JSON | `CompileConfig.from_backend_extensions("HTP", "path/to/json")` | See HTP Backend Extensions |
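Taken together, the per-graph performance rows above translate into a single compilation-options dict. A hedged sketch, assuming a `builder` object with the `set_compilation_options()` method shown in the table; `hvx_threads` is set to 4 purely as an illustrative value for the notebook's `N`.

```python
# Compilation options mirroring common notebook performance settings.
compilation_options = {
    "graphs.vtcm_size_in_mb": 8,            # notebook vtcm_mb: 8
    "graphs.hvx_threads": 4,                # notebook hvx_threads: N (example value)
    "devices.cores.perf_profile": "burst",  # notebook perf_profile: "burst"
}

# With the SDK installed:
# builder.set_compilation_options(options=compilation_options)
# builder.native_kv = True               # replaces ENABLE_NATIVE_KV
# builder.prepare_embedding_lut = True   # replaces EMBEDDING_ON_CPU = True
```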

**Note:** Variables marked “Not in convenience dict” require a full `CompileConfig` object rather
than the `options={...}` shorthand. See HTP Backend Extensions for how to construct one
from a JSON file or Python objects.
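As a sketch of that pattern, the settings outside the shorthand can be gathered as keyword-argument dicts for the config classes named in the table (`HtpGraphConfig`, `HtpDeviceCoreConfig`, `HtpDeviceConfig`, `HtpMemoryConfig`, `HtpGroupContextConfig`). The import path for those classes is SDK-specific and omitted here, so the constructor calls are shown in comments only.

```python
# Keyword arguments for each full-CompileConfig setting, mirroring the
# notebook values from the mapping table.
graph_kwargs = {"fp16_relaxed_precision": 0}
device_core_kwargs = {"rpc_control_latency": 100}
device_kwargs = {"pd_session": "unsigned"}
memory_kwargs = {"mem_type": "shared_buffer"}
group_context_kwargs = {"share_resources": True}

# With the SDK installed, these feed the config classes, e.g.:
# graph_config = HtpGraphConfig(**graph_kwargs)
# core_config = HtpDeviceCoreConfig(**device_core_kwargs)
# ...assembled into a CompileConfig and passed to the builder.
```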

## Backend Extension JSON Files

If you have an existing HTP perf config JSON from a notebook workflow, you can load it directly
into a `CompileConfig` and pass it to the builder. See HTP Backend Extensions for the
full JSON structure, loading methods, and round-trip serialization.
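Loading an existing file can be sketched as below, using the `CompileConfig.from_backend_extensions("HTP", ...)` call from the mapping table. The JSON body here is purely illustrative, not a schema reference, and the SDK call is commented since it requires the installed toolchain; the round-trip check simply confirms the file is well-formed JSON before loading.

```python
import json
import os
import tempfile

# An illustrative HTP perf-config JSON, stand-in for a notebook-era file.
perf_config = {"graphs": [{"vtcm_mb": 8, "O": 3.0}]}

path = os.path.join(tempfile.mkdtemp(), "htp_perf.json")
with open(path, "w") as f:
    json.dump(perf_config, f)

# With the SDK installed, the file loads directly:
# compile_config = CompileConfig.from_backend_extensions("HTP", path)

# Sanity-check the file round-trips before handing it to the builder.
with open(path) as f:
    assert json.load(f) == perf_config
```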

Last Published: May 08, 2026
