# Revision History


This page contains the revision history starting from QAIRT SDK v2.34.0. For earlier versions, refer to the archives: [Linux](https://docs.qualcomm.com/doc/80-63442-2/topic/revision_history_archived.html), [Windows](https://docs.qualcomm.com/doc/80-63442-2/topic/revision_history_windows_archived.html).

| Version | Date | Description |
| --- | --- | --- |
| 2.34.0 | April 2025 | <ul class="simple"><li><p>API:Genie: Added the GenieSampler_registerUserDataCallback API, which adds a userData argument to the sampler custom callback. {130164}</p></li><li><p>API:Genie: Added <cite>GenieEngine.h</cite>, <cite>GenieDialog_getEngine</cite>, and <cite>GenieDialog_bindEngine</cite> APIs. {126715}</p></li><li><p>API:SNPE: Added Java API <cite>setUnconsumedTensorsOutput()</cite>, equivalent to the C/C++ builder API <cite>Snpe_SNPEBuilder_SetUnconsumedTensorsAsOutputs()</cite> / <cite>SNPEBuilder::setUnconsumedTensorsAsOutputs()</cite>. {125891}</p></li><li><p>CPU: Added BOOL support in CPU Concat Op. {130940}</p></li><li><p>CPU: Added axes parameter support in L2Norm. {121463}</p></li><li><p>DSP:SNPE: Added the ability to display the exact priority of the HVX thread in the log to help identify potential issues related to HVX concurrency scenarios. {117790}</p></li><li><p>Genie: Added KV quantization support for GenAiTransformer backend. {123438}</p></li><li><p>Genie: Added a LoRAv3 reference/sample Genie configuration to the SDK examples. {130008}</p></li><li><p>Genie: Added the Eaglet dialog type. {126452}</p></li><li><p>Genie: Added token-acceptance-rate to the GenieProfile output for some dialog types. {123350}</p></li><li><p>Genie: Introduced a performance optimization where logits are sampled using the native datatype output of the model. {121359}</p></li><li><p>HTP: Deprecated optrace collection via debug configuration files. Use optrace via profiling instead. {124739}</p></li><li><p>HTP: Fixed an issue where the number of items was missing in the multicore callback. {129636}</p></li><li><p>HTP: Implemented a service call to perform dspqueue_close in multicore environments. {126381}</p></li><li><p>HTP: Introduced parallel graph execution, enabling concurrent running of multiple graphs on a single HTP core to improve throughput and resource utilization. {89181}</p></li><li><p>HTP: Improved performance of the Softmax Op with 32 or fewer channels. {130819}</p></li><li><p>Op:GPU: Added support for GridSample Op. {127898}</p></li><li><p>Op:HTP: Optimized DepthWiseConv2d op execution by ensuring it runs on HMX. {128655}</p></li><li><p>Op:HTP: Optimized DepthwiseConv op performance for an ASR model on SM8750 HTP W8A16. {129860}</p></li><li><p>OpDef: Added dynamic shape support for FullyConnected Op. {116235}</p></li><li><p>OpDef: Added optional parameter <cite>buffer_padding</cite> to Buffer Op. {125962}</p></li><li><p>Tool:Converter: Added support for BQ and LPBQ in JSON serializer and deserializer. {132650}</p></li><li><p>Tool:Converter: Added support for quantized DLC files as input to the quantizer module. The quantizer proceeds as follows: (1) if all tensors are quantized or overridden float, it returns directly; (2) if the DLC is half-quantized, it dequantizes the fixed-point tensors back to float before quantization; (3) it quantizes all float tensors. {129135}</p></li><li><p>Tool:Converter: Added support to trigger Quantizer with float_fallback mode. {129131}</p></li><li><p>Tool:Converter: Fixed handling of dynamic input shapes with a more informative error message. {127631}</p></li><li><p>Tool:Converter: Introduced a new Converter argument to select the output export format: --export_format [&quot;DLC_DEFAULT&quot;, &quot;DLC_STRIP_QUANT&quot;]. {129132}</p></li><li><p>Tool:Converter: QAIRT Quantizer now skips quantization steps if float_fallback is specified for an input Quant DLC. {130397}</p></li><li><p>Tool:qnn-onnx-converter: Added the <cite>--preserve_onnx_output_order</cite> option to maintain ONNX output order in the converted graph. {126070}</p></li><li><p>QNN Core: Fixed an issue where QNN Savecontext failed for multiple models on Windows platforms due to the inability to find the graph in the DLC. {130104}</p></li><li><p>CPU: Added int32 datatype support for ScatterElements. {126766}</p></li><li><p>CPU: Fixed L2Norm to handle multiple axes. {127053}</p></li><li><p>CPU: Fixed verifier failures for single-layer resize models on the ONNX16 framework. {124524}</p></li><li><p>CPU: Implemented deep copy of <cite>opConfig</cite> in CPU to prevent model failures. {128204}</p></li><li><p>DSP: Fixed an SNPE inference failure due to QnnContext_createFromBinary failing with a memory allocation error. {127804}</p></li><li><p>DSP: Fixed an SNPE inference failure where multiple models failed due to errors obtaining input tensor names. {127809}</p></li><li><p>DSP: Fixed inference failures for specific models on HTP due to network partition issues. {131151}</p></li><li><p>GPU: Fixed accuracy error in QnnGpuOperationTestActivationAndroid. {125640}</p></li><li><p>GPU: Fixed accuracy error in QnnGpuOperationTestTransposeConvAndroid. {125992}</p></li><li><p>GPU: Fixed inference regressions in models having Convolution Op in <cite>gpu_fp16</cite> mode for some devices. {120026}</p></li><li><p>Genie: Fixed an issue in genie-t2t-run where dialog de-initialization data was not saved. {132621}</p></li><li><p>Genie: Fixed an issue where GenieEmbedding_generate would return a rank of 0. {131581}</p></li><li><p>Genie: Fixed an issue where quantized values may overflow or underflow. {125929}</p></li><li><p>HTP: Addressed inference time regressions on multiple chipsets for HTP and HTP_FP16 configurations. {128165}</p></li><li><p>HTP: Corrected the TransportResult resize function to properly set the number of cores. {132311}</p></li><li><p>HTP: Fixed a LayerNorm validation failure by checking rank of bias only if it’s present in LayerNorm Op. {106186}</p></li><li><p>HTP: Fixed a Windows compatibility issue related to non-shared weight VA reservation. {130567}</p></li><li><p>HTP: Fixed a crash in libQnnHtp.so that occurred in graph switch scenarios involving spill fill buffer sharing. {131575}</p></li><li><p>HTP: Fixed a deadlock in <cite>allocateAndMapPersistentSpillFillBuffer()</cite> that occurred due to locking conflicts. {132488}</p></li><li><p>HTP: Fixed a hang issue in GenAI TNR tests when using asynchronous group initialization with weight sharing and spill-fill sharing with weight sharing. {132586}</p></li><li><p>HTP: Fixed a multithreaded concurrency issue with LLM and small models that caused a ‘memHandles registration failure’. {131051}</p></li><li><p>HTP: Fixed a performance regression for a MobileBERT model that was introduced in a previous release. {132111}</p></li><li><p>HTP: Fixed a prepare failure for the L2Norm op with fp16 when the relaxed_precision_flag is not set during converter stage. {129566}</p></li><li><p>HTP: Fixed an issue where QNN HTP inference failed during MC detailed profiling. {132564}</p></li><li><p>HTP: Fixed an issue where multiple VA sharing groups caused the error ‘Unable to map reserved buffer for non-shared weights’. {131009}</p></li><li><p>HTP: Fixed an issue where qnn-context-binary-generator would hang, consuming excessive CPU and memory. {126833}</p></li><li><p>HTP: Fixed intermittent hangs that occurred during the creation of a context from a binary in concurrent scenarios. {131049}</p></li><li><p>HTP: Fixed the checker failures related to the OpPackage example by correcting the include path. {130707}</p></li><li><p>HTP: Improved performance to address inference time regressions observed on multiple chipsets. {131073}</p></li><li><p>HTP: Resolved an issue related to spill-fill buffer sharing, which caused incorrect output. {124544}</p></li><li><p>HTP: Resolved an issue with x86_prepare failures during savecontext. High CPU utilization during graph preparation was addressed. {125093}</p></li><li><p>HTP: Resolved failures in LoRA v2 test cases due to DSP transport call issues, impacting multi-model context and graph switch scenarios. {130142}</p></li><li><p>HTP: Resolved inference time regressions on SM8750. Avoided broadcast overhead on mul_op to improve performance of uint16 elementwise multiplication. {125746}</p></li><li><p>HTP: Reverted the enablement of the 64-bit flag to address reported hangs. {130301}</p></li><li><p>HTP: Updated PGE support check to use support Features on SoC Model. {127754}</p></li><li><p>LPAI: Fixed a failure in LPAI direct mode. {131750}</p></li><li><p>LPAI: Fixed an issue where LPAI single layer models were failing. {130729}</p></li><li><p>Op:DSP: Added LayerNorm support and modified the hard-coded check. {122112}</p></li><li><p>Op:HTP: Added 5D support for float Sigmoid. {128867}</p></li><li><p>Op:HTP: Addressed performance issues when converting models with w8a16 compared to w8a8 on SM8350 by optimizing MatMul and Gemm Ops. {121404}</p></li><li><p>Op:HTP: Fixed ReduceMax FP16 compilation error. {127900}</p></li><li><p>Op:HTP: Fixed a QNN context-binary-generator failure due to a TCM insufficient tile error when processing a custom model. {129510}</p></li><li><p>Op:HTP: Fixed context binary generation failures for ArgMin/ArgMax ops due to TCM overflow. {108763}</p></li><li><p>Op:HTP: Fixed model validation errors during context saving, specifically addressing issues with the DepthToSpace Op. {131083}</p></li><li><p>Op:HTP: Fixed numerical issue for DepthwiseConv2d -&gt; HardSwish in a MobileNetV3 model. {128158}</p></li><li><p>Op:HTP: Fixed rank constraints of Op replacement rule. {130194}</p></li><li><p>Op:HTP: Improved DepthwiseConv2D performance. {126421}</p></li><li><p>Op:HTP: Optimized Reshape Ops when PCQ is enabled on constant tensors going into a MatMul Op, improving performance. {130415}</p></li><li><p>Op:HTP: Registered QInt16 for Concat Op to resolve graph preparation failures when using QuantInt16 tensors. {125735}</p></li><li><p>Op:HTP: Resolved an issue where context binary size calculation failed during graph preparation. {124130}</p></li><li><p>Op:HTP: Resolved an on-device hang issue during execution of Dynamic MobileNet V2, specifically during the Transpose Op. {126806}</p></li><li><p>Op:HTP: Resolved context binary generation failures for the BevFormer model with AMP encodings. {129991}</p></li><li><p>SDK: Fixed build issues in Qnn SampleApp, Qnn SampleAppAsyncExecution and Qnn SampleAppSharedBuffer. {131442}</p></li><li><p>SDK: Removed “pytorch to onnx conversion avoidance suggestions” from QNN SDK Docs. {132125}</p></li><li><p>SDK: <cite>ReleaseNotes.txt</cite> renamed to <cite>QAIRT_ReleaseNotes.txt</cite> and now contains release notes for both Unix and WoS. {127817}</p></li><li><p>SNPE: Fixed API <cite>Snpe_SNPEBuilder_SetInitCacheMode()</cite>/<cite>SNPEBuilder::setInitCacheMode()</cite> breakage for non-HTP backends when using the <cite>snpe-net-run</cite> option <cite>--enable_init_cache</cite>. {129545}</p></li><li><p>SNPE: Fixed the <cite>--enable_init_cache</cite> option (API <cite>SNPEBuilder::setInitCacheMode()</cite>/<cite>Snpe_SNPEBuilder_SetInitCacheMode()</cite>) in <cite>net-run</cite> for AIP runtime. {131929}</p></li><li><p>Tool:Converter: Corrected an issue where qnn-context-binary-generator logged an incorrect QPC path when the --backend_binary option was used. {126169}</p></li><li><p>Tool:Converter: Corrected the allowed length for pad amounts for 4D tensors in the emitter. {132185}</p></li><li><p>Tool:Converter: Enabled data invariant optimizations for the Tile Op. If the input of Tile Op is quantized, the input dataType and qInfo are copied to the output. {126372}</p></li><li><p>Tool:Converter: Fixed Layout Transform to avoid unintentionally loading deferred weights. {132173}</p></li><li><p>Tool:Converter: Fixed a segfault issue in IrJsonDeserializer during deserialization of newly generated model JSON files. {129816}</p></li><li><p>Tool:Converter: Fixed an issue where Accuracy Evaluator runs failed at the Netrun stage. {129997}</p></li><li><p>Tool:Converter: Fixed an issue where FOLD_MULTIPLE_TRANSPOSE was incorrectly pruning graph outputs. {127963}</p></li><li><p>Tool:Converter: Fixed an issue where context binary generation failed with a ‘Graph Finalize failure’ when using multi-Qranium pipelined partitioning. {124908}</p></li><li><p>Tool:Converter: Fixed an issue where qnn-context-binary generation failed for LVM UNet models due to tensor updateability and GroupNorm Op validation errors with the HTP backend. {127887}</p></li><li><p>Tool:Converter: Fixed an issue where the qnn-context-binary-generator tool failed on Windows-X86 when processing LoRAv3 models. {130894}</p></li><li><p>Tool:Converter: Fixed index error failure in remove identity optimization. {125867}</p></li><li><p>Tool:Converter: Fixed issue when folding multiple transposes to retain graph output names. {128685}</p></li><li><p>Tool:Converter: Resolved a serialization issue with MatMul ops involving int16*int16 data types when using dynamic 16-bit weights. {129733}</p></li><li><p>Tool:Converter:ONNX: Added support for dynamic inputs for Clip Op. {124203}</p></li><li><p>Tool:Converter:ONNX: Fixed an issue in the Converter to ensure correct name sanitization following C++ naming conventions. {129356}</p></li><li><p>Tool:Converter:ONNX: Fixed axis tracking in ScatterElements. {118614}</p></li><li><p>Tool:Converter:ONNX: Fixed issue for reverse GRU Op to ensure the correct order of input names for the first output. {130544}</p></li><li><p>Tool:Converter:ONNX: Updated translation for ExpandOp to reduce inference time. {127065}</p></li><li><p>Tool:qairt-accuracy-evaluator: Fixed issue where the input list was incorrectly passed to the quantizer. {130537}</p></li><li><p>Tool:qairt-accuracy-evaluator: Added support for the ‘algorithms’ quantizer parameter in the evaluator, and provided the input shape to the converter for PyTorch models. {126291}</p></li><li><p>Tool:qnn-accuracy-debugger: Enhanced the qnn-accuracy-debugger tool to provide more meaningful metrics for intermediate tensor cosine similarity. {126437}</p></li><li><p>Tool:qnn-net-run: Resolved an issue in accuracy evaluator runs where the error “‘Namespace’ object has no attribute ‘preserve_graph_output_order’” was encountered. {132180}</p></li><li><p>Tool:qnn-onnx-converter: Aligned the ONNX Resize Op translator’s behavior with ONNX definitions. {123092}</p></li><li><p>Tool:snpe-architecture-checker: Fixed an issue where snpe-architecture-checker would fail due to an uninitialized variable. {126778}</p></li><li><p>Tool:snpe-stress-net-run: Fixed a memory leak issue when loading QNN models. {128498}</p></li></ul> |

Last Published: Oct 02, 2025
