# Revision History

> This page contains the revision history starting from QAIRT SDK v2.34.0. For details on earlier versions, refer to the archives: [Linux](https://docs.qualcomm.com/doc/80-63442-2/topic/revision_history_archived.html), [Windows](https://docs.qualcomm.com/doc/80-63442-2/topic/revision_history_windows_archived.html).

| Version | Date | Description |
| --- | --- | --- |
| 2.34.0 | April 2025 |

API:Genie: Added GenieSampler_registerUserDataCallback API which adds a userData argument to the sampler custom callback. {130164}

API:Genie: Added GenieEngine.h, GenieDialog_getEngine, and GenieDialog_bindEngine APIs. {126715}

API:SNPE: Added Java API setUnconsumedTensorsOutput(), equivalent to the C/C++ builder API
Snpe_SNPEBuilder_SetUnconsumedTensorsAsOutputs() / SNPEBuilder::setUnconsumedTensorsAsOutputs(). {125891}

CPU: Added BOOL support in CPU Concat Op. {130940}

CPU: Added axes parameter support in L2Norm. {121463}

DSP:SNPE: Added the ability to display the exact priority of the HVX thread in the log to help identify potential issues related
to HVX concurrency scenarios. {117790}

Genie: Added KV quantization support for GenAiTransformer backend. {123438}

Genie: Added a LoRAv3 reference/sample Genie configuration to the SDK examples. {130008}

Genie: Added the Eaglet dialog type. {126452}

Genie: Added token-acceptance-rate to the GenieProfile output for some dialog types. {123350}

Genie: Introduced a performance optimization where logits are sampled using the native datatype output of the model. {121359}

HTP: Deprecated optrace collection via debug configuration files. Use optrace via profiling instead. {124739}

HTP: Fixed an issue where the number of items was missing in the multicore callback. {129636}

HTP: Implemented a service call to perform dspqueue_close in multicore environments. {126381}

HTP: Introduced parallel graph execution, enabling multiple graphs to run concurrently on a single HTP core to improve
throughput and resource utilization. {89181}

HTP: Improved performance of Softmax Op with 32 or fewer channels. {130819}

Op:GPU: Added support for GridSample Op. {127898}

Op:HTP: Optimized DepthWiseConv2d op execution by ensuring it runs on HMX. {128655}

Op:HTP: Optimized DepthwiseConv op performance for an ASR model on SM8750 HTP W8A16. {129860}

OpDef: Added dynamic shape support for FullyConnected Op. {116235}

OpDef: Added optional parameter buffer_padding to Buffer Op. {125962}

Tool:Converter: Added support for BQ and LPBQ in JSON serializer and deserializer. {132650}

Tool:Converter: Added support for quantized DLC files as input to the quantizer module:

1. If all tensors are quantized or overridden float, return directly.
2. If the DLC is half-quantized, dequantize the fixed-point tensors back to float before quantization.
3. Quantize all float tensors. {129135}
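The three quantizer steps above can be sketched roughly as follows. This is only an illustration of the control flow, not the SDK's implementation: the `Tensor` class, the `kind` labels, and `quantize_dlc` are hypothetical names introduced here, and real dequantization and quantization would operate on tensor data and encodings rather than on a label.

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    name: str
    kind: str  # hypothetical labels: "float", "fixed" (quantized), "float_override"


def quantize_dlc(tensors):
    """Return a new tensor list after applying the three steps from the note."""
    # Step 1: nothing to do if every tensor is quantized or overridden float.
    if all(t.kind in ("fixed", "float_override") for t in tensors):
        return list(tensors)
    # Step 2: half-quantized input, so bring fixed-point tensors back to float.
    staged = [Tensor(t.name, "float") if t.kind == "fixed" else t for t in tensors]
    # Step 3: quantize every float tensor; overridden-float tensors are left alone.
    return [Tensor(t.name, "fixed") if t.kind == "float" else t for t in staged]
```

In this sketch a fully quantized input is returned unchanged, while a mixed input ends up with every non-overridden tensor quantized.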

Tool:Converter: Added support to trigger Quantizer with float_fallback mode. {129131}

Tool:Converter: Fixed handling of dynamic input shapes with a more informative error message. {127631}

Tool:Converter: Introduced a new Converter argument to select the Converter output export format: --export_format
["DLC_DEFAULT", "DLC_STRIP_QUANT"]. {129132}

Tool:Converter: QAIRT Quantizer now skips quantization steps if float_fallback is specified for an input Quant DLC. {130397}

Tool:qnn-onnx-converter: Added the --preserve_onnx_output_order option to maintain ONNX output order in the converted graph.
{126070}

QNN Core: Fixed an issue where QNN Savecontext failed for multiple models on Windows platforms due to the inability to find the
graph in the DLC. {130104}

CPU: Added int32 datatype support for ScatterElements. {126766}

CPU: Fixed L2Norm to handle multiple axes. {127053}

CPU: Fixed verifier failures for single-layer resize models on ONNX16 framework. {124524}

CPU: Implemented deep copy of opConfig in CPU to prevent model failures. {128204}

DSP: Fixed an SNPE inference failure due to QnnContext_createFromBinary failing with a memory allocation error. {127804}

DSP: Fixed an SNPE inference failure where multiple models failed due to errors obtaining input tensor names. {127809}

DSP: Fixed inference failures for specific models on HTP due to network partition issues. {131151}

GPU: Fixed accuracy error in QnnGpuOperationTestActivationAndroid. {125640}

GPU: Fixed accuracy error in QnnGpuOperationTestTransposeConvAndroid. {125992}

GPU: Fixed inference regressions in models having Convolution Op in gpu_fp16 mode for some devices. {120026}

Genie: Fixed issue in genie-t2t-run where dialog de-initialization data was not saved. {132621}

Genie: Fixed issue where GenieEmbedding_generate would return a rank of 0. {131581}

Genie: Fixed issue where quantized values may overflow or underflow. {125929}
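The note does not describe how the overflow and underflow were fixed. As general background, a common way to keep quantized values in range is to saturate (clamp) after rounding. The sketch below assumes unsigned affine quantization; `quantize_saturating` and its parameters are illustrative names, not Genie APIs.

```python
def quantize_saturating(x, scale, zero_point, bits=8):
    """Affine-quantize x to an unsigned integer, clamping to the valid range."""
    qmin, qmax = 0, (1 << bits) - 1
    # Round to the nearest integer step, then shift by the zero point.
    q = round(x / scale) + zero_point
    # Saturate instead of letting the value wrap around.
    return max(qmin, min(qmax, q))
```

Without the final clamp, a value such as 1000.0 at scale 0.1 would map to 10000, which does not fit in 8 bits and would wrap when stored.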

HTP: Addressed inference time regressions on multiple chipsets for HTP and HTP_FP16 configurations. {128165}

HTP: Corrected the TransportResult resize function to properly set the number of cores. {132311}

HTP: Fixed a LayerNorm validation failure by checking rank of bias only if it’s present in LayerNorm Op. {106186}

HTP: Fixed a Windows compatibility issue related to non-shared weight VA reservation. {130567}

HTP: Fixed a crash in libQnnHtp.so that occurred in graph switch scenarios involving spill fill buffer sharing. {131575}

HTP: Fixed a deadlock in allocateAndMapPersistentSpillFillBuffer() that occurred due to locking conflicts. {132488}

HTP: Fixed a hang issue in GenAI TNR tests when using asynchronous group initialization with weight sharing and spill-fill sharing
with weight sharing. {132586}

HTP: Fixed a multithreaded concurrency issue with LLM and small models that caused a ‘memHandles registration failure’. {131051}

HTP: Fixed a performance regression for a MobileBERT model that was introduced in a previous release. {132111}

HTP: Fixed a prepare failure for the L2Norm op with fp16 when the relaxed_precision_flag is not set during converter stage.
{129566}

HTP: Fixed an issue where QNN HTP inference failed during MC detailed profiling. {132564}

HTP: Fixed an issue where multiple VA sharing groups caused the error ‘Unable to map reserved buffer for non-shared weights’.
{131009}

HTP: Fixed an issue where qnn-context-binary-generator would hang, consuming excessive CPU and memory. {126833}

HTP: Fixed intermittent hangs that occurred during the creation of a context from a binary in concurrent scenarios. {131049}

HTP: Fixed the checker failures related to the OpPackage example by correcting the include path. {130707}

HTP: Improved performance to address inference time regressions observed on multiple chipsets. {131073}

HTP: Resolved an issue related to spill-fill buffer sharing, which caused incorrect output. {124544}

HTP: Resolved an issue with x86_prepare failures during savecontext. High CPU utilization during graph preparation was addressed.
{125093}

HTP: Resolved failures in LoRA v2 test cases due to DSP transport call issues, impacting multi-model context and graph switch
scenarios. {130142}

HTP: Resolved inference time regressions on SM8750. Avoided broadcast overhead on mul_op to improve performance of uint16
elementwise multiplication. {125746}

HTP: Reverted the enablement of the 64-bit flag to address reported hangs. {130301}

HTP: Updated PGE support check to use support Features on SoC Model. {127754}

LPAI: Fixed a failure in LPAI direct mode. {131750}

LPAI: Fixed an issue where LPAI single layer models were failing. {130729}

Op:DSP: Added LayerNorm support and modified the hard-coded check. {122112}

Op:HTP: Added 5D support for float Sigmoid. {128867}

Op:HTP: Addressed performance issues when converting models with w8a16 compared to w8a8 on SM8350 by optimizing MatMul and Gemm
Ops. {121404}

Op:HTP: Fixed ReduceMax FP16 compilation error. {127900}

Op:HTP: Fixed a QNN context-binary-generator failure due to a TCM insufficient tile error when processing a custom model. {129510}

Op:HTP: Fixed context binary generation failures for ArgMin/ArgMax ops due to TCM overflow. {108763}

Op:HTP: Fixed model validation errors during context saving, specifically addressing issues with the DepthToSpace Op. {131083}

Op:HTP: Fixed numerical issue for DepthwiseConv2d -> HardSwish in a MobileNetV3 model. {128158}

Op:HTP: Fixed rank constraints of Op replacement rule. {130194}

Op:HTP: Improved DepthwiseConv2D performance. {126421}

Op:HTP: Optimized Reshape Ops when PCQ is enabled on constant tensors going into a MatMul Op, improving performance. {130415}

Op:HTP: Registered QInt16 for Concat Op to resolve graph preparation failures when using QuantInt16 tensors. {125735}

Op:HTP: Resolved an issue where context binary size calculation failed during graph preparation. {124130}

Op:HTP: Resolved an on-device hang issue during execution of Dynamic MobileNet V2, specifically during the Transpose Op. {126806}

Op:HTP: Resolved context binary generation failures for the BevFormer model with AMP encodings. {129991}

SDK: Fixed build issues in Qnn SampleApp, Qnn SampleAppAsyncExecution and Qnn SampleAppSharedBuffer. {131442}

SDK: Removed “pytorch to onnx conversion avoidance suggestions” from QNN SDK Docs. {132125}

SDK: Renamed ReleaseNotes.txt to QAIRT_ReleaseNotes.txt, which now contains release notes for both Unix and WoS. {127817}

SNPE: Fixed API Snpe_SNPEBuilder_SetInitCacheMode()/SNPEBuilder::setInitCacheMode() breakage for non-HTP backends when using
the snpe-net-run option --enable_init_cache. {129545}

SNPE: Fixed the --enable_init_cache option (API SNPEBuilder::setInitCacheMode()/Snpe_SNPEBuilder_SetInitCacheMode()) in
net-run for AIP runtime. {131929}

Tool:Converter: Corrected an issue where qnn-context-binary-generator logged an incorrect QPC path when the --backend_binary
option was used. {126169}

Tool:Converter: Corrected the allowed length for pad amounts for 4D tensors in the emitter. {132185}

Tool:Converter: Enabled data invariant optimizations for the Tile Op. If the input of Tile Op is quantized, the input dataType and
qInfo are copied to the output. {126372}
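Tile only repeats existing values, so the quantization encodings of the input remain valid for the output. A minimal sketch of that propagation is shown below. `QInfo`, `TensorInfo`, and `tile_output_info` are hypothetical stand-ins for the Converter's internal types, which are not documented in this note.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QInfo:
    scale: float
    offset: int


@dataclass
class TensorInfo:
    shape: Tuple[int, ...]
    dtype: str
    qinfo: Optional[QInfo] = None


def tile_output_info(inp, multiples):
    """Derive the Tile output description from its input."""
    if len(multiples) != len(inp.shape):
        raise ValueError("multiples must have one entry per input dimension")
    out_shape = tuple(d * m for d, m in zip(inp.shape, multiples))
    # Data-invariant op: a quantized input passes its dtype and qinfo through unchanged.
    if inp.qinfo is not None:
        return TensorInfo(out_shape, inp.dtype, inp.qinfo)
    return TensorInfo(out_shape, inp.dtype)
```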

Tool:Converter: Fixed Layout Transform to avoid unintentionally loading deferred weights. {132173}

Tool:Converter: Fixed a segfault issue in IrJsonDeserializer during deserialization of newly generated model JSON files. {129816}

Tool:Converter: Fixed an issue where Accuracy Evaluator runs failed at the Netrun stage. {129997}

Tool:Converter: Fixed an issue where FOLD_MULTIPLE_TRANSPOSE was incorrectly pruning graph outputs. {127963}

Tool:Converter: Fixed an issue where context binary generation failed with a ‘Graph Finalize failure’ when using multi-Qranium
pipelined partitioning. {124908}

Tool:Converter: Fixed an issue where qnn-context-binary generation failed for LVM UNet models due to tensor updateability and
GroupNorm Op validation errors with the HTP backend. {127887}

Tool:Converter: Fixed an issue where the qnn-context-binary-generator tool failed on Windows-X86 when processing LoRAv3 models.
{130894}

Tool:Converter: Fixed index error failure in remove identity optimization. {125867}

Tool:Converter: Fixed issue when folding multiple transposes to retain graph output names. {128685}
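For illustration, folding two consecutive transposes amounts to composing their permutations, and retaining the graph output name means the folded node keeps the name produced by the last transpose. The sketch below assumes the convention that output axis `i` reads input axis `perm[i]`. The dictionary-based node layout and the function names are hypothetical, not the Converter's internal representation.

```python
def fold_transposes(perm1, perm2):
    """Compose perm1 followed by perm2 into a single permutation."""
    return [perm1[axis] for axis in perm2]


def apply_perm(shape, perm):
    """Shape produced by transposing a tensor of `shape` with `perm`."""
    return tuple(shape[axis] for axis in perm)


def fold_pair(first, second):
    """Fold two chained transpose nodes, keeping the final output name."""
    return {
        "input": first["input"],
        "perm": fold_transposes(first["perm"], second["perm"]),
        # Keep the name of the last output so graph outputs stay addressable.
        "output": second["output"],
    }
```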

Tool:Converter: Resolved a serialization issue with MatMul ops involving int16*int16 data types when using dynamic 16-bit weights.
{129733}

Tool:Converter:ONNX: Added support for dynamic inputs for Clip Op. {124203}

Tool:Converter:ONNX: Fixed an issue in the Converter to ensure correct name sanitization following C++ naming conventions.
{129356}
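The exact sanitization rules used by the Converter are not given in this note. As a rough sketch of what sanitizing a tensor name into a valid C++ identifier can look like, the hypothetical helper below replaces disallowed characters with underscores and prefixes names that would otherwise start with a digit. It does not handle reserved keywords or name collisions.

```python
import re


def sanitize_name(name):
    """Map an arbitrary tensor name to a C++-style identifier (illustrative only)."""
    # Identifiers may contain only ASCII letters, digits, and underscores.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", name)
    # Identifiers may not be empty or start with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```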

Tool:Converter:ONNX: Fixed axis tracking in ScatterElements. {118614}

Tool:Converter:ONNX: Fixed issue for reverse GRU Op to ensure the correct order of input names for the first output. {130544}

Tool:Converter:ONNX: Updated translation for ExpandOp to reduce inference time. {127065}

Tool:qairt-accuracy-evaluator: Fixed issue where the input list was incorrectly passed to the quantizer. {130537}

Tool:qairt-accuracy-evaluator: Added support for the ‘algorithms’ quantizer parameter in the evaluator, and provided the input
shape to the converter for PyTorch models. {126291}

Tool:qnn-accuracy-debugger: Enhanced the qnn-accuracy-debugger tool to provide more meaningful metrics for intermediate tensor
cosine similarity. {126437}
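For reference, cosine similarity between a reference tensor and a target tensor, both flattened to lists, can be computed as in the sketch below. This is the standard definition and not the debugger's code. Returning 0.0 for an all-zero tensor is an assumption made here to avoid division by zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed convention: similarity is 0.0 when either vector has zero norm.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Values near 1.0 indicate that the intermediate tensor closely matches the reference in direction, while values near 0 or below suggest a divergence worth inspecting.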

Tool:qnn-net-run: Resolved an issue in accuracy evaluator runs where the error “‘Namespace’ object has no attribute
‘preserve_graph_output_order’” was encountered. {132180}

Tool:qnn-onnx-converter: Aligned the ONNX Resize Op translator’s behavior with ONNX definitions. {123092}

Tool:snpe-architecture-checker: Fixed an issue where snpe-architecture-checker would fail due to an uninitialized variable.
{126778}

Tool:snpe-stress-net-run: Fixed a memory leak issue when loading QNN models. {128498}

| Last Published: Oct 02, 2025