# Functions

- qairt.optimizer.onnx.change\_seq\_length(*ctx: GraphContext*, *new\_seq\_length: int*, *axis\_denotation\_config: Optional[AxisDenotationConfig] = None*) → GraphContext

    - Change the sequence length (AR) of an LLM

Modifies the graph in-place. If you need to preserve the original, pass a
`copy.deepcopy(ctx)` instead.

- Parameters

    - **ctx** – Graph context containing the model
    - **new\_seq\_length** – New sequence length to apply
    - **axis\_denotation\_config** – Optional configuration for axis denotation inference; see `AxisDenotationConfig` for details. Provide this if your model has non-standard input names that don’t match the built-in patterns

- Returns

    - The same GraphContext, modified in-place

Example:

    from qairt.optimizer.onnx import change_seq_length
    from qairt.optimizer.onnx import GraphContext

    ctx = GraphContext.from_files("model.onnx")
    change_seq_length(ctx, 128)
    ctx.save("model_modified.onnx")

    # With custom seed rules
    from qairt.optimizer.onnx import (
        change_seq_length,
        AxisDenotationConfig,
        AxisDenotationSeedRule,
        AxisDenotation,
    )

    axis_denotation_config = AxisDenotationConfig(
        custom_seed_rules=[
            AxisDenotationSeedRule(
                name_pattern=r"my_custom_input",
                denotations=[AxisDenotation.BATCH, AxisDenotation.SEQ_LENGTH],
            )
        ]
    )
    change_seq_length(ctx, 128, axis_denotation_config)
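Because `change_seq_length` modifies the graph in place, the original is lost unless it is copied first with `copy.deepcopy`. A minimal, runnable sketch of that pattern (a plain dict stands in for `GraphContext`, and `mutate_in_place` is a hypothetical stand-in for `change_seq_length`, so the snippet does not require qairt):

```python
import copy

# A plain dict stands in for GraphContext so this runs without qairt.
original = {"seq_length": 512}

def mutate_in_place(ctx, new_len):
    # Hypothetical stand-in for change_seq_length: edits ctx in place
    # and returns the same object, mirroring the API above.
    ctx["seq_length"] = new_len
    return ctx

# Work on a deep copy so the original context survives the transform.
work = mutate_in_place(copy.deepcopy(original), 128)

print(original["seq_length"])  # 512 -- original preserved
print(work["seq_length"])      # 128 -- modified copy
```

With qairt installed, the same pattern is `change_seq_length(copy.deepcopy(ctx), 128)`.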

- qairt.optimizer.onnx.change\_context\_length(*ctx: GraphContext*, *new\_context\_length: int*, *axis\_denotation\_config: Optional[AxisDenotationConfig] = None*) → GraphContext

    - Change the context length (CL) of an LLM

Modifies the graph in-place. If you need to preserve the original, pass a
`copy.deepcopy(ctx)` instead.

- Parameters

    - **ctx** – Graph context containing the model
    - **new\_context\_length** – New context length to apply
    - **axis\_denotation\_config** – Optional configuration for axis denotation inference; see `AxisDenotationConfig` for details. Provide this if your model has non-standard input names that don’t match the built-in patterns

- Returns

    - The same GraphContext, modified in-place

Example:

    from qairt.optimizer.onnx import change_context_length
    from qairt.optimizer.onnx import GraphContext

    ctx = GraphContext.from_files("model.onnx")
    change_context_length(ctx, 2048)
    ctx.save("model_modified.onnx")

- qairt.optimizer.onnx.change\_seq\_and\_context\_length(*ctx: GraphContext*, *new\_seq\_length: int*, *new\_context\_length: int*, *axis\_denotation\_config: Optional[AxisDenotationConfig] = None*) → GraphContext

    - Change both the sequence length (AR) and context length (CL) of an LLM

Modifies the graph in-place. If you need to preserve the original, pass a
`copy.deepcopy(ctx)` instead.

- Parameters

    - **ctx** – Graph context containing the model
    - **new\_seq\_length** – New sequence length to apply
    - **new\_context\_length** – New context length to apply
    - **axis\_denotation\_config** – Optional configuration for axis denotation inference; see `AxisDenotationConfig` for details. Provide this if your model has non-standard input names that don’t match the built-in patterns

- Returns

    - The same GraphContext, modified in-place

Example:

    from qairt.optimizer.onnx import change_seq_and_context_length
    from qairt.optimizer.onnx import GraphContext

    ctx = GraphContext.from_files("model.onnx")
    change_seq_and_context_length(ctx, 128, 2048)
    ctx.save("model_modified.onnx")

- qairt.optimizer.onnx.adapt\_moe(*ctx: GraphContext*, *\**, *overridden\_subselection: Optional[int] = None*, *remove\_op\_predicate: bool = False*, *enable\_validation: bool = False*) → GraphContext

    - High-level API for Mixture-of-Experts (MoE) model adaptation.

Adapts a MoE ONNX model in-place.

Extracts and adapts the AR=N and AR=1 MoE components, inlines internal
functions, removes dead code, and runs shape inference.

When `enable_validation` is True, the adapted model is saved to a
temporary directory and compared against the post-transform model using
ONNX Runtime with random inputs. The temporary directory is cleaned up
automatically after validation.

- Parameters

    - **ctx** – The model context to adapt.
    - **overridden\_subselection** – Override the number of experts selected per token. If `None`, the value is inferred from the model.
    - **remove\_op\_predicate** – Whether to remove the op-predicate `Where` ops (default `False`).
    - **enable\_validation** – Whether to verify the transformed model against the original using ONNX Runtime (default `False`).

- Returns

    - The same `GraphContext`, modified in-place.

Usage:

    from qairt.optimizer.onnx import adapt_moe, GraphContext

    ctx = GraphContext.from_files("model.onnx")
    adapt_moe(ctx)
    adapt_moe(ctx, remove_op_predicate=True)
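For intuition on `overridden_subselection`: MoE routing activates the top-k highest-scoring experts for each token, and this parameter overrides that k in the adapted model. A plain-Python sketch of top-k routing (illustrative only, not part of the qairt API):

```python
import random

# Illustrative top-k expert routing (not part of the qairt API):
# "number of experts selected per token" is the k below, which
# overridden_subselection replaces in the adapted model.
random.seed(0)
num_experts, k = 8, 2

def select_experts(router_scores, k):
    # Indices of the k highest-scoring experts for one token.
    return sorted(range(len(router_scores)),
                  key=lambda i: router_scores[i], reverse=True)[:k]

scores = [random.random() for _ in range(num_experts)]
chosen = select_experts(scores, k)
print(len(chosen))  # 2 experts activated for this token
```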

Last Published: May 08, 2026
