# qairt.modules.lora

## Overview

The Low-Rank Adaptation (LoRA) module provides configuration classes for defining and managing LoRA adapters
for parameter-efficient fine-tuning of large language models.

LoRA enables efficient model adaptation by training low-rank matrices that are applied to specific model layers,
rather than fine-tuning the entire model. The QAIRT LoRA module supports:

- Multiple adapters with different configurations
- Dynamic adapter switching at runtime
- Adapter composition (using multiple adapters simultaneously)

## Configuration classes

- *class* qairt.modules.lora.lora\_config.AdapterConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Configuration for an adapter, including its name and associated LoRA configurations.

- adapter\_lora\_config*: List[Union[str, PathLike, [AdapterParamsConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.AdapterParamsConfig)]]*

    - List of LoRA configuration paths or objects.

- ensure\_list(*v*)

    - Ensures that the adapter\_lora\_config is always a list.

- Parameters

    - **v** – The input value to validate.

- Returns

    - A list containing the input value if it was not already a list.

- Return type

    - List

- name*: str*

    - Name of the adapter.
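
The normalization that `ensure_list` is documented to perform can be sketched without QAIRT installed. The function below is a stand-alone stand-in written for illustration (the name `ensure_list_sketch` is hypothetical and this is not the QAIRT source); it assumes that anything that is not already a list is wrapped in a single-element list.

```python
from typing import Any, List


def ensure_list_sketch(value: Any) -> List[Any]:
    """Return lists unchanged; wrap any other value in a one-element list."""
    if isinstance(value, list):
        return value
    return [value]


# A single path becomes a one-element list.
single = ensure_list_sketch("adapters/long.json")
# A list (paths, objects, or a mix) passes through untouched.
several = ensure_list_sketch(["adapters/long.json", "adapters/elementary.json"])

print(single)
print(several)
```

Under this assumption, passing a single path or object for `adapter_lora_config` would behave the same as passing a one-element list.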

- *class* qairt.modules.lora.lora\_config.AdapterParamsConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Configuration for individual adapter parameters.

- alpha*: int*

    - Alpha value for scaling.

- name*: str*

    - Name of the adapter.

- rank*: int*

    - Rank used in the adapter configuration.

- target\_modules*: List[str]*

    - List of source framework module names where the adapter is applied.

- *class* qairt.modules.lora.lora\_config.AdapterRunConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Defines the configuration parameters for executing a LoRA (Low-Rank Adaptation) model.

This configuration is used to control how the LoRA adapter is applied during model inference.

- adapter\_name*: str*

    - The name or identifier of the LoRA adapter to be used during execution.

- alpha*: float*  *= 1.0*

    - A scaling factor applied to the LoRA weights.

- *class* qairt.modules.lora.lora\_config.LoraBuilderInputConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Input configuration for LoRA, allowing either a path or an object.

- alpha\_tensor\_name*: str*

    - Name of the tensor where LoRA adapter scaling is applied.

- check\_exclusive\_inputs(*values*)

    - Validates that exactly one of lora\_config\_path or lora\_config\_obj is provided.

- Parameters

    - **values** (*dict*) – Dictionary of field values.

- Raises

    - **ValueError** – If both or neither of the fields are provided.

- Returns

    - Validated field values.

- Return type

    - dict

- create\_lora\_graph*: bool*  *= True*

    - Whether to create a max-rank concatenated LoRA graph.

- lora\_config\_obj*: Optional[[LoraConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.LoraConfig)]*  *= None*

    - LoRA configuration object.

- lora\_config\_path*: Optional[Union[str, PathLike]]*  *= None*

    - Path to the LoRA configuration file.

- quant\_updatable\_mode*: Literal['none', 'adapter\_only', 'all']*  *= 'adapter\_only'*

    - Mode for quant-updatable tensors.
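
The rule documented for `check_exclusive_inputs` (a `ValueError` when both or neither of `lora_config_path` and `lora_config_obj` are given) amounts to an exactly-one-of check. The following is a minimal sketch of that rule on a plain dictionary; the function name is hypothetical, and the assumption that "not provided" means `None` or absent is mine rather than something the reference states.

```python
from typing import Any, Dict


def check_exclusive_inputs_sketch(values: Dict[str, Any]) -> Dict[str, Any]:
    """Accept the values only if exactly one LoRA config source is set."""
    has_path = values.get("lora_config_path") is not None
    has_obj = values.get("lora_config_obj") is not None
    # Equal flags mean either both are set or neither is set.
    if has_path == has_obj:
        raise ValueError(
            "Provide exactly one of lora_config_path or lora_config_obj."
        )
    return values


ok = check_exclusive_inputs_sketch({"lora_config_path": "lora_config.yaml"})
print(ok)

try:
    check_exclusive_inputs_sketch({})  # neither source given
except ValueError as err:
    print(f"rejected: {err}")
```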

- *class* qairt.modules.lora.lora\_config.LoraBuilderOutputConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Defines the output configuration from the LoRA `build_lora_graph` process.

This configuration can be serialized into a YAML file
and passed to subsequent steps in the pipeline.

- base\_model\_artifacts*: Dict[str, Union[str, PathLike]]*

    - Dictionary containing paths to base model artifacts like ONNX, encodings and data files.

- lora\_tensor\_names*: Union[str, PathLike]*

    - Path or string reference to the tensor names used in the LoRA model.

- use\_case*: List[[UseCaseOutputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)]*

    - A list of use case configurations that describe how the LoRA model will be used. This is serialized to `lora_importer_config.yaml`.

- *class* qairt.modules.lora.lora\_config.LoraConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Top-level configuration for LoRA, including adapters and use cases.

- adapter*: List[[AdapterConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.AdapterConfig)]*

    - List of adapter configurations.

- attach\_point\_onnx\_mapping*: Union[str, PathLike]*

    - Path to ONNX mapping file.

- use\_cases*: List[[UseCaseInputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseInputConfig)]*  *= FieldInfo(annotation=NoneType, required=True, alias='use-case', alias\_priority=2)*

    - List of use case configurations (aliased as ‘use-case’).

- *class* qairt.modules.lora.lora\_config.UseCaseInputConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Configuration for a specific use case of the model.

- adapter\_alphas*: List[float]*

    - List of alpha values for each adapter.

- adapter\_names*: List[str]*

    - List of adapter names used in this use case.

- encodings*: Union[str, PathLike]*  *= FieldInfo(annotation=NoneType, required=True, alias='quant\_overrides', alias\_priority=2)*

    - Path to quantization overrides (aliased as ‘quant\_overrides’).

- model*: Union[str, PathLike]*  *= FieldInfo(annotation=NoneType, required=True, alias='model\_name', alias\_priority=2)*

    - Path or name of the model (aliased as ‘model\_name’).

- name*: str*

    - Name of the use case.

- quant\_updatable\_tensors*: Optional[Union[str, PathLike]]*  *= None*

    - Path to quant-updatable tensors file.

- *class* qairt.modules.lora.lora\_config.UseCaseOutputConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Configuration for the output of a specific use case after LoRA processing.

- encodings*: Optional[Union[str, PathLike]]*  *= FieldInfo(annotation=NoneType, required=False, default=None, alias='quant\_overrides', alias\_priority=2)*

    - Path to quantization overrides (mapped to ‘quant\_overrides’ when serialized).

- graph*: Optional[str]*  *= ''*

    - Name of the graph for the use case

- lora\_weights*: Union[str, PathLike]*  *= FieldInfo(annotation=NoneType, required=True, alias='weights', alias\_priority=2)*

    - Path to the LoRA weights file (in safetensors format).

- model*: Optional[Union[str, PathLike]]*  *= FieldInfo(annotation=NoneType, required=False, default=None, alias='model\_name', alias\_priority=2)*

    - Path or name of the model (mapped to ‘model\_name’ when serialized).

- name*: str*

    - Name of the use case.

- output\_path*: Optional[Union[str, PathLike]]*  *= None*

    - Path where the importer output should be saved.

- *class* qairt.modules.lora.lora\_config.UseCaseRunConfig(*\*args: Any*, *\*\*kwargs: Any*)

    - Bases: `AISWBaseModel`

Defines the configuration for a specific use case involving one or more LoRA adapters.

- adapters*: List[[AdapterRunConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.AdapterRunConfig)]*

    - A list of LoRA adapter configurations to be used in this use case. Each adapter is defined by its own `AdapterRunConfig`, specifying parameters such as adapter name and scaling factor.

- use\_case\_name*: str*

    - A unique identifier for the use case, representing a single adapter or a group of adapters.

- qairt.modules.lora.lora\_config.get\_adapter\_count\_by\_use\_case(*lora\_config: [LoraBuilderInputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.LoraBuilderInputConfig)*) → Dict[str, int]

    - Constructs a dictionary mapping each use case to the count of LoRA adapters it contains.

- Parameters

    - **lora\_config** ([*LoraBuilderInputConfig*](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.LoraBuilderInputConfig)) – The configuration object containing LoRA adapter information.

- Returns

    - A dictionary where keys are use case names and values are the count of LoRA adapters in each use case.

- Return type

    - Dict[str, int]
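
The mapping described above can be illustrated with plain dictionaries standing in for the use case entries. This sketch assumes the count for a use case is the number of entries in its `adapter_names` list; the helper name and the dictionary layout are hypothetical, and the real function takes a `LoraBuilderInputConfig`.

```python
from typing import Dict, List


def adapter_count_by_use_case_sketch(use_cases: List[dict]) -> Dict[str, int]:
    """Map each use case name to how many adapters it lists."""
    return {uc["name"]: len(uc["adapter_names"]) for uc in use_cases}


# Stand-ins for the use cases defined in the configuration examples.
use_cases = [
    {"name": "long", "adapter_names": ["long"]},
    {"name": "elementary", "adapter_names": ["elementary"]},
    {"name": "elementary+long", "adapter_names": ["elementary", "long"]},
]

counts = adapter_count_by_use_case_sketch(use_cases)
print(counts)
```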

- qairt.modules.lora.lora\_config.load\_use\_case\_config(*yaml\_path: Union[str, Path]*) → List[[UseCaseOutputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)]

    - Loads use case configuration from a specified YAML file.

- Parameters

    - **yaml\_path** (*Union* *[* *str* *,* *Path* *]*) – The path to the YAML configuration file.

- Returns

    - A list of use case configuration objects parsed from the YAML file.

- Return type

    - List[[UseCaseOutputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)]

- Raises

    - **FileNotFoundError** – If the YAML file is not found at the specified path.
    - **ValueError** – If the YAML content is invalid or missing the ‘use\_case’ key.

- qairt.modules.lora.lora\_config.serialize\_lora\_adapter\_weight\_config(*use\_cases: List[[UseCaseOutputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)]*, *yaml\_path: str*, *base\_dir: str*) → None

    - Serializes selected fields from UseCaseOutputConfig objects to a YAML file for the compile API.

Only the fields ‘name’, ‘graph’, ‘lora\_weights’ (as ‘weights’), and ‘encodings’ are serialized.
Relative paths are resolved using the provided base\_dir.

- Parameters

    - **use\_cases** (*List* *[*[*UseCaseOutputConfig*](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)*]*) – The configuration object containing use case data.
    - **yaml\_path** (*str*) – The file path where the YAML output should be saved.
    - **base\_dir** (*str*) – The base directory to resolve relative paths.

- Returns

    - None

- qairt.modules.lora.lora\_config.serialize\_lora\_importer\_config(*lora\_uc\_output\_config: List[[UseCaseOutputConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)]*, *yaml\_path: str*, *base\_dir: str*) → None

    - Serializes fields from List[UseCaseOutputConfig] to a YAML file.

Only the fields ‘name’, ‘model\_name’, ‘weights’, ‘quant\_overrides’, and ‘output\_path’
are serialized for each use case. Relative paths are resolved using the provided base\_dir.

- Parameters

    - **lora\_uc\_output\_config** (*List* *[*[*UseCaseOutputConfig*](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.UseCaseOutputConfig)*]*) – The configuration object containing use case data.
    - **yaml\_path** (*str*) – The file path where the YAML output should be saved.
    - **base\_dir** (*str*) – The base directory to resolve relative paths.

- Returns

    - None
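
Both serializers above are described as keeping only selected fields and resolving relative paths against `base_dir`. The sketch below shows that idea on a plain dictionary. It is not the QAIRT implementation: the helper names, the choice to skip unset fields, and the reading of "resolved" as "joined onto `base_dir`" are all assumptions made for illustration.

```python
import os
from typing import Any, Dict, List

# Serialized keys listed for serialize_lora_importer_config.
IMPORTER_FIELDS = ["name", "model_name", "weights", "quant_overrides", "output_path"]
# Assumption: these keys hold file-system paths.
PATH_FIELDS = {"model_name", "weights", "quant_overrides", "output_path"}


def resolve_path(value: str, base_dir: str) -> str:
    """Leave absolute paths alone; join relative paths onto base_dir."""
    if os.path.isabs(value):
        return value
    return os.path.normpath(os.path.join(base_dir, value))


def project_use_case(use_case: Dict[str, Any], fields: List[str], base_dir: str) -> Dict[str, Any]:
    """Keep only the requested fields, resolving path-like values."""
    entry: Dict[str, Any] = {}
    for field in fields:
        value = use_case.get(field)
        if value is None:
            continue  # assumption: unset fields are omitted
        entry[field] = resolve_path(value, base_dir) if field in PATH_FIELDS else value
    return entry


# Hypothetical use case record; "graph" is not in IMPORTER_FIELDS and is dropped.
use_case = {
    "name": "long",
    "model_name": "onnx/model.onnx",
    "weights": "lora/long.safetensors",
    "graph": "long_graph",
}
entry = project_use_case(use_case, IMPORTER_FIELDS, "exports")
print(entry)
```

The resulting dictionaries could then be written out with a YAML library of your choice.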

- qairt.modules.lora.lora\_config.serialize\_lora\_input\_config(*lora\_config: [LoraConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.LoraConfig)*, *base\_directory: Union[str, PathLike]*) → str

    - Serializes a LoraConfig object into a YAML file and saves adapter parameter configs as JSON.

- Parameters

    - **lora\_config** ([*LoraConfig*](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-gen-ai-modules-lora.html#qairt.modules.lora.lora_config.LoraConfig)) – The configuration object to serialize.
    - **base\_directory** (*Union* *[* *str* *,* *PathLike* *]*) – Directory where the files will be saved.

- Returns

    - Path to the generated YAML configuration file.

- Return type

    - str

## Configuration examples

### Define adapter parameters

Adapter parameters define the structure and target modules for a LoRA adapter:

    from qairt.modules.lora.lora_config import AdapterParamsConfig
    
    # Define adapter parameters
    adapter1_params = AdapterParamsConfig(
        name="long",
        rank=16,
        alpha=32,
        target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    )
    
    adapter2_params = AdapterParamsConfig(
        name="elementary",
        rank=8,
        alpha=16,
        target_modules=["q_proj", "v_proj"],
    )

**Parameters:**

- `name`: Unique identifier for the adapter
- `rank`: Rank of the low-rank matrices (lower rank means fewer parameters, faster inference)
- `alpha`: Scaling factor applied to the adapter weights
- `target_modules`: List of model layer names where the adapter should be applied
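
To make the effect of `rank` on adapter size concrete, the sketch below counts the extra parameters a LoRA adapter adds to one linear layer. It treats a target module as a plain `d_out x d_in` weight matrix with a low-rank pair of shapes `rank x d_in` and `d_out x rank`; the hidden size of 3072 is an illustrative assumption, not a value taken from this reference.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank matrices: (rank x d_in) + (d_out x rank)."""
    return rank * (d_in + d_out)


d_in = d_out = 3072  # hypothetical layer dimensions
full = d_in * d_out  # parameters in the full weight matrix

for rank in (8, 16):
    added = lora_param_count(d_in, d_out, rank)
    print(f"rank={rank}: {added} adapter parameters, {added / full:.2%} of the full matrix")
```

Doubling the rank doubles the adapter parameters for every target module, which is why the rank 8 adapter above is the lighter of the two.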

### Create adapter configurations

Adapter configurations wrap the adapter parameters and can reference multiple parameter sets:

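    from qairt.modules.lora.lora_config import AdapterConfig

    # Base directory of the exported model artifacts (reused in later examples)
    llama3_exports = "./llama_3.2_3b/model_exports"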
    
    # Create adapter configurations
    adapter1 = AdapterConfig(
        name="long",
        adapter_lora_config=[adapter1_params],
    )
    
    adapter2 = AdapterConfig(
        name="elementary",
        adapter_lora_config=[adapter2_params],
    )
    
    adapter3 = AdapterConfig(
        name="elementary+long",
        adapter_lora_config=[f"{llama3_exports}/onnx/elementary+long.json"],
    )

**Parameters:**

- `name`: Unique identifier for the adapter
- `adapter_lora_config`: List of LoRA configuration paths or AdapterParamsConfig objects. This parameter can accept:

    - AdapterParamsConfig objects (as shown in adapter1 and adapter2)
    - String paths to JSON files containing adapter parameters (as shown in adapter3)
    - A mix of both objects and paths

### Define use cases

Use cases define how adapters are combined and applied during inference:

    from qairt.modules.lora.lora_config import UseCaseInputConfig
    
    llama3_exports = "./llama_3.2_3b/model_exports"
    
    # Single adapter use case
    use_case1 = UseCaseInputConfig(
        name="long",
        adapter_names=["long"],
        model=f"{llama3_exports}/onnx/model.onnx",
        adapter_alphas=[1.0],
        encodings=f"{llama3_exports}/lora/adapters/long.encodings",
        quant_updatable_tensors=f"{llama3_exports}/onnx/long_updatable_tensors.txt",
    )
    
    # Single adapter use case
    use_case2 = UseCaseInputConfig(
        name="elementary",
        adapter_names=["elementary"],
        model=f"{llama3_exports}/onnx/model.onnx",
        adapter_alphas=[1.0],
        encodings=f"{llama3_exports}/lora/adapters/elementary.encodings",
        quant_updatable_tensors=f"{llama3_exports}/onnx/elementary_updatable_tensors.txt",
    )
    
    # Multi-adapter use case
    use_case3 = UseCaseInputConfig(
        name="elementary+long",
        adapter_names=["elementary", "long"],
        model=f"{llama3_exports}/onnx/model.onnx",
        adapter_alphas=[1.0, 0.8],
        encodings=f"{llama3_exports}/onnx/elementary+long.encodings",
        quant_updatable_tensors=f"{llama3_exports}/onnx/elementary+long_updatable_tensors.txt",
    )

**Parameters:**

- `name`: Unique identifier for the use case
- `adapter_names`: List of adapter names to use in this use case
- `model`: Path to the base ONNX model
- `adapter_alphas`: Scaling factors for each adapter (controls relative influence)
- `encodings`: Path to quantization encodings for the adapters
- `quant_updatable_tensors`: Path to quant-updatable tensors file.
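
Because `adapter_alphas` holds one scaling factor per entry in `adapter_names`, it can help to check a use case for consistency before building. The helper below is a hypothetical pre-flight check written for this example; it is not part of QAIRT, and QAIRT may or may not perform the same checks itself.

```python
from typing import List, Set


def check_use_case(
    name: str,
    adapter_names: List[str],
    adapter_alphas: List[float],
    defined_adapters: Set[str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if len(adapter_names) != len(adapter_alphas):
        problems.append(
            f"{name}: {len(adapter_names)} adapter name(s) but "
            f"{len(adapter_alphas)} alpha value(s)"
        )
    for adapter in adapter_names:
        if adapter not in defined_adapters:
            problems.append(f"{name}: adapter '{adapter}' is not defined")
    return problems


defined = {"long", "elementary"}
good = check_use_case("elementary+long", ["elementary", "long"], [1.0, 0.8], defined)
bad = check_use_case("broken", ["elementary", "short"], [1.0], defined)

print(good)
for problem in bad:
    print(problem)
```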

### Complete LoRA configuration

Combine all components into a complete LoRA configuration:

    from qairt.modules.lora.lora_config import LoraConfig
    
    # Create the complete LoRA configuration
    lora_config_obj = LoraConfig(
        adapter=[adapter1, adapter2, adapter3],
        attach_point_onnx_mapping=f"{llama3_exports}/lora/attach_point_onnx_mapping.json",
        use_cases=[use_case1, use_case2, use_case3],
    )

**Parameters:**

- `adapter`: List of all adapter configurations
- `attach_point_onnx_mapping`: Path to JSON file mapping PyTorch module names to ONNX node names
- `use_cases`: List of all use case configurations

### Create builder input configuration

Create a builder input configuration to use with the GenAI Builder:

    from qairt.modules.lora.lora_config import LoraBuilderInputConfig
    
    # Create input config with the programmatic configuration
    lora_input_config = LoraBuilderInputConfig(
        lora_config_obj=lora_config_obj,
        create_lora_graph=True,
        quant_updatable_mode="adapter_only",
        alpha_tensor_name="alpha",
    )
    
    # Or load from a YAML file
    lora_input_config_from_file = LoraBuilderInputConfig(
        lora_config_path="./llama_3.2_3b/lora/lora_config.yaml",
        create_lora_graph=True,
        quant_updatable_mode="adapter_only",
        alpha_tensor_name="alpha",
    )

**Parameters:**

- `lora_config_obj` or `lora_config_path`: Either a LoraConfig object or path to YAML config file
- `create_lora_graph`: Whether to create a max-rank concatenated LoRA graph
- `quant_updatable_mode`: Controls which quantization encodings can be updated:

    - `"none"`: No quantization encodings are updatable
    - `"adapter_only"`: Only the quantization encodings of the LoRA/adapter branch (Conv->Mul->Conv) change across use cases. The base branch quantization encodings remain the same.
    - `"all"`: All quantization encodings are updatable
- `alpha_tensor_name`: Name of the tensor where LoRA adapter scaling is applied
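
The three `quant_updatable_mode` values can be read as a filter over the tensors in the graph. The sketch below illustrates that reading only: the tensor names and the per-tensor "belongs to the adapter branch" flag are hypothetical, and the way QAIRT actually identifies adapter-branch tensors is not described in this reference.

```python
from typing import Dict, List


def updatable_tensors(mode: str, tensors: Dict[str, bool]) -> List[str]:
    """Select tensors whose encodings may change across use cases.

    `tensors` maps a tensor name to True when it belongs to the adapter branch.
    """
    if mode == "none":
        return []
    if mode == "adapter_only":
        return [name for name, in_adapter in tensors.items() if in_adapter]
    if mode == "all":
        return list(tensors)
    raise ValueError(f"unknown quant_updatable_mode: {mode!r}")


# Hypothetical tensor names for one attention projection.
tensors = {
    "q_proj.base.weight": False,
    "q_proj.lora_a.weight": True,
    "q_proj.lora_b.weight": True,
}

for mode in ("none", "adapter_only", "all"):
    print(mode, updatable_tensors(mode, tensors))
```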

### Configure runtime adapter(s)

Configure adapters at runtime for inference. First, obtain an executor from your built container:

    from qairt.gen_ai_api.executors.gen_ai_executor import GenAIExecutor, GenerationExecutionResult
    from qairt.modules.lora.lora_config import UseCaseRunConfig, AdapterRunConfig
    
    # Get an executor from the built LoRA container
    # (Assuming you have a built container and configured device)
    llm: GenAIExecutor = llama_lora_container.get_executor(device, clean_up=False)
    
    # Configure a single adapter
    use_case_config_1 = UseCaseRunConfig(
        use_case_name="long",
        adapters=[AdapterRunConfig(adapter_name="long", alpha=1.0)],
    )
    
    # Configure multiple adapters with custom alpha values
    use_case_config_2 = UseCaseRunConfig(
        use_case_name="elementary+long",
        adapters=[
            AdapterRunConfig(adapter_name="elementary", alpha=1.0),
            AdapterRunConfig(adapter_name="long", alpha=0.8),
        ],
    )
    
    # Generate text with the configured adapters
    prompt_1 = "Your first prompt here"
    result_1: GenerationExecutionResult = llm.generate(prompt_1, lora_config=use_case_config_1)
    
    prompt_2 = "Your second prompt here"
    result_2: GenerationExecutionResult = llm.generate(prompt_2, lora_config=use_case_config_2)
    
    # Access the generated text and metrics
    print(result_1.generated_text)
    print(result_1.metrics)
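
To build intuition for what per-adapter `alpha` values do when several adapters run together, the sketch below works through the arithmetic on toy matrices. It assumes the standard LoRA form (project down, scale, project up) and shows that stacking two adapters along the rank dimension with a per-rank alpha vector gives the same result as summing the individually scaled adapters. All matrices and helper names are made up for illustration, and the graph QAIRT generates may differ.

```python
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def adapter_delta(a: Matrix, b: Matrix, alpha: float, x: List[float]) -> List[float]:
    """Down-project with a, scale by alpha, up-project with b."""
    hidden = matvec(a, x)
    return matvec(b, [alpha * h for h in hidden])


# Toy adapters on a layer with 2 inputs and 2 outputs.
a1, b1 = [[1.0, 0.0]], [[0.5], [0.0]]                           # rank 1
a2, b2 = [[0.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 2.0]]     # rank 2
alpha1, alpha2 = 1.0, 0.8
x = [2.0, 3.0]

# Option 1: run each adapter separately and add the results.
separate = [
    d1 + d2
    for d1, d2 in zip(adapter_delta(a1, b1, alpha1, x), adapter_delta(a2, b2, alpha2, x))
]

# Option 2: concatenate along the rank dimension and scale per rank.
a_cat = a1 + a2                                   # rows stacked: 3 x 2
b_cat = [r1 + r2 for r1, r2 in zip(b1, b2)]       # columns joined: 2 x 3
alpha_vec = [alpha1] * len(a1) + [alpha2] * len(a2)
hidden = matvec(a_cat, x)
combined = matvec(b_cat, [al * h for al, h in zip(alpha_vec, hidden)])

print(separate)
print(combined)
```

In this picture, lowering an adapter's `alpha` (0.8 for the second adapter here) shrinks only that adapter's contribution to the layer output.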

## Next steps

- Tutorial: [Low-Rank Adaptation (LoRA) Tutorial](https://docs.qualcomm.com/doc/80-87189-2/topic/lora_tutorial.html#lora-tutorial) - Complete guide for building and deploying LoRA-enabled models on Snapdragon devices

Last Published: May 26, 2026