# LLM Pipeline

LLM-specific pipeline implementation, configuration, and context classes.

## LLMPipeline

- *class* qairt.experimental.pipeline.torch.llm.pipeline.LLMPipeline(*config: [LLMPipelineConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineConfig)*)

    - Bases: [`Pipeline`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-base.html#qairt.experimental.pipeline.torch.common.bases.pipeline.Pipeline)[[`LLMPipelineConfig`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineConfig)]

Pipeline for PyTorch LLMs.

- *property* config*: [LLMPipelineConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineConfig)*

    - Return the domain-specific pipeline configuration.

- evaluate(*\*\*kwargs*) → dict[str, float]

    - Run configured evaluation metrics on the most recent stage output.

- Returns

    - `{display_name: score}` e.g. `{"PPL_wikitext": 8.41}`.

- Raises

    - - **RuntimeError** – If `construct()` has not been called.
- **ValueError** – If no `evaluator_config` or metrics are configured.

- *classmethod* from\_pretrained(*model\_id\_or\_path: str*, *recipe: Optional[Union[str, Path, dict[str, Any]]] = None*, *\*\*kwargs: Any*) → [LLMPipeline](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipeline)

    - Create a pipeline from a recipe and execute the `model_loader` stage.

- Parameters

    - - **model\_id\_or\_path** – HuggingFace model ID or local path.
- **recipe** – Recipe file path or dict.
- **\*\*kwargs** – Forwarded to the base `from_pretrained` as config overrides.

- Returns

    - An `LLMPipeline` instance with the first stage executed.

- generate(*prompt: Union[str, list[dict[str, str]], Path]*, *device: Optional[Any] = None*, *\*\*kwargs*) → Any

    - Generate text using the last stage in the pipeline.

The last stage must have `can_generate = True` and implement `generate()`.
Users should be aware of the generation capabilities of the last stage and
provide appropriate kwargs (e.g. `max_length`, `num_beams`).

- Parameters

    - - **prompt** – Input prompt for generation. Can be a plain string, a list of
chat messages (dicts with “role” and “content” keys), or a Path to
a JSON file containing chat messages.
- **device** – Optional device specification (forwarded to the stage).
- **\*\*kwargs** – Additional generation parameters forwarded to the last stage.

- Returns

    - Generation result from the last stage.

- Raises

    - - **RuntimeError** – If `construct()` has not been called or the last stage
    has not been executed.
- **NotImplementedError** – If the last stage does not support generation.

## LLMPipelineConfig

- *class* qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineConfig(*\**, *model\_id\_or\_path: str*, *cache\_dir: str = './workspace'*, *backend: [BackendType](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-api-configs.html#qairt.api.configs.common.BackendType) = BackendType.HTP*, *soc\_details: Optional[Union[qti.aisw.tools.core.utilities.devices.api.device\_definitions.SocDetails, str]] = None*, *log\_level: Optional[str] = None*, *task: str = 'text-generation'*, *features: [PipelineFeatures](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.PipelineFeatures) = None*, *enable\_cache: bool = False*, *checkpoint: Optional[str] = None*, *enable\_observers: bool = False*, *observers: dict[str, Any] = None*, *stage\_info: OrderedDict[str, dict[str, Any]] = None*, *generator\_config: [qairt.experimental.pipeline.torch.common.configs.GeneratorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.GeneratorConfig) | None = None*, *evaluator\_config: [qairt.experimental.pipeline.torch.common.configs.EvaluatorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.EvaluatorConfig) | None = None*, *exporter\_config: [qairt.experimental.pipeline.torch.common.configs.ExporterConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.ExporterConfig) | None = None*)

    - Bases: [`PipelineConfig`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-base.html#qairt.experimental.pipeline.torch.common.bases.pipeline.PipelineConfig), [`LLMPipelineContext`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineContext)

Global pipeline configuration for LLM pipelines.

- add\_config(*stage\_name: str*, *config: [qairt.experimental.pipeline.torch.common.configs.ExporterConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.ExporterConfig) | [qairt.experimental.pipeline.torch.common.configs.GeneratorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.GeneratorConfig) | [qairt.experimental.pipeline.torch.common.configs.EvaluatorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.EvaluatorConfig)*) → None

    - Adds a config for a specific stage.

- Parameters

    - - **stage\_name** – Name of the stage to associate the config with.
- **config** – Must be one of ExporterConfig, GeneratorConfig or EvaluatorConfig.
Stage-specific configs override global configs for the same stage.

- add\_dataloader(*stage\_name: str*, *dataloader: DataLoader*) → None

    - Adds a data loader for a specific stage.

- Parameters

    - - **stage\_name** – Name of the stage to associate the dataloader with (e.g.,
“quantization” for calibration dataloader)
- **dataloader** – The DataLoader instance to add

- *field* evaluator\_config*: [EvaluatorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.EvaluatorConfig) | None*  *= None*

    - - Validated by

    - - `_check_unsupported_features`

- *field* exporter\_config*: [ExporterConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.ExporterConfig) | None*  *= None*

    - - Validated by

    - - `_check_unsupported_features`

- *classmethod* from\_recipe(*recipe: Union[str, Path, dict[str, Any]]*) → [LLMPipelineConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineConfig)

    - Load pipeline configuration from a YAML recipe file or dict.

- Parameters

    - **recipe** – Path to YAML recipe file, or a pre-loaded recipe dict.

- Returns

    - `LLMPipelineConfig` instance.

- *field* generator\_config*: [GeneratorConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.GeneratorConfig) | None*  *= None*

    - - Validated by

    - - `_check_unsupported_features`

- model\_computed\_fields*: ClassVar[dict[str, ComputedFieldInfo]]*  *= {}*

    - A dictionary of computed field names and their corresponding <cite>ComputedFieldInfo</cite> objects.

- model\_config*: ClassVar[ConfigDict]*  *= {'arbitrary\_types\_allowed': True, 'protected\_namespaces': ()}*

    - Configuration for the model, should be a dictionary conforming to [<cite>ConfigDict</cite>][pydantic.config.ConfigDict].

- model\_fields*: ClassVar[dict[str, FieldInfo]]*  *= {'backend': FieldInfo(annotation=BackendType, required=False, default=&lt;BackendType.HTP: 'HTP'&gt;), 'cache\_dir': FieldInfo(annotation=str, required=False, default='./workspace'), 'checkpoint': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'enable\_cache': FieldInfo(annotation=bool, required=False, default=False), 'enable\_observers': FieldInfo(annotation=bool, required=False, default=False), 'evaluator\_config': FieldInfo(annotation=Union[EvaluatorConfig, NoneType], required=False, default=None), 'exporter\_config': FieldInfo(annotation=Union[ExporterConfig, NoneType], required=False, default=None), 'features': FieldInfo(annotation=PipelineFeatures, required=False, default\_factory=PipelineFeatures), 'generator\_config': FieldInfo(annotation=Union[GeneratorConfig, NoneType], required=False, default=None), 'log\_level': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'model\_id\_or\_path': FieldInfo(annotation=str, required=True), 'observers': FieldInfo(annotation=dict[str, Any], required=False, default\_factory=dict), 'soc\_details': FieldInfo(annotation=Union[SocDetails, str, NoneType], required=False, default=None), 'stage\_info': FieldInfo(annotation=OrderedDict[str, dict[str, Any]], required=False, default\_factory=OrderedDict), 'task': FieldInfo(annotation=str, required=False, default='text-generation')}*

    - Metadata about the fields defined on the model,
mapping of field names to [<cite>FieldInfo</cite>][pydantic.fields.FieldInfo].

This replaces <cite>Model.__fields__</cite> from Pydantic V1.

- model\_post\_init(*context: Any*, */*) → None

    - We need to both initialize private attributes and call the user-defined model\_post\_init
method.

## LLMPipelineContext

- *class* qairt.experimental.pipeline.torch.llm.pipeline.LLMPipelineContext(*\**, *model\_id\_or\_path: str*, *cache\_dir: str = './workspace'*, *backend: [BackendType](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-api-configs.html#qairt.api.configs.common.BackendType) = BackendType.HTP*, *soc\_details: Optional[Union[qti.aisw.tools.core.utilities.devices.api.device\_definitions.SocDetails, str]] = None*, *log\_level: Optional[str] = None*, *task: str = 'text-generation'*, *features: [PipelineFeatures](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.PipelineFeatures) = None*)

    - Bases: [`PipelineContext`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-base.html#qairt.experimental.pipeline.torch.common.bases.pipeline_context.PipelineContext)

LLM-specific stage-visible pipeline context.

Extends `PipelineContext` with fields specific to large language models.
Injected into `StageConfig._pipeline_context` for all LLM stages.

- *field* features*: [PipelineFeatures](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-llm.html#qairt.experimental.pipeline.torch.llm.pipeline.PipelineFeatures)*  *[Optional]*

    - 

- model\_computed\_fields*: ClassVar[dict[str, ComputedFieldInfo]]*  *= {}*

    - A dictionary of computed field names and their corresponding <cite>ComputedFieldInfo</cite> objects.

- model\_config*: ClassVar[ConfigDict]*  *= {'arbitrary\_types\_allowed': True, 'protected\_namespaces': ()}*

    - Configuration for the model, should be a dictionary conforming to [<cite>ConfigDict</cite>][pydantic.config.ConfigDict].

- model\_fields*: ClassVar[dict[str, FieldInfo]]*  *= {'backend': FieldInfo(annotation=BackendType, required=False, default=&lt;BackendType.HTP: 'HTP'&gt;), 'cache\_dir': FieldInfo(annotation=str, required=False, default='./workspace'), 'features': FieldInfo(annotation=PipelineFeatures, required=False, default\_factory=PipelineFeatures), 'log\_level': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'model\_id\_or\_path': FieldInfo(annotation=str, required=True), 'soc\_details': FieldInfo(annotation=Union[SocDetails, str, NoneType], required=False, default=None), 'task': FieldInfo(annotation=str, required=False, default='text-generation')}*

    - Metadata about the fields defined on the model,
mapping of field names to [<cite>FieldInfo</cite>][pydantic.fields.FieldInfo].

This replaces <cite>Model.__fields__</cite> from Pydantic V1.

- model\_post\_init(*context: Any*, */*) → None

    - We need to both initialize private attributes and call the user-defined model\_post\_init
method.

- *field* task*: str*  *= 'text-generation'*

    -

## PipelineFeatures

- *class* qairt.experimental.pipeline.torch.llm.pipeline.PipelineFeatures(*\**, *speculative\_decoding: Tuple[bool, str | None] = (False, None)*, *lora: Optional[[LoRAFeatureConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.LoRAFeatureConfig)] = None*)

    - Bases: `BaseModel`

Cross-cutting features that can be enabled/disabled for the entire pipeline.
These features can affect multiple stages. If set:

- stages will check to ensure feature related configs are provided.
- stages will enable feature-specific logic during execution.

**Field declaration order is load-bearing**: GeneratorFactory composes
generator mixins in the order fields are declared here.  The first-declared
field gets the highest MRO position and therefore runs first in forward().
Do not reorder fields without updating the MRO contract test.

- feature\_keys() → list[str]

    - Derive all active feature keys in field declaration order.

Iterates over the model fields in the order they are declared and
collects a key for every enabled field.  For tuple-valued fields (e.g.
`speculative_decoding = (True, "eaglet")`) the key is
`"{field_name}_{method}"`; plain `bool` or `Optional` fields use
the field name directly.

**The returned order is load-bearing**: it determines mixin MRO
position inside `GeneratorFactory._compose()`.  The first key in
the list produces the highest mixin in the MRO (called first in
`forward()`).  Reordering fields in `PipelineFeatures` therefore
changes execution order — do not do so without updating the MRO
contract test.

- Returns

    - Ordered list of active feature key strings, in field declaration
order.  Empty when no feature is enabled.

- *field* lora*: Optional[[LoRAFeatureConfig](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-pipeline-common.html#qairt.experimental.pipeline.torch.common.configs.LoRAFeatureConfig)]*  *= None*

    - 

- model\_computed\_fields*: ClassVar[dict[str, ComputedFieldInfo]]*  *= {}*

    - A dictionary of computed field names and their corresponding <cite>ComputedFieldInfo</cite> objects.

- model\_config*: ClassVar[ConfigDict]*  *= {}*

    - Configuration for the model, should be a dictionary conforming to [<cite>ConfigDict</cite>][pydantic.config.ConfigDict].

- model\_fields*: ClassVar[dict[str, FieldInfo]]*  *= {'lora': FieldInfo(annotation=Union[LoRAFeatureConfig, NoneType], required=False, default=None), 'speculative\_decoding': FieldInfo(annotation=Tuple[bool, Union[str, NoneType]], required=False, default=(False, None))}*

    - Metadata about the fields defined on the model,
mapping of field names to [<cite>FieldInfo</cite>][pydantic.fields.FieldInfo].

This replaces <cite>Model.__fields__</cite> from Pydantic V1.

- *field* speculative\_decoding*: Tuple[bool, str | None]*  *= (False, None)*

    -

Last Published: Jun 19, 2026

[Previous Topic
RunResult.output](https://docs.qualcomm.com/bundle/publicresource/80-87189-2/topics/qairt-pipeline-base.md)