# Guides

## Pipeline (Experimental)

The Pipeline API provides a declarative, stage-based orchestration system that replaces
manual notebook workflows for model loading, quantization, and compilation. Define your
entire workflow in a single YAML recipe and run it with a few lines of Python.

- [Pipeline Overview](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_overview.html)
- [Getting Started with Pipeline](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_getting_started.html)
- [Pipeline Configuration](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_configuration.html)
- [Quantization Recipes](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_quantization_recipes.html)
- [Customizing the Pipeline](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_customization.html)
- [Advanced Usage](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_expert_usage.html)
- [Migration Guide: Notebook → Pipeline](https://docs.qualcomm.com/doc/80-87189-2/topic/pipeline_migration.html)
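
As a rough illustration of what "declarative, stage-based" means here, the sketch below runs a recipe through an ordered list of named stages. It is not the Pipeline API: the recipe keys, stage names, and the `run_recipe` helper are all hypothetical, and the recipe is written as a Python dict standing in for what would normally be a YAML file. See the linked guides for the real schema and entry points.

```python
from typing import Any, Callable, Dict

# Hypothetical recipe. In practice this would be parsed from a YAML file;
# none of these keys come from the real Pipeline recipe schema.
RECIPE: Dict[str, Any] = {
    "stages": [
        {"name": "load", "params": {"model_id": "example-model"}},
        {"name": "quantize", "params": {"bits": 4}},
        {"name": "compile", "params": {"backend": "htp"}},
    ]
}

# A stage takes the state so far plus its own params and returns new state.
Stage = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


def load(state: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder: records the model identifier instead of loading weights.
    return {**state, "model": params["model_id"]}


def quantize(state: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder: records the target precision instead of quantizing.
    return {**state, "precision": f"int{params['bits']}"}


def compile_model(state: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder: builds an artifact name instead of compiling anything.
    name = f"{state['model']}-{state['precision']}-{params['backend']}"
    return {**state, "artifact": name}


# Registry mapping the stage names used in the recipe to implementations.
STAGES: Dict[str, Stage] = {
    "load": load,
    "quantize": quantize,
    "compile": compile_model,
}


def run_recipe(recipe: Dict[str, Any]) -> Dict[str, Any]:
    """Run each stage in recipe order, threading state from one to the next."""
    state: Dict[str, Any] = {}
    for stage in recipe["stages"]:
        state = STAGES[stage["name"]](state, stage["params"])
    return state


result = run_recipe(RECIPE)
print(result["artifact"])
```

The point of the pattern is that the workflow lives in data rather than in notebook cells, so changing the quantization settings or swapping a stage means editing the recipe, not the driver code.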

## Gen AI Builder

The Gen AI Builder is a Python API that automates step 2 of the typical LLM deployment workflow:
it takes a quantized ONNX model and compiles it into a `GenAIContainer` ready for on-device
inference. The guides below cover configuration options, advanced features, and migration from
notebook-based workflows.

- [Gen AI Builder Overview](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_overview.html)
- [Configuring the Gen AI Builder](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_builder_configuration.html)
- [HTP Backend Extensions](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_backend_extensions.html)
- [Advanced Features](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_advanced_features.html)
- [Migrating from Notebook Workflows](https://docs.qualcomm.com/doc/80-87189-2/topic/genai_migration.html)

## ONNX Optimizer

The ONNX Optimizer transforms and optimizes ONNX models, especially LLMs,
before they are passed to [`qairt.convert()`](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-core-api.html#qairt.convert). Start with the overview to
learn what the optimizer does, when to use each transformation, and how it
fits into an end-to-end flow. Then work through the examples, which cover
common workflows and include a custom-pass walkthrough.

- [ONNX Optimizer Overview](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-optimizer-overview.html)
- [ONNX Optimizer Examples](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-optimizer-examples.html)

## Utilities

- [Resource Profiler](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-resource-profiler.html)
- [Logging Configuration](https://docs.qualcomm.com/doc/80-87189-2/topic/qairt-logging-utility.html)

Last Published: Jun 19, 2026
