# Dialog Pause/Resume

The **Pause/Resume** feature allows an in-progress dialog query to be paused
from another thread and later resumed from the exact point at which it was
paused, preserving all intermediate state (KV cache, sampler state, partial
response). This is useful when the application needs to temporarily free the
accelerator for a higher-priority task, yield to the user, or coordinate
between multiple active dialogs without discarding work already done.

Pausing is signaled through the existing
[GenieDialog\_signal](https://docs.qualcomm.com/doc/80-63442-10/topic/function_GenieDialog_8h_1a24737fa908e20c0877f972d39f67e48c.html#exhale-function-geniedialog-8h-1a24737fa908e20c0877f972d39f67e48c)
API using the `GENIE_DIALOG_ACTION_PAUSE` action. Resumption is requested
through the regular query APIs by passing the `GENIE_DIALOG_SENTENCE_RESUME`
sentence code with a `NULL`/empty input.

## Pausing an active query

To pause an active query, call `GenieDialog_signal` with
`GENIE_DIALOG_ACTION_PAUSE` from a thread other than the one executing the
query:

    typedef enum {
      /// Signals abort as an action to active dialog.
      GENIE_DIALOG_ACTION_ABORT = 0x01,
      /// Signals to pause an active query
      GENIE_DIALOG_ACTION_PAUSE = 0x02
    } GenieDialog_Action_t;

Once the pause is honored, the in-flight query call returns with the warning
status code:

    #define GENIE_STATUS_WARNING_PAUSED  3

The caller should treat `GENIE_STATUS_WARNING_PAUSED` as a non-fatal result
and preserve the dialog handle for the subsequent resume.
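
A minimal sketch of how a caller might branch on the query result. The constants below are local stand-ins so the snippet compiles without the SDK headers: the value 3 comes from the definition above, while the success value of 0, the `QueryOutcome` enum, and the `classify` helper are assumptions made for illustration and are not part of the Genie API.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins; a real application would include the Genie headers instead.
using StatusCode = std::int32_t;
constexpr StatusCode kSuccess = 0;        // assumed value for a successful query
constexpr StatusCode kWarningPaused = 3;  // GENIE_STATUS_WARNING_PAUSED, as defined above

enum class QueryOutcome { Finished, Paused, Failed };

// Hypothetical helper: maps a query return code to the caller's next step.
QueryOutcome classify(StatusCode status) {
  if (status == kSuccess) return QueryOutcome::Finished;
  if (status == kWarningPaused) return QueryOutcome::Paused;  // keep the handle, resume later
  return QueryOutcome::Failed;  // this sketch treats every other code as a failure
}
```

A caller that receives `QueryOutcome::Paused` would keep the dialog handle alive and skip any error handling or teardown it reserves for failures.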

## Resuming a paused query

Resumption is requested by invoking the same query API used for the original
call, but with the `GENIE_DIALOG_SENTENCE_RESUME` sentence code and no new
input:

    typedef enum {
      /// The string is the entire query/response.
      GENIE_DIALOG_SENTENCE_COMPLETE = 0,
      /// ...
      /// Rewind the KV cache as per prefix query match before processing the query
      GENIE_DIALOG_SENTENCE_REWIND = 5,
      /// A paused query has resumed.
      GENIE_DIALOG_SENTENCE_RESUME = 6,
    } GenieDialog_SentenceCode_t;

For `GenieDialog_query`, the `queryStr` argument must be `NULL` (or an
empty string). Supplying any other value with `GENIE_DIALOG_SENTENCE_RESUME`
results in `GENIE_STATUS_ERROR_INVALID_ARGUMENT`. The same rule applies to
the batch, token, and embedding query variants — the respective input
buffer/count must be empty.
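
The input rule for resume can be checked on the application side before calling into the API. The helper names below are hypothetical and only mirror the rule stated above; the authoritative validation is whatever the query call itself performs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical pre-check for the string query variant:
// a resume request must carry no new text (NULL or "").
bool isValidResumeString(const char* queryStr) {
  return queryStr == nullptr || queryStr[0] == '\0';
}

// Hypothetical pre-check for the count-based variants:
// a resume request must carry zero input elements.
bool isValidResumeCount(std::size_t inputCount) {
  return inputCount == 0;
}
```

Failing such a check early lets the application report its own error instead of waiting for `GENIE_STATUS_ERROR_INVALID_ARGUMENT` from the query call.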

When the dialog resumes, the response callback will be invoked with the
`GENIE_DIALOG_SENTENCE_RESUME` sentence code before token generation
continues, so clients can distinguish a resumed stream from a fresh one.

## C++ example

The minimal flow is: issue a query on a worker thread, signal `PAUSE` from
a controller thread, then resume with `GENIE_DIALOG_SENTENCE_RESUME` once
the worker observes `GENIE_STATUS_WARNING_PAUSED`.

    // Worker thread: runs the query
    Genie_Status_t status = GenieDialog_query(
        dialogHandle,
        "Tell me about Qualcomm.",
        GENIE_DIALOG_SENTENCE_COMPLETE,
        queryCallback,
        /*userData=*/nullptr);

    if (status == GENIE_STATUS_WARNING_PAUSED) {
      // Dialog paused mid-flight. Handle is still valid; state preserved.
    }

    // Controller thread: pauses the in-flight query
    GenieDialog_signal(dialogHandle, GENIE_DIALOG_ACTION_PAUSE);

    // Later, to resume: queryStr must be NULL, sentence code = RESUME
    GenieDialog_query(dialogHandle,
                      /*queryStr=*/nullptr,
                      GENIE_DIALOG_SENTENCE_RESUME,
                      queryCallback,
                      /*userData=*/nullptr);
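
To make the threading explicit, the following sketch replays the same flow against a mock. `MockDialog`, `MockStatus`, and `pauseThenResume` are hypothetical stand-ins, not Genie SDK types. The mock only imitates the contract described here: a pause signaled from another thread makes the in-flight call return a paused status, and a later resume continues from the same position.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

enum class MockStatus { Success, Paused };

// Stand-in for a dialog. This is NOT the Genie SDK.
class MockDialog {
 public:
  explicit MockDialog(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)) {}

  // Fresh query: restart from the first token.
  MockStatus query(std::string& out) {
    next_.store(0);
    pauseRequested_.store(false);
    return generate(out);
  }

  // Resume: continue from wherever the previous call stopped.
  MockStatus resume(std::string& out) { return generate(out); }

  // Safe to call from another thread while query()/resume() is running.
  void signalPause() { pauseRequested_.store(true); }

  std::size_t progress() const { return next_.load(); }

 private:
  MockStatus generate(std::string& out) {
    while (next_.load() < tokens_.size()) {
      // Consume the pause request so a later resume is not paused again.
      if (pauseRequested_.exchange(false)) return MockStatus::Paused;
      out += tokens_[next_.load()];
      next_.fetch_add(1);
      std::this_thread::sleep_for(std::chrono::milliseconds(1));  // simulated decode time
    }
    return MockStatus::Success;
  }

  const std::vector<std::string> tokens_;
  std::atomic<std::size_t> next_{0};
  std::atomic<bool> pauseRequested_{false};
};

struct FlowResult {
  MockStatus first;   // status of the original query
  MockStatus second;  // status of the resume (or of the query if no pause landed)
  std::string text;   // accumulated response
};

// Runs the query on a worker thread, pauses it from the calling thread,
// then resumes. Requires a dialog with at least two tokens.
FlowResult pauseThenResume(MockDialog& dialog) {
  FlowResult r{MockStatus::Success, MockStatus::Success, {}};
  std::thread worker([&] { r.first = dialog.query(r.text); });
  while (dialog.progress() < 2) std::this_thread::yield();  // let some output through
  dialog.signalPause();
  worker.join();
  if (r.first == MockStatus::Paused) r.second = dialog.resume(r.text);
  return r;
}
```

The pause lands at a timing-dependent position, but the accumulated text after the resume is the same as for an uninterrupted run, which is the property the feature is meant to provide.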

The response callback observes `GENIE_DIALOG_SENTENCE_RESUME` once on the
resumption boundary, followed by `GENIE_DIALOG_SENTENCE_CONTINUE` /
`GENIE_DIALOG_SENTENCE_END` as usual:

    void queryCallback(const char* response,
                       GenieDialog_SentenceCode_t sentenceCode,
                       const void* /*userData*/) {
      switch (sentenceCode) {
        case GENIE_DIALOG_SENTENCE_BEGIN:    /* stream start */        break;
        case GENIE_DIALOG_SENTENCE_RESUME:   /* resumed after pause */ break;
        case GENIE_DIALOG_SENTENCE_CONTINUE: /* mid-stream token */    break;
        case GENIE_DIALOG_SENTENCE_END:      /* stream complete */     break;
        default: break;
      }
      if (response) { /* print or collect response */ }
    }
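
One way a client might use these codes is to keep accumulating text across a pause. The sketch below uses a local `SentenceCode` enum as a stand-in (its numeric values are not the SDK's) and a hypothetical `StreamCollector`. It also assumes the callback's `response` may be null on the resume boundary, which the source does not specify.

```cpp
#include <cassert>
#include <string>

// Local stand-in for the sentence codes handled above; values are NOT the SDK's.
enum class SentenceCode { Begin, Continue, End, Resume };

// Hypothetical collector: gathers one response across any number of pauses.
struct StreamCollector {
  std::string text;
  int resumeCount = 0;
  bool finished = false;

  void onChunk(const char* response, SentenceCode code) {
    if (code == SentenceCode::Begin) {  // fresh stream: drop earlier state
      text.clear();
      resumeCount = 0;
      finished = false;
    }
    if (code == SentenceCode::Resume) ++resumeCount;  // keep the partial text
    if (response != nullptr) text += response;
    if (code == SentenceCode::End) finished = true;
  }
};
```

A real callback would forward its `response` and `sentenceCode` arguments to such a collector; only a begin code resets the text, so a resumed stream appends to what was produced before the pause.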

**Note:** Pause/Resume is supported for basic and speculative (SSD) dialogs.
It is **not** supported together with multi-batch processing: a query issued
with a batch size greater than one cannot be paused.

## Tutorial on how to use Pause in genie-t2t-run

**Note:** For a reference implementation of the Pause/Resume flow, see the
`genie-t2t-run` source in `${QNN_SDK_ROOT}/examples/Genie/genie-t2t-run`.

`genie-t2t-run` exposes pause via the same `--action` interface used for
abort. The signal thread sleeps for the duration given by `--sleep`
(milliseconds) before issuing the action against the active query, allowing
the pause to be timed relative to query progress.

For example:

    ./genie-t2t-run -c llama2-7b-htp.json \
                    -p "Tell me about Qualcomm." \
                    --action PAUSE --sleep 3000

On success, the tool prints `Query successfully paused` once the in-flight
query returns with `GENIE_STATUS_WARNING_PAUSED`, and the response
callback emits a `[RESUME]` marker when the paused stream is resumed.
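
The sleep-then-signal behavior can be approximated in an application with a small helper thread. This is a sketch under assumptions, not the tool's actual source: `signalAfter` is a hypothetical name, and in a real application the action would be a call such as `GenieDialog_signal(dialogHandle, GENIE_DIALOG_ACTION_PAUSE)`.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: starts a thread that waits for `delay` and then runs
// `action` once. The caller owns the returned thread and must join it.
std::thread signalAfter(std::chrono::milliseconds delay,
                        std::function<void()> action) {
  return std::thread([delay, action = std::move(action)] {
    std::this_thread::sleep_for(delay);  // time the signal relative to query start
    action();                            // e.g. issue the pause signal
  });
}
```

Starting the helper just before the query call and joining it after the query returns keeps the signal on a separate thread from the one executing the query.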

Last Published: Jun 04, 2026
