Use Cases

DriftQ is most valuable when you have slow, flaky, or expensive work (LLMs, tool calls, ingestion) and you need it to be durable, retryable, and debuggable — without turning application code into a fragile workflow engine.

Use DriftQ when…
  • work can’t fit safely inside an HTTP request
  • retries would cause duplicate side-effects
  • you need ownership/leases and explicit ack/nack
  • you want replay, auditability, and DLQ
Probably overkill when…
  • tasks are sub-second and synchronous is fine
  • duplicates or loss are acceptable
  • you don’t need retries, DLQ, or replay
One-line pitch
DriftQ turns unreliable “background work” into a consistent workflow system: idempotency-safe, retryable, observable, and resumable.

One common application pattern

A typical setup looks like this: an API layer stays the front door (auth, validation, quotas). DriftQ becomes the work backbone. Workers execute long-running or failure-prone jobs. The frontend subscribes to progress and results (SSE, WebSocket, polling — your call).

Typical flow
  1. Client triggers work via an API request
  2. The API validates and enqueues a message into DriftQ
  3. Workers consume and execute the pipeline
  4. Workers ack success / nack failure with retries and DLQ
  5. The UI reconnects and resumes using the same run ID (steps 1–2 are sketched below)
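
The enqueue side of steps 1–2 might look like this minimal sketch. The driftq client, its connection URL, and the produce() call are assumptions for illustration, not DriftQ’s documented API; the FastAPI parts are standard.

    # Steps 1-2: validate, enqueue a durable message, return immediately.
    # The driftq client shown here is hypothetical.
    import uuid

    from fastapi import FastAPI
    from pydantic import BaseModel

    import driftq  # hypothetical client library

    app = FastAPI()
    queue = driftq.Client("driftq://localhost:9400")  # illustrative URL

    class RunRequest(BaseModel):
        tenant_id: str
        user_id: str
        prompt: str

    @app.post("/runs")
    async def start_run(req: RunRequest):
        run_id = str(uuid.uuid4())
        await queue.produce(  # assumed producer API
            topic="llm.runs",
            payload={"run_id": run_id, **req.model_dump()},
        )
        # The client later subscribes or polls with this run_id (step 5).
        return {"run_id": run_id, "status": "queued"}
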
The key discipline
Define your envelope (run_id, tenant_id, user_id), choose stable idempotency keys, and decide, for each step, exactly what an ack commits ("ack means done"). That’s what prevents expensive duplicates.
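
For example, the envelope can be a small, explicit schema. Only run_id, tenant_id, and user_id come from the text above; the remaining fields are illustrative conventions, not a DriftQ requirement.

    # A minimal envelope sketch. Fields beyond run_id/tenant_id/user_id
    # are illustrative conventions.
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass(frozen=True)
    class Envelope:
        run_id: str           # correlates steps, retries, and UI subscriptions
        tenant_id: str        # isolation and quotas
        user_id: str          # attribution and audit
        step: str             # which pipeline step this message drives
        idempotency_key: str  # stable across retries; committed on ack
        enqueued_at: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat()
        )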

Each scenario below follows the same structure: what hurts, how DriftQ helps, and what you gain. This is the stuff teams hit daily in real production apps.

Async LLM runs (no more request timeouts)

LLM · FastAPI · Workers
Problem
LLM + retrieval + tool calls can take 5–60 seconds. Doing this inside a FastAPI request/response causes timeouts, client retries, and user refreshes that trigger the same run multiple times.
How DriftQ helps
FastAPI enqueues a durable “run” message into DriftQ and returns immediately. Workers consume and execute the pipeline with ownership + retries + idempotency.
Outcome
Fast, predictable APIs. Durable runs. Less duplicated cost. Workers scale independently from the web tier.
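
On the worker side, the consume/ack/nack surface sketched here is an assumption about the client, as is the TransientError split; adapt both to the real consumer API and your own error taxonomy.

    # Steps 3-4: consume, execute, ack on success, nack on failure.
    import driftq  # hypothetical client library

    queue = driftq.Client("driftq://localhost:9400")

    class TransientError(Exception):
        """Provider hiccups, timeouts: worth retrying."""

    def run_pipeline(payload: dict) -> dict:
        ...  # retrieval + LLM + tool calls; may take 5-60 seconds

    def save_result(run_id: str, result: dict) -> None:
        ...  # persist to your store so "ack" really means done

    for msg in queue.consume(topic="llm.runs", group="llm-workers"):
        try:
            result = run_pipeline(msg.payload)
            save_result(msg.payload["run_id"], result)
            msg.ack()                    # commit: the run is done
        except TransientError:
            msg.nack(requeue=True)       # retry with backoff
        except Exception:
            msg.nack(requeue=False)      # poison message -> DLQ
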

Streaming tokens + reconnect support

Streaming · SSE/WebSocket · Resumable UX
Problem
Streaming directly from an HTTP request breaks when the client disconnects (mobile, tab switch, refresh). Users reconnect and accidentally restart expensive work.
How DriftQ helps
DriftQ coordinates the run lifecycle. The UI subscribes to run updates keyed by run_id. Disconnects don’t kill the run; the user reconnects and continues.
Outcome
Streaming UX that survives disconnects. Fewer “it restarted” bugs. Less wasted spend.
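
One way to wire the subscription side is server-sent events keyed by run_id. The SSE plumbing below is standard FastAPI; queue.subscribe() (assumed here to yield progress events as dicts, resuming from the last delivered offset on reconnect) is a hypothetical DriftQ surface.

    # SSE endpoint keyed by run_id. Reconnecting clients hit the same
    # URL and resume; the run itself is never tied to this connection.
    import json

    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse

    import driftq  # hypothetical client library

    app = FastAPI()
    queue = driftq.Client("driftq://localhost:9400")

    @app.get("/runs/{run_id}/events")
    async def run_events(run_id: str):
        async def event_stream():
            async for event in queue.subscribe(run_id=run_id):  # assumed API
                yield f"data: {json.dumps(event)}\n\n"
        return StreamingResponse(event_stream(), media_type="text/event-stream")
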

Idempotency for expensive LLM calls

Idempotency · Cost control
Problem
Double-clicks, retries, and deploy restarts cause the same prompt to be executed twice, wasting money and possibly producing inconsistent state.
How DriftQ helps
Use a stable idempotency key per run (e.g., hash of user_id + conversation_id + prompt + params). DriftQ reserves/leases ownership for that key and commits it on ack.
Outcome
Retries become safe. Costs drop. “Why did it run twice?” stops happening.
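
A direct sketch of that key derivation; everything here is standard-library Python, with canonical JSON so equal inputs always hash to the same key.

    # Stable idempotency key: identical inputs -> identical key, so a
    # double-click, client retry, or deploy restart maps onto one run.
    import hashlib
    import json

    def idempotency_key(user_id: str, conversation_id: str,
                        prompt: str, params: dict) -> str:
        material = json.dumps(
            {"user_id": user_id, "conversation_id": conversation_id,
             "prompt": prompt, "params": params},
            sort_keys=True,  # canonical ordering: equal dicts, equal hashes
        )
        return hashlib.sha256(material.encode("utf-8")).hexdigest()
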

Tool-calling agents without duplicate side-effects

Agents · Tool calls · Correctness
Problem
LangChain-style agents call tools: send emails, create tickets, charge cards, provision resources. If a run retries, you can easily execute side-effects twice.
How DriftQ helps
Model each tool step as its own message with its own idempotency key. Ack commits the effect. Nack retries with backoff. DLQ quarantines poison tasks.
Outcome
Exactly-once effects (in practice). Fewer production incidents and less “agent did it twice” embarrassment.
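
A sketch of the per-step modeling. The produce() call and its idempotency_key parameter are assumptions about the client; the key derivation itself is plain hashing.

    # Each side-effecting tool step becomes its own message with its own
    # key, so a retried run cannot re-fire a step that already committed.
    import hashlib
    import json

    def step_key(run_id: str, step: str, args: dict) -> str:
        fingerprint = json.dumps(args, sort_keys=True)
        return hashlib.sha256(f"{run_id}:{step}:{fingerprint}".encode()).hexdigest()

    async def enqueue_tool_steps(queue, run_id: str, steps: list[tuple[str, dict]]):
        for step_name, args in steps:
            await queue.produce(  # assumed producer API
                topic=f"tools.{step_name}",
                payload={"run_id": run_id, "args": args},
                idempotency_key=step_key(run_id, step_name, args),
            )
    # Workers ack to commit the effect, nack to retry with backoff;
    # repeated failures land the message in the DLQ.
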

Backpressure under provider rate limits

Backpressure · 429 · Reliability
Problem
When OpenAI/Anthropic rate-limit you (or you spike traffic), synchronous handling causes cascading failures: timeouts → retries → more load → outage.
How DriftQ helps
DriftQ absorbs bursts and can reject/slow producers under overload. Workers enforce concurrency caps and retry with backoff instead of stampeding.
Outcome
Graceful degradation. More “slower but alive” and less “everything down”. Incidents get easier to manage.
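
The worker-side discipline can be as simple as a semaphore plus exponential backoff; everything below is standard asyncio, with RateLimitError standing in for whatever 429 exception your provider SDK raises.

    # Cap in-flight provider calls and back off on 429 instead of
    # retrying immediately and amplifying the spike.
    import asyncio
    import random

    class RateLimitError(Exception):
        """Stand-in for your provider SDK's 429 exception."""

    MAX_CONCURRENT = 8  # tune to your provider quota
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def call_with_backoff(fn, *args, attempts: int = 5):
        for attempt in range(attempts):
            async with sem:  # concurrency cap
                try:
                    return await fn(*args)
                except RateLimitError:
                    pass     # release the slot, then wait
            # Exponential backoff with jitter: ~1s, 2s, 4s, ...
            await asyncio.sleep(2 ** attempt + random.random())
        raise RateLimitError("retries exhausted; nack -> requeue or DLQ")
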

Document ingestion & embeddings (RAG pipelines)

RAG · Embeddings · Ingestion
Problem
Ingestion is multi-step (fetch → parse → chunk → embed → index). Failures create partial indexes, repeated embeddings, and messy manual recovery.
How DriftQ helps
Represent ingestion as step messages with per-step idempotency. DLQ bad documents without blocking the rest. Replay a run when needed.
Outcome
Predictable ingestion. Less wasted embedding spend. Easy recovery and debugging.
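
A sketch of the per-step shape. Topic names, payload fields, and the produce/ack calls are illustrative; the pattern is that each stage does one thing, produces the next stage’s message, and only then acks.

    # Each stage consumes its topic and hands off to the next one; the
    # per-step idempotency key means re-runs skip work that already landed.
    STAGES = ["fetch", "parse", "chunk", "embed", "index"]

    async def run_stage(stage: str, payload: dict) -> dict:
        ...  # your fetch/parse/chunk/embed/index logic

    async def handle_stage(queue, msg, stage: str):
        doc_id = msg.payload["doc_id"]
        output = await run_stage(stage, msg.payload)
        if stage != "index":  # the last stage has no successor
            nxt = STAGES[STAGES.index(stage) + 1]
            await queue.produce(
                topic=f"ingest.{nxt}",
                payload={"doc_id": doc_id, **output},
                idempotency_key=f"{doc_id}:{nxt}",  # per-step dedupe
            )
        msg.ack()  # commit this step only; a crash before ack retries it
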

Webhook storms

Webhooks · Burst traffic · Deduping
Problem
Webhooks arrive in bursts and are often delivered more than once. Synchronous processing leads to timeouts, and the sender retries, amplifying the storm.
How DriftQ helps
FastAPI validates and enqueues immediately. Workers process asynchronously using the webhook event ID as an idempotency key.
Outcome
Stable ingestion. Controlled processing. Fewer outages during bursts.
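
A sketch of that receive path. The X-Event-Id header name is provider-specific (most senders include some unique delivery ID), and the produce() call is again an assumed client surface; the FastAPI parts are standard.

    # Acknowledge the sender fast, dedupe on the provider's event ID,
    # and let workers drain the burst at their own pace.
    from fastapi import FastAPI, Header, Request

    import driftq  # hypothetical client library

    app = FastAPI()
    queue = driftq.Client("driftq://localhost:9400")

    @app.post("/webhooks/payments")
    async def payment_webhook(request: Request, x_event_id: str = Header(...)):
        body = await request.json()
        await queue.produce(
            topic="webhooks.payments",
            payload=body,
            idempotency_key=x_event_id,  # duplicate deliveries collapse here
        )
        return {"received": True}  # fast 200 stops sender-side retry storms
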

Replacing cron scripts with durable jobs

Cron · Scheduled tasks · Ops
Problem
Cron on one machine fails silently, runs twice, or gets forgotten. You discover issues days later.
How DriftQ helps
Treat scheduled tasks as messages. A scheduler produces into DriftQ. Workers run with retries/backoff and DLQ.
Outcome
You know what ran and what failed. Jobs survive restarts. Less “cron roulette.”
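
A scheduler sketch under the same assumed client: the idempotency key includes the scheduled slot, so a double-fired trigger (two schedulers, a retried cron) dedupes instead of running twice.

    # One message per scheduled slot; the slot-based key makes duplicate
    # triggers a no-op.
    from datetime import datetime, timezone

    import driftq  # hypothetical client library

    queue = driftq.Client("driftq://localhost:9400")

    async def fire_nightly_report():
        slot = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        await queue.produce(
            topic="jobs.nightly_report",
            payload={"slot": slot},
            idempotency_key=f"nightly_report:{slot}",
        )
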

Replay for debugging and incident response

Replay · WAL · Audit
Problem
A customer reports “your app did something weird.” Logs are incomplete. State already mutated. You can’t reproduce the run reliably.
How DriftQ helps
WAL-backed persistence + structured envelopes enable replay of workflows or specific steps. Trace events make “what happened” human-readable.
Outcome
Faster root-cause analysis. Better trust. A path to compliance/auditing if you need it.
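
The replay surface might look something like the sketch below; the wal.read() and replay() calls are purely assumptions about what a WAL-backed client could expose, not documented DriftQ methods.

    # Hypothetical replay sketch: inspect a run's persisted envelopes,
    # then re-drive them against a non-production consumer group.
    import driftq  # hypothetical client library

    queue = driftq.Client("driftq://localhost:9400")

    async def reproduce_incident(run_id: str):
        events = await queue.wal.read(run_id=run_id)  # assumed API
        for event in events:
            # Human-readable trace of what actually happened.
            print(event.step, event.enqueued_at, event.payload)
        await queue.replay(run_id=run_id, group="replay-debug")  # assumed API
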

When DriftQ is not the right tool

If your app is tiny, tasks are sub-second, and you can tolerate occasional loss or duplication, you’ll move faster with something simpler. DriftQ is infrastructure — it must earn its place.

Blunt adoption advice
Don’t integrate DriftQ everywhere at once. Pick one high-pain workflow (ingestion, webhooks, notifications, long-running jobs). Prove reliability and cost savings. Then expand.