Retries & DLQ

Retry policy, backoff, and strict dead-letter routing.

DriftQ’s retry behavior is driven by a retry policy carried on the message envelope. This keeps retry semantics attached to the work itself (not hidden inside a consumer).

Mental model: each delivery attempt either ends in ack (done) or redelivery (retry). Retry policy decides how many times and how fast.

Setting retry policy

When producing via /v1/produce, you can pass retry-related query params like:

  • retry_max_attempts — maximum number of deliveries before DLQ
  • retry_backoff_ms — base backoff between retries
  • retry_max_backoff_ms — cap for backoff growth

Example

curl -i -X POST "http://localhost:8080/v1/produce?topic=t&value=hello&retry_max_attempts=3&retry_backoff_ms=200&retry_max_backoff_ms=5000"
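
The same request can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_produce_url is hypothetical, and only the endpoint, host, and query parameter names are taken from the curl example above.

```python
from urllib.parse import urlencode


def build_produce_url(base_url, topic, value, max_attempts, backoff_ms, max_backoff_ms):
    """Build a /v1/produce URL carrying the retry policy as query params.

    Hypothetical helper: only the parameter names come from the docs above.
    """
    params = {
        "topic": topic,
        "value": value,
        "retry_max_attempts": max_attempts,      # deliveries before DLQ
        "retry_backoff_ms": backoff_ms,          # base backoff between retries
        "retry_max_backoff_ms": max_backoff_ms,  # cap for backoff growth
    }
    # urlencode escapes values, so payloads with spaces or '&' stay intact.
    return f"{base_url}/v1/produce?{urlencode(params)}"


url = build_produce_url("http://localhost:8080", "t", "hello", 3, 200, 5000)
print(url)
```

To actually send it, the URL can be passed to any HTTP client as a POST, for example urllib.request.Request(url, method="POST").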

Attempts

DriftQ tracks how many times a message has been delivered in a per-message counter, attempts. Each time the lease expires without a successful ack, the message is scheduled again and attempts is incremented.

Important: retries are driven by redelivery. If a consumer never acks, attempts keeps climbing until it reaches the configured maximum.
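
To illustrate how the wait between redeliveries might grow as attempts climbs, here is a rough sketch. It assumes the delay doubles after each failed attempt, starting at retry_backoff_ms and capped at retry_max_backoff_ms. The doubling curve is an assumption: this page only says there is a base backoff and a cap on its growth, so DriftQ's actual schedule (and any jitter) may differ. The function names are hypothetical.

```python
def backoff_ms(attempts, base_ms, max_ms):
    """Delay before the next redelivery, after `attempts` failed deliveries.

    ASSUMPTION: exponential doubling from the base, clamped to the cap.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    return min(base_ms * 2 ** (attempts - 1), max_ms)


def backoff_schedule(retries, base_ms, max_ms):
    """List the delay before each of the first `retries` redeliveries."""
    return [backoff_ms(n, base_ms, max_ms) for n in range(1, retries + 1)]


# Using the values from the produce example: base 200 ms, cap 5000 ms.
schedule = backoff_schedule(6, base_ms=200, max_ms=5000)
print(schedule)  # [200, 400, 800, 1600, 3200, 5000]
```

Under this assumption the cap starts to matter from the sixth retry onward, where the uncapped delay would be 6400 ms.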

DLQ routing

Once attempts reaches max_attempts (the value set with retry_max_attempts at produce time), DriftQ stops retrying and routes the message to a DLQ (dead-letter queue).

  • The message is moved to a DLQ path/topic (strict dead-letter routing)
  • DriftQ increments dlq_messages_total{reason=...} to explain why it was dead-lettered
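
The routing rule above can be modeled in a few lines. This is an illustrative sketch of the decision only, not DriftQ's implementation: the Envelope class, the on_lease_expired function, and the "max_attempts" reason label are all hypothetical names chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical stand-in for a message envelope carrying retry policy."""
    value: str
    max_attempts: int
    attempts: int = 0


# Stand-in for the dlq_messages_total{reason=...} counter.
dlq_messages_total = Counter()


def on_lease_expired(msg):
    """Decide what happens when a lease expires without an ack."""
    msg.attempts += 1
    if msg.attempts >= msg.max_attempts:
        # Retry budget exhausted: dead-letter instead of rescheduling.
        dlq_messages_total["max_attempts"] += 1  # reason label is assumed
        return "dlq"
    return "retry"


msg = Envelope(value="hello", max_attempts=3)
outcomes = [on_lease_expired(msg) for _ in range(3)]
print(outcomes)  # ['retry', 'retry', 'dlq']
```

With max_attempts set to 3, the message is delivered three times in total: two expiries lead to a retry, and the third sends it to the DLQ.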

Why DLQ exists: it prevents infinite retry loops and gives you a clean place to inspect and remediate failed work.

What you typically do with DLQ messages

  • Alert on DLQ growth (it means work is failing “for real”, not just being retried)
  • Inspect payload + error reason
  • Fix the root cause, then replay or re-produce corrected messages
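
As one way to approach the alerting step, the sketch below sums dlq_messages_total per reason from metrics text and reports which reasons grew between two scrapes. It assumes the counter is exposed in Prometheus text format. The sample text, the reason values, and the function names are made up for illustration, and how you obtain the metrics text depends on your deployment.

```python
import re

# Hypothetical scrape output; the reason values are invented for the example.
SAMPLE_METRICS = """\
# TYPE dlq_messages_total counter
dlq_messages_total{reason="max_attempts"} 4
dlq_messages_total{reason="poison"} 1
some_other_metric 7
"""

SERIES = re.compile(r"^dlq_messages_total\{([^}]*)\}\s+(\S+)$")
REASON = re.compile(r'reason="([^"]*)"')


def dlq_counts_by_reason(metrics_text):
    """Return {reason: count} for every dlq_messages_total series."""
    counts = {}
    for line in metrics_text.splitlines():
        match = SERIES.match(line.strip())
        if not match:
            continue  # comments and unrelated metrics are skipped
        reason = REASON.search(match.group(1))
        key = reason.group(1) if reason else "unknown"
        counts[key] = counts.get(key, 0.0) + float(match.group(2))
    return counts


def dlq_growth(previous, current):
    """Return the reasons whose count increased, with the size of the increase."""
    return {
        reason: count - previous.get(reason, 0.0)
        for reason, count in current.items()
        if count > previous.get(reason, 0.0)
    }


current = dlq_counts_by_reason(SAMPLE_METRICS)
growth = dlq_growth({"max_attempts": 3.0}, current)
print(growth)  # {'max_attempts': 1.0, 'poison': 1.0}
```

Any non-empty result from dlq_growth is a candidate for an alert, and the reason key tells you where to start inspecting.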

Keep it strict: if something hits DLQ, treat it as a signal that needs attention, not background noise.