Patterns
πŸ“‹

Event-Driven Orchestrator-Worker(EDOW)

Central orchestrator assigns tasks to worker agents through event streaming

Complexity: mediumPattern

Core Mechanism

Central orchestrator coordinates work by publishing events/commands to topics or queues. Stateless workers in a consumer group pull events, process tasks idempotently, and emit completion or result events. The orchestrator advances the workflow based on observed events, enabling decoupled, asynchronous scaling and recovery via replay and DLQs.

Workflow / Steps

  1. Receive trigger (API/webhook/schedule) and persist correlation/workflow state.
  2. Publish task events to a work topic/queue with keys for partition affinity and ordering where needed.
  3. Workers consume in a consumer group, perform task with retries and timeouts, and write idempotent side effects.
  4. On completion/failure, workers emit result/compensation events; errors can be routed to DLQ with metadata.
  5. Orchestrator listens for result events, updates state, branches next steps, and aggregates final outcome.

Best Practices

Design idempotent handlers; use request/operation keys and deduplication to achieve exactly-once effects.
Adopt the transactional outbox/inbox pattern for cross-system reliability and to avoid dual-write anomalies.
Version event schemas; enforce compatibility via schema registry and contract tests.
Use consumer groups and partitions/shards for horizontal scalability; avoid hot keys.
Apply backpressure: prefetch/flow control, max in-flight per consumer, and adaptive concurrency.
Implement DLQs with triage metadata; set bounded retries with exponential backoff and jitter.
Correlate events with workflow/run IDs; emit structured logs/metrics/traces for end-to-end observability.
Secure topics/queues with authZ/authN; redact PII and secrets in payloads and logs.

When NOT to Use

  • Ultra-low latency, strictly synchronous request/response paths.
  • Simple, short-lived flows where orchestration overhead adds needless complexity.
  • Workloads requiring strict cross-service ACID transactions without compensation.

Common Pitfalls

  • Dual writes without atomicity β†’ lost or duplicated work; missing outbox/inbox.
  • Assuming exactly-once delivery; most brokers are at-least-onceβ€”achieve idempotent processing instead.
  • Unbounded retries causing storms; no backoff, rate limits, or circuit breakers.
  • Hot partitions/keys leading to skew and latency spikes.
  • Opaque workflows with missing correlation IDs and trace propagation.

Key Features

Decoupled orchestration via topics/queues
Consumer group-based load balancing
Automatic scaling and rebalancing
Event replay and DLQs for recovery
Ordering by key/partition where required
Stateless workers; durable workflow state in store
Policy-driven retries and compensation
End-to-end observability (metrics/logs/traces)

KPIs / Success Metrics

  • Throughput (msgs/s, MB/s) per topic and per consumer group; consumer lag.
  • Latency p50/p95 end-to-end and per stage; retry/failure rate; DLQ rate and time-to-recover.
  • Workflow SLOs: time-to-completion, success rate, and cost per completed workflow.

Token / Resource Usage

  • Bound message size (e.g., ~256 KB SNS/SQS, ~10 MB Google Pub/Sub, Kafka configurable); store blobs externally and pass references.
  • Tune partitions, prefetch, and max in-flight to balance throughput vs. memory/CPU; compress payloads.
  • If LLMs participate, cap tokens per step; summarize events; log per-step token and cost budgets.

Best Use Cases

  • Document/media processing pipelines; batch and stream ETL.
  • Microservices orchestration with compensations (order β†’ payment β†’ fulfillment).
  • Real-time workflows with partitioned ordering needs and elastic scaling.

References & Further Reading

Tools & Libraries

  • Apache Kafka, RabbitMQ, Google Pub/Sub, AWS SNS/SQS, NATS JetStream, Redis Streams
  • Orchestrators: Temporal, Camunda, AWS Step Functions
  • Clients: librdkafka/Confluent, Sarama (Go), Spring Kafka/AMQP, aio-pika

Community & Discussions

Patterns

closed

Loading...

Built by Kortexya