001 - Saga Orchestration

ADR Metadata

Status

ACCEPTED

Date

2024-01-01

Deciders

Architecture Team

Context

The MRI system needs to process complex workflows that span multiple distributed steps (Inbound -> Validation -> OCR -> Integration -> Notification). A monolithic approach would couple these steps too tightly, while simple choreography (event-driven chains) can become hard to track and debug as the chain grows, a failure mode often called a “Distributed Monolith”. We need a way to maintain data consistency and handle failures gracefully across distributed services.

Decision

We decided to use the Saga Orchestration Pattern.

A central Orchestrator (likely implemented with AWS Step Functions or a specialized Lambda) will be responsible for:

Receiving the initial trigger.
Dispatching commands to workers (Consumers).
Listening for results.
Deciding the next step.
Handling rollback/compensation logic if a step fails.
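
The responsibilities above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the system's implementation: the `Step` type, the `run_saga` and `make_step` functions, and the stub workers are hypothetical names. A real orchestrator (Step Functions or a Lambda) would dispatch commands over a queue or service call and persist state between steps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass
class Step:
    """One saga step: a forward action plus the action that undoes it."""
    name: str
    action: Callable[[Context], None]
    compensate: Callable[[Context], None]


def run_saga(steps: List[Step], ctx: Context) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action(ctx)          # dispatch the command to the worker
            completed.append(step)    # record success before deciding the next step
        except Exception as exc:
            ctx["failed_step"] = step.name
            ctx["error"] = str(exc)
            for done in reversed(completed):
                done.compensate(ctx)  # undo work that already finished
            return False
    return True


def make_step(name: str, log: List[str], fail: bool = False) -> Step:
    """Hypothetical stub worker that only records what it was asked to do."""
    def action(ctx: Context) -> None:
        if fail:
            raise RuntimeError(f"{name} failed")
        log.append(f"do:{name}")

    def compensate(ctx: Context) -> None:
        log.append(f"undo:{name}")

    return Step(name, action, compensate)


def demo() -> List[str]:
    """Run the five-step flow with a simulated OCR failure and return the log."""
    log: List[str] = []
    steps = [
        make_step("Inbound", log),
        make_step("Validation", log),
        make_step("OCR", log, fail=True),  # simulate a failing worker
        make_step("Integration", log),
        make_step("Notification", log),
    ]
    run_saga(steps, {})
    return log
```

In this sketch, a failure in OCR leaves Integration and Notification unexecuted and triggers compensation for Validation and then Inbound, in that order. Compensating in reverse order is a common convention for sagas; whether every step in the MRI flow actually needs a compensating action is a design question this sketch does not answer.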

Consequences

✓ Positive Consequences

Centralized Visibility: We know exactly where a request is in the flow.
Easier Error Handling: The orchestrator can implement retries, timeouts, and fallbacks centrally.
Decoupled Workers: Workers only know about their specific task and don’t need to know who triggers them or what happens next.
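
As an illustration of centralized error handling with decoupled workers, the sketch below keeps the retry policy in the orchestrator while the worker only performs its task. The `call_with_retry` helper, its parameters, and the `make_flaky_worker` stub are hypothetical; with Step Functions, a comparable policy would typically be declared in the state machine definition rather than written by hand.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task, retrying with exponential backoff; re-raise after the last attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted: let the orchestrator handle the failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def worker() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("worker timed out")
        return "ocr-complete"

    return worker
```

The worker contains no retry or routing logic, so the policy (attempt count, delay) can change without touching worker code. The `sleep` parameter is injectable only so the helper can be exercised without real delays.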

✗ Negative Consequences

Single Point of Failure: The Orchestrator becomes a critical component. If it goes down, the workflow stops (mitigated by using managed services like Step Functions).
Complexity: An orchestrator takes more upfront design than simple event chains, because workflow state, step transitions, and compensation logic must all be modeled explicitly.