Distributed Transactions in Microservices: Implementing the Saga Pattern

Managing distributed transactions is one of the hardest problems in a microservices architecture. Because each service typically owns its own data store, a single ACID transaction cannot easily span services. The Saga Pattern is a widely used way to keep data consistent across services in such systems.

In this blog, we’ll discuss:

What is the Saga Pattern?
Types of Saga Patterns: Orchestration vs. Choreography
How to Choose Between Them
Implementing Orchestration-Based Saga with Spring Boot
An Approach to Event-Driven Saga with Kafka

1. What is the Saga Pattern?

The Saga Pattern breaks a long-running distributed transaction into a series of smaller local transactions, each committed by a single microservice. If any step fails, compensating actions undo the effects of the steps that already completed.

Example: In an e-commerce system, a customer places an order:

Payment is processed.
Inventory is reserved.
Shipping is scheduled.

If inventory reservation fails, the payment must be refunded and the order canceled.
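The example above can be sketched in plain Java, with no framework involved. This is a minimal illustration rather than a production design: SagaStep, SimpleSaga, and the log messages are hypothetical names chosen for this post. Each step pairs an action with its compensation, and a failure compensates the completed steps in reverse order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** One saga step: a forward action plus the compensation that undoes it. */
class SagaStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

class SimpleSaga {
    /**
     * Runs the steps in order. If a step throws, the steps that already
     * completed are compensated in reverse order. Returns true on success.
     */
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }

    /** Replays the order example with a failing inventory step. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        List<SagaStep> steps = new ArrayList<>();
        steps.add(new SagaStep("payment",
                () -> log.add("payment processed"),
                () -> log.add("payment refunded")));
        steps.add(new SagaStep("inventory",
                () -> { throw new IllegalStateException("out of stock"); },
                () -> log.add("inventory released")));
        steps.add(new SagaStep("shipping",
                () -> log.add("shipping scheduled"),
                () -> log.add("shipping canceled")));
        boolean ok = run(steps);
        log.add(ok ? "order confirmed" : "order canceled");
        return log;
    }
}
```

Calling SimpleSaga.demo() yields the entries "payment processed", "payment refunded", "order canceled". The failed inventory step is not compensated because it never completed, and shipping is never attempted.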

2. Types of Saga Patterns

a. Orchestration-Based Saga

A central orchestrator manages the transaction.
The orchestrator communicates with each microservice to execute steps sequentially and handles failures by triggering compensating actions.

Characteristics:

Centralized control.
Easier to implement and monitor.
Ideal for workflows with strict dependencies.

b. Choreography-Based Saga

Each microservice publishes events and reacts to events from other services.
No centralized controller exists; the flow is managed via event-driven architecture.

Characteristics:

Decentralized and highly scalable.
Suitable for loosely coupled systems.
Complexity grows with the number of services and events.

3. Choosing Between Orchestration and Choreography

Criteria	Orchestration	Choreography
Control	Centralized control ensures traceability.	Decentralized; each service acts autonomously.
Ease of Debugging	Easier to debug due to centralized logic.	Harder to debug due to event chains.
Scalability	The orchestrator can become a bottleneck and a single point of failure as the system grows.	Scales well because services interact only through events.
Use Case	Complex workflows with strong dependencies.	Loosely coupled systems, real-time needs.
Example	Banking, insurance claims, e-commerce order processing.	IoT, event tracking, real-time analytics.

4. Implementing Orchestration-Based Saga with Spring Boot

We’ll implement a basic Orchestration-Based Saga for an e-commerce order workflow with the following services:

Order Service: Creates the order.
Payment Service: Processes payment.
Inventory Service: Reserves inventory.
Shipping Service: Arranges shipping.

Saga Orchestrator

The orchestrator will:

Track the state of each step in a database.
Call each service sequentially.
Handle failures by triggering compensating actions.

Orchestrator Design Considerations

Transaction State Management: The orchestrator should track the saga’s progress at each stage (e.g., "Order Created", "Payment Processed").
Rollback Mechanism: In case of failure, the orchestrator should initiate compensating transactions to roll back any changes made by previous steps.
Timeout Handling: Add timeouts to prevent the saga from hanging indefinitely if one of the services fails to respond.
Retries: Implement retry mechanisms for transient failures.
Database Integration: Use a persistent store (such as a relational database) to track the status of each step in the saga and allow for recovery if the orchestrator crashes or the system is restarted.
Error Handling: Handle both expected (e.g., validation errors) and unexpected errors (e.g., network failures) gracefully.
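The timeout and retry considerations above can be combined in one small helper. The sketch below uses only the JDK's java.util.concurrent classes. ResilientCall and its parameter names are hypothetical, and the doubling backoff is one reasonable choice among several, not something this design requires.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ResilientCall {
    /**
     * Runs the call up to maxAttempts times. Each attempt is abandoned after
     * timeoutMillis; the wait between attempts doubles, starting at backoffMillis.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithTimeoutAndRetry(Callable<T> call, int maxAttempts,
                                         long timeoutMillis, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            long delay = backoffMillis;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(call);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the attempt that ran too long
                    last = e;
                } catch (ExecutionException e) {
                    // Unwrap the exception thrown by the call itself.
                    last = (e.getCause() instanceof Exception) ? (Exception) e.getCause() : e;
                }
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
            throw last;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A call such as ResilientCall.callWithTimeoutAndRetry(() -> paymentClient.charge(orderId), 3, 2000, 200) would allow three attempts of two seconds each (paymentClient.charge is a placeholder). Retrying like this is only safe when the remote operation is idempotent, because a timed-out attempt may still have succeeded on the other side.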

Basic Code Snippet

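    import java.util.Date;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    // SagaRepository, SagaExecution, OrderRequest, and the *ServiceClient types
    // are application classes that are not shown here.
    @RestController
    @RequestMapping("/saga")
    public class RobustSagaOrchestrator {

        private static final int MAX_RETRIES = 3;

        @Autowired
        private SagaRepository sagaRepository;

        @Autowired
        private OrderServiceClient orderServiceClient;

        @Autowired
        private PaymentServiceClient paymentServiceClient;

        @Autowired
        private InventoryServiceClient inventoryServiceClient;

        @Autowired
        private ShippingServiceClient shippingServiceClient;

        @PostMapping("/create-order")
        public ResponseEntity<String> createOrder(@RequestBody OrderRequest request) {
            SagaExecution sagaExecution = new SagaExecution();
            sagaExecution.setOrderId(request.getOrderId());
            sagaExecution.setStatus("PENDING");
            sagaExecution.setCurrentStep("Order Creation");
            sagaRepository.save(sagaExecution);

            try {
                // Each call records the NEXT step once it succeeds, so after a
                // failure currentStep names the step that failed.

                // Step 1: Create Order
                executeWithRetry(() -> orderServiceClient.createOrder(request), sagaExecution, "Payment Processing");
                // Step 2: Process Payment
                executeWithRetry(() -> paymentServiceClient.processPayment(request.getOrderId()), sagaExecution, "Inventory Reservation");
                // Step 3: Reserve Inventory
                executeWithRetry(() -> inventoryServiceClient.reserveInventory(request.getOrderId()), sagaExecution, "Shipping");
                // Step 4: Schedule Shipping
                executeWithRetry(() -> shippingServiceClient.scheduleShipping(request.getOrderId()), sagaExecution, "COMPLETED");

                updateSagaStatus(sagaExecution, "COMPLETED");
                return ResponseEntity.ok("Order completed successfully!");
            } catch (Exception e) {
                updateSagaStatus(sagaExecution, "FAILED");
                triggerCompensation(sagaExecution);
                return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                        .body("Transaction failed. Compensation triggered.");
            }
        }

        private void executeWithRetry(Runnable action, SagaExecution sagaExecution, String nextStep) {
            int retries = 0;
            while (true) {
                try {
                    action.run();
                    break;
                } catch (Exception e) {
                    retries++;
                    if (retries >= MAX_RETRIES) {
                        throw new RuntimeException(
                                "Maximum retries reached for " + sagaExecution.getCurrentStep(), e);
                    }
                    // Log retry attempt
                }
            }
            // Persist progress outside the retry loop so that a failed save
            // never re-runs an action that already succeeded.
            updateSagaStep(sagaExecution, nextStep);
        }

        private void updateSagaStep(SagaExecution saga, String step) {
            saga.setCurrentStep(step);
            saga.setLastUpdated(new Date());
            sagaRepository.save(saga);
        }

        private void updateSagaStatus(SagaExecution saga, String status) {
            saga.setStatus(status);
            saga.setLastUpdated(new Date());
            sagaRepository.save(saga);
        }

        // Undoes the completed steps in reverse order, based on the step that failed.
        private void triggerCompensation(SagaExecution saga) {
            try {
                String failedStep = saga.getCurrentStep();
                if ("Shipping".equals(failedStep)) {
                    // releaseInventory is an assumed method name on the inventory client.
                    inventoryServiceClient.releaseInventory(saga.getOrderId());
                    paymentServiceClient.rollbackPayment(saga.getOrderId());
                    orderServiceClient.rollbackOrder(saga.getOrderId());
                } else if ("Inventory Reservation".equals(failedStep)) {
                    paymentServiceClient.rollbackPayment(saga.getOrderId());
                    orderServiceClient.rollbackOrder(saga.getOrderId());
                } else if ("Payment Processing".equals(failedStep)) {
                    orderServiceClient.rollbackOrder(saga.getOrderId());
                }
                // A failure during "Order Creation" leaves nothing to undo.
            } catch (Exception e) {
                // Log failure in compensation logic and flag the saga for follow-up.
                updateSagaStatus(saga, "COMPENSATION_FAILED");
            }
        }
    }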

5. Kafka-Based Saga (Choreography Approach)

In contrast to orchestration, the Kafka-based Saga uses an event-driven architecture, where each service reacts to events and publishes events for the next service to act upon. This approach removes the need for a central orchestrator and allows for more scalable systems with loose coupling between services.

Event-Driven Saga Architecture

Event Flow: Services publish and consume events asynchronously using Kafka.
No Centralized Orchestrator: Each service listens to events and performs its part of the transaction. If something fails, services can publish compensating events (e.g., PaymentFailed, InventoryRollback).
Decentralized Control: Each service has its own logic for handling events and compensating for failures.
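To make the event flow concrete, here is a small simulation that replaces Kafka with an in-memory event bus, so it runs without a broker and uses no Kafka client API. EventBus and ChoreographyDemo are hypothetical classes. The event names InventoryFailed, PaymentRefunded, ShippingScheduled, and OrderCanceled are assumptions added for illustration alongside the OrderCreated, PaymentProcessed, and InventoryReserved events mentioned in this post.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for Kafka topics: topic name -> subscribed handlers. */
class EventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    final List<String> published = new ArrayList<>();

    void subscribe(String topic, Consumer<String> handler) {
        handlers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    /** Records the event and delivers it synchronously to every subscriber. */
    void publish(String topic, String orderId) {
        published.add(topic);
        for (Consumer<String> handler : handlers.getOrDefault(topic, new ArrayList<>())) {
            handler.accept(orderId);
        }
    }
}

class ChoreographyDemo {
    /** Wires the services together and returns the events published for one order. */
    static List<String> run(boolean inventoryAvailable) {
        EventBus bus = new EventBus();
        // Payment service: charges on OrderCreated, refunds on InventoryFailed.
        bus.subscribe("OrderCreated", id -> bus.publish("PaymentProcessed", id));
        bus.subscribe("InventoryFailed", id -> bus.publish("PaymentRefunded", id));
        // Inventory service: reserves stock or reports the failure.
        bus.subscribe("PaymentProcessed", id ->
                bus.publish(inventoryAvailable ? "InventoryReserved" : "InventoryFailed", id));
        // Shipping service: schedules delivery once stock is reserved.
        bus.subscribe("InventoryReserved", id -> bus.publish("ShippingScheduled", id));
        // Order service: cancels the order once the payment is refunded.
        bus.subscribe("PaymentRefunded", id -> bus.publish("OrderCanceled", id));

        bus.publish("OrderCreated", "order-1");
        return bus.published;
    }
}
```

With inventory available, ChoreographyDemo.run(true) produces OrderCreated, PaymentProcessed, InventoryReserved, ShippingScheduled. With run(false) the chain becomes OrderCreated, PaymentProcessed, InventoryFailed, PaymentRefunded, OrderCanceled. No component sees the whole flow. It emerges from the subscriptions, which is why choreographed sagas are harder to trace.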

Key Considerations for Kafka-based Saga:

Event Schema: Define a common event schema (e.g., JSON or Avro) to ensure all services understand the events being exchanged.
Event Handlers: Each service should have event handlers to process events and perform actions.
Idempotency: Ensure event handlers are idempotent so that duplicate events don’t cause inconsistencies.
Event Publishing: Each service should publish events (e.g., OrderCreated, PaymentProcessed, InventoryReserved) to Kafka topics.
Compensating Actions: Services should listen for failure events and trigger compensating actions when needed.
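The idempotency point is easy to get wrong, so here is one way it might look. The sketch assumes every event carries a unique ID that stays the same when the event is redelivered. SagaEvent, IdempotentInventoryHandler, and the field names are hypothetical, and the in-memory set stands in for a table in the service's own database.

```java
import java.util.HashSet;
import java.util.Set;

/** Event envelope; the field names are illustrative, not a fixed schema. */
class SagaEvent {
    final String eventId; // unique per event, unchanged on redelivery
    final String type;    // e.g. "PaymentProcessed"
    final String orderId;

    SagaEvent(String eventId, String type, String orderId) {
        this.eventId = eventId;
        this.type = type;
        this.orderId = orderId;
    }
}

class IdempotentInventoryHandler {
    private final Set<String> processedEventIds = new HashSet<>();
    private int reservations = 0;

    /** Returns true if the event was applied, false if it was a duplicate. */
    synchronized boolean handle(SagaEvent event) {
        // Set.add returns false when the ID is already present,
        // so a redelivered event is skipped.
        if (!processedEventIds.add(event.eventId)) {
            return false;
        }
        reservations++; // stands in for the real inventory update
        return true;
    }

    synchronized int reservationCount() {
        return reservations;
    }
}
```

Handling the same SagaEvent twice reserves inventory only once: the first handle call returns true and the second returns false. In a real service, the processed IDs would normally be stored in the same local transaction as the business change, so that a crash between the two cannot leave them out of sync.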

In summary, a Kafka-based saga suits loosely coupled systems with event-driven workflows, while an orchestration-based saga provides centralized control and is often the better fit when steps must run in a strict order and failures need coordinated compensation. Both approaches have their place depending on the architecture and system requirements.


vinaypal.com
