Skip to main content

Distributed Transactions in Microservices: Implementing the Saga Pattern

Managing distributed transactions is one of the most critical challenges in microservices architecture. Since microservices operate with decentralized data storage, traditional ACID transactions across services are not feasible. The Saga Pattern is a proven solution for ensuring data consistency in such distributed systems.

In this blog, we’ll discuss:

  1. What is the Saga Pattern?
  2. Types of Saga Patterns: Orchestration vs. Choreography
  3. How to Choose Between Them
  4. Implementing Orchestration-Based Saga with Spring Boot
  5. An Approach to Event-Driven Saga with Kafka

1. What is the Saga Pattern?

The Saga Pattern breaks a long-running distributed transaction into a series of smaller atomic transactions, each managed by a microservice. If any step fails, compensating actions are performed to roll back the preceding operations.

Example: In an e-commerce system, a customer places an order:

  • Payment is processed.
  • Inventory is reserved.
  • Shipping is scheduled.

If inventory reservation fails, the payment must be refunded and the order canceled.


2. Types of Saga Patterns

a. Orchestration-Based Saga

  • A central orchestrator manages the transaction.
  • The orchestrator communicates with each microservice to execute steps sequentially and handles failures by triggering compensating actions.

Characteristics:

  • Centralized control.
  • Easier to implement and monitor.
  • Ideal for workflows with strict dependencies.

b. Choreography-Based Saga

  • Each microservice publishes events and reacts to events from other services.
  • No centralized controller exists; the flow is managed via event-driven architecture.

Characteristics:

  • Decentralized and highly scalable.
  • Suitable for loosely coupled systems.
  • Complexity grows with the number of services and events.

3. Choosing Between Orchestration and Choreography

CriteriaOrchestrationChoreography
ControlCentralized control ensures traceability.Decentralized; each service acts autonomously.
Ease of DebuggingEasier to debug due to centralized logic.Harder to debug due to event chains.
ScalabilityLess scalable for large systems.Highly scalable and flexible.
Use CaseComplex workflows with strong dependencies.Loosely coupled systems, real-time needs.
ExampleBanking, insurance claims, e-commerce order processing.IoT, event tracking, real-time analytics.


4. Implementing Orchestration-Based Saga with Spring Boot

We’ll implement a basic Orchestration-Based Saga for an e-commerce order workflow with the following services:

  1. Order Service: Creates the order.
  2. Payment Service: Processes payment.
  3. Inventory Service: Reserves inventory.
  4. Shipping Service: Arranges shipping.

Saga Orchestrator

The orchestrator will:

  1. Track the state of each step in a database.
  2. Call each service sequentially.
  3. Handle failures by triggering compensating actions.


Orchestrator Design Considerations

  1. Transaction State Management: The orchestrator should track the saga’s progress at each stage (e.g., "Order Created", "Payment Processed").
  2. Rollback Mechanism: In case of failure, the orchestrator should initiate compensating transactions to roll back any changes made by previous steps.
  3. Timeout Handling: Add timeouts to prevent the saga from hanging indefinitely if one of the services fails to respond.
  4. Retries: Implement retry mechanisms for transient failures.
  5. Database Integration: Use a persistent store (such as a relational database) to track the status of each step in the saga and allow for recovery if the orchestrator crashes or the system is restarted.
  6. Error Handling: Handle both expected (e.g., validation errors) and unexpected errors (e.g., network failures) gracefully.


Basic Code Snippet 

@RestController
@RequestMapping("/saga")
public class RobustSagaOrchestrator {

    @Autowired
    private SagaRepository sagaRepository;
    @Autowired
    private OrderServiceClient orderServiceClient;
    @Autowired
    private PaymentServiceClient paymentServiceClient;
    @Autowired
    private InventoryServiceClient inventoryServiceClient;
    @Autowired
    private ShippingServiceClient shippingServiceClient;

    private static final int MAX_RETRIES = 3;

    @PostMapping("/create-order")
    public ResponseEntity<String> createOrder(@RequestBody OrderRequest request) {
        SagaExecution sagaExecution = new SagaExecution();
        sagaExecution.setOrderId(request.getOrderId());
        sagaExecution.setStatus("PENDING");
        sagaExecution.setCurrentStep("Order Creation");
        sagaRepository.save(sagaExecution);

        try {
            // Step 1: Create Order
            executeWithRetry(() -> orderServiceClient.createOrder(request), sagaExecution, "Payment Processing");

            // Step 2: Process Payment
            executeWithRetry(() -> paymentServiceClient.processPayment(request.getOrderId()), sagaExecution, "Inventory Reservation");

            // Step 3: Reserve Inventory
            executeWithRetry(() -> inventoryServiceClient.reserveInventory(request.getOrderId()), sagaExecution, "Shipping");

            // Step 4: Schedule Shipping
            executeWithRetry(() -> shippingServiceClient.scheduleShipping(request.getOrderId()), sagaExecution, "COMPLETED");

            return ResponseEntity.ok("Order completed successfully!");
        } catch (Exception e) {
            updateSagaStatus(sagaExecution, "FAILED");
            triggerCompensation(sagaExecution);
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("Transaction failed. Rollback executed.");
        }
    }

    private void executeWithRetry(Runnable action, SagaExecution sagaExecution, String nextStep) {
        int retries = 0;
        while (retries < MAX_RETRIES) {
            try {
                action.run();
                updateSagaStep(sagaExecution, nextStep);
                return;
            } catch (Exception e) {
                retries++;
                if (retries >= MAX_RETRIES) {
                    throw new RuntimeException("Maximum retries reached for " + nextStep);
                }
                // Log retry attempt
            }
        }
    }

    private void updateSagaStep(SagaExecution saga, String step) {
        saga.setCurrentStep(step);
        saga.setLastUpdated(new Date());
        sagaRepository.save(saga);
    }

    private void updateSagaStatus(SagaExecution saga, String status) {
        saga.setStatus(status);
        saga.setLastUpdated(new Date());
        sagaRepository.save(saga);
    }

    private void triggerCompensation(SagaExecution saga) {
        try {
            if ("Inventory Reservation".equals(saga.getCurrentStep())) {
                paymentServiceClient.rollbackPayment(saga.getOrderId());
                orderServiceClient.rollbackOrder(saga.getOrderId());
            } else if ("Payment Processing".equals(saga.getCurrentStep())) {
                orderServiceClient.rollbackOrder(saga.getOrderId());
            }
        } catch (Exception e) {
            // Log failure in compensation logic
        }
    }
}


Kafka-Based Saga (Choreography Approach)

In contrast to orchestration, the Kafka-based Saga uses an event-driven architecture, where each service reacts to events and publishes events for the next service to act upon. This approach removes the need for a central orchestrator and allows for more scalable systems with loose coupling between services.

Event-Driven Saga Architecture

  • Event Flow: Services publish and consume events asynchronously using Kafka.
  • No Centralized Orchestrator: Each service listens to events and performs its part of the transaction. If something fails, services can publish compensating events (e.g., PaymentFailed, InventoryRollback).
  • Decentralized Control: Each service has its own logic for handling events and compensating for failures.

Key Considerations for Kafka-based Saga:

  1. Event Schema: Define a common event schema (e.g., JSON or Avro) to ensure all services understand the events being exchanged.
  2. Event Handlers: Each service should have event handlers to process events and perform actions.
  3. Idempotency: Ensure event handlers are idempotent so that duplicate events don’t cause inconsistencies.
  4. Event Publishing: Each service should publish events (e.g., OrderCreated, PaymentProcessed, InventoryReserved) to Kafka topics.
  5. Compensating Actions: Services should listen for failure events and trigger compensating actions when needed.

In summary, Kafka-based saga is suitable for loosely coupled systems with event-driven workflows, while orchestration-based saga provides centralized control and is often better for scenarios where strict order of operations and compensating actions are necessary. Both approaches have their place depending on the architecture and system requirements.








Comments

Popular posts from this blog

Mastering Java Logging: A Guide to Debug, Info, Warn, and Error Levels

Comprehensive Guide to Java Logging Levels: Trace, Debug, Info, Warn, Error, and Fatal Comprehensive Guide to Java Logging Levels: Trace, Debug, Info, Warn, Error, and Fatal Logging is an essential aspect of application development and maintenance. It helps developers track application behavior and troubleshoot issues effectively. Java provides various logging levels to categorize messages based on their severity and purpose. This article covers all major logging levels: Trace , Debug , Info , Warn , Error , and Fatal , along with how these levels impact log printing. 1. Trace The Trace level is the most detailed logging level. It is typically used for granular debugging, such as tracking every method call or step in a complex computation. Use this level sparingly, as it can generate a large volume of log data. 2. Debug The Debug level provides detailed information useful during dev...

Choosing Between Envoy and NGINX Ingress Controllers for Kubernetes

As Kubernetes has become the standard for deploying containerized applications, ingress controllers play a critical role in managing how external traffic is routed to services within the cluster. Envoy and NGINX are two of the most popular options for ingress controllers, and each has its strengths, weaknesses, and ideal use cases. In this blog, we’ll explore: How both ingress controllers work. A detailed comparison of their features. When to use Envoy vs. NGINX for ingress management. What is an Ingress Controller? An ingress controller is a specialized load balancer that: Manages incoming HTTP/HTTPS traffic. Routes traffic to appropriate services based on rules defined in Kubernetes ingress resources. Provides features like TLS termination, path-based routing, and host-based routing. How Envoy Ingress Controller Works Envoy , initially built by Lyft, is a high-performance, modern service proxy and ingress solution. Here's how it operates in Kubernetes: Ingress Resource : You d...