Design a distributed cache system

Non-Functional Requirements

The get and put operations must execute in O(1) average time complexity, ensuring constant-time performance regardless of the number of elements in the cache. This requirement is critical to support high-performance use cases where frequent reads and writes are expected.
The system must be highly available and reliable (fault tolerant): losing a single node should not make a partition unavailable.
Eventual consistency is acceptable.

Sequence Flow

Design Rationale

Each node has its own LRU cache.
Consistent hashing handles data partitioning across nodes.

DistributedCache (Coordinator)

This class coordinates all cache nodes.

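import java.util.concurrent.atomic.AtomicLong;

class DistributedCache {

    // 1. Locates the partition for a key and elects a leader for each partition.
    private final PartitionManager partitionManager;

    // 2. Propagates writes from the primary to the replicas.
    private final ReplicationService replicationService;

    // Monotonic version attached to every replication event so that replicas
    // can discard stale updates. (Assumed scheme; the original left it as a TODO.)
    private final AtomicLong versionCounter = new AtomicLong();

    DistributedCache(PartitionManager partitionManager,
                     ReplicationService replicationService) {

        this.partitionManager = partitionManager;
        this.replicationService = replicationService;
    }

    /**
     * Write request: write to the primary, then replicate asynchronously.
     */
    public void put(int key,
                    int value) {

        Partition partition =
                partitionManager.getPartition(key);

        partition
                .getPrimary()
                .put(key, value);

        System.out.println(
                "Stored key "
                        + key
                        + " in Partition "
                        + partition.getPartitionId());

        // The service only enqueues the event, so the request returns to the
        // client immediately without waiting for the replicas.
        // Assumes ReplicationEvent has a (partition, key, value, version) constructor.
        replicationService.replicate(
                new ReplicationEvent(
                        partition,
                        key,
                        value,
                        versionCounter.incrementAndGet()));
    }

    /**
     * Read from primary.
     */
    public int get(int key) {

        Partition partition =
                partitionManager.getPartition(key);

        return partition
                .getPrimary()
                .get(key);
    }
}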
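
The write path above attaches a version number to each replicated write. As a rough sketch of how a replica might use it, the store below applies an update only when its version is newer than the one it already holds, so late or duplicate events cannot overwrite fresher data. The names VersionedValue, VersionedStore, VersionClock and applyIfNewer are hypothetical and do not appear in the original design.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical value wrapper: pairs a cached value with the version that wrote it.
final class VersionedValue {
    final int value;
    final long version;

    VersionedValue(int value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical replica-side store that ignores stale or duplicate updates.
class VersionedStore {

    private final ConcurrentHashMap<Integer, VersionedValue> data =
            new ConcurrentHashMap<>();

    /** Applies the update only if its version is newer; returns true if applied. */
    boolean applyIfNewer(int key, int value, long version) {
        VersionedValue incoming = new VersionedValue(value, version);
        // merge() runs atomically per key and keeps the entry with the higher version.
        VersionedValue winner = data.merge(key, incoming,
                (current, candidate) ->
                        candidate.version > current.version ? candidate : current);
        return winner == incoming;
    }

    /** Returns the stored value, or null if the key is absent. */
    Integer get(int key) {
        VersionedValue stored = data.get(key);
        return stored == null ? null : stored.value;
    }
}

// Hypothetical primary-side version source: a single monotonic counter.
class VersionClock {
    private final AtomicLong counter = new AtomicLong();

    long next() {
        return counter.incrementAndGet();
    }
}
```

If an event with version 2 arrives before the event with version 1, the replica keeps the version 2 value and drops the older one. A single counter only orders writes issued by one coordinator; several coordinators would need a different scheme.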

How to Handle Data Partitioning?

PartitionManager depends on an abstraction like PartitionStrategy. This follows the Strategy Pattern and lets you swap partitioning algorithms without changing the rest of the system.

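import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PartitionManager {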

    private static final int HEALTH_CHECK_INTERVAL = 5;

    private final PartitionStrategy strategy;

    private final Map<Integer, Partition> partitions;

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public PartitionManager(
            PartitionStrategy strategy,
            Map<Integer, Partition> partitions) {

        this.strategy = strategy;
        this.partitions = partitions;
    }

    /**
     * Start periodic health monitoring.
     */
    public void start() {

        scheduler.scheduleAtFixedRate(
                this::checkPartitionHealth,
                0,
                HEALTH_CHECK_INTERVAL,
                TimeUnit.SECONDS);
    }

    /**
     * Stop monitoring.
     */
    public void shutdown() {
        scheduler.shutdownNow();
    }

    /**
     * Route a key to its partition.
     */
    public Partition getPartition(int key) {

        int partitionId =
                strategy.getPartitionId(key);

        return partitions.get(partitionId);
    }

    /**
     * Periodically checks every partition's leader.
     */
    private void checkPartitionHealth() {

        for (Partition partition : partitions.values()) {

            ReplicaNode primary = partition.getPrimary();

            // ReplicaNode exposes its liveness through heartbeat().
            if (!primary.heartbeat()) {

                promoteLeader(partition);
            }
        }
    }

    /**
     * Promote the first healthy replica.
     */
    private void promoteLeader(Partition partition) {

        for (ReplicaNode replica : partition.getReplicas()) {

            if (replica.heartbeat()) {

                partition.setPrimary(replica);

                System.out.println(
                        replica.getNodeId()
                        + " promoted as PRIMARY for Partition "
                        + partition.getPartitionId());

                return;
            }
        }

        System.out.println(
                "No healthy replicas available for Partition "
                + partition.getPartitionId());
    }
}

PartitionStrategy: ConsistentHashRing

public interface PartitionStrategy {

    int getPartitionId(int key);

    void addPartition(int partitionId);

    void removePartition(int partitionId);
}

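import java.util.Map;
import java.util.TreeMap;

public class ConsistentHashRing implements PartitionStrategy {

    // Virtual nodes per partition. 100 is an assumed value, not from the original.
    private static final int VIRTUAL_NODES = 100;

    // Ring position -> partition id. TreeMap is not thread-safe, so guard
    // concurrent membership changes externally.
    private final TreeMap<Integer, Integer> ring = new TreeMap<>();

    @Override
    public int getPartitionId(int key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("No partitions registered");
        }

        int hash = hash(String.valueOf(key));

        // First virtual node clockwise from the key's position.
        Map.Entry<Integer, Integer> entry =
                ring.ceilingEntry(hash);

        // Past the last node: wrap around to the start of the ring.
        if (entry == null) {
            entry = ring.firstEntry();
        }

        return entry.getValue();
    }

    @Override
    public void addPartition(int partitionId) {
        // Add virtual nodes
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(partitionId + "#" + i), partitionId);
        }
    }

    @Override
    public void removePartition(int partitionId) {
        // Remove virtual nodes
        ring.values().removeIf(id -> id == partitionId);
    }

    private int hash(String value) {
        // Mask the sign bit; Math.abs(Integer.MIN_VALUE) is still negative.
        return value.hashCode() & 0x7fffffff;
    }
}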
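
To illustrate why a ring is preferred over plain modulo partitioning, the sketch below counts how many keys change partition when a fifth partition is added to four. It is a standalone demo, not part of the design above: RebalanceDemo, its helper methods, the MD5-based hash and the figure of 100 virtual nodes are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo: compares key movement when growing from 4 to 5 partitions.
class RebalanceDemo {

    static final int VIRTUAL_NODES = 100; // assumed value

    // MD5 is used only for its even spread; it is not the article's hash.
    static int hash(String value) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                    | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return h & 0x7fffffff; // keep ring positions non-negative
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a ring with VIRTUAL_NODES positions per partition.
    static TreeMap<Integer, Integer> buildRing(int partitionCount) {
        TreeMap<Integer, Integer> ring = new TreeMap<>();
        for (int p = 0; p < partitionCount; p++) {
            for (int v = 0; v < VIRTUAL_NODES; v++) {
                ring.put(hash("partition-" + p + "#" + v), p);
            }
        }
        return ring;
    }

    // Finds the first virtual node clockwise from the key, wrapping at the end.
    static int lookup(TreeMap<Integer, Integer> ring, int key) {
        Map.Entry<Integer, Integer> e =
                ring.ceilingEntry(hash(String.valueOf(key)));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Keys in [0, keyCount) whose partition changes on the ring. */
    static int movedOnRing(int keyCount) {
        TreeMap<Integer, Integer> before = buildRing(4);
        TreeMap<Integer, Integer> after = buildRing(5);
        int moved = 0;
        for (int key = 0; key < keyCount; key++) {
            if (lookup(before, key) != lookup(after, key)) {
                moved++;
            }
        }
        return moved;
    }

    /** Keys in [0, keyCount) whose partition changes under key % N. */
    static int movedWithModulo(int keyCount) {
        int moved = 0;
        for (int key = 0; key < keyCount; key++) {
            if (key % 4 != key % 5) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println("ring moved:   " + movedOnRing(10_000));
        System.out.println("modulo moved: " + movedWithModulo(10_000));
    }
}
```

For keys 0 to 9,999 the modulo scheme reassigns 8,000 keys, because a key keeps its partition only when key % 4 equals key % 5. On the ring, only keys claimed by the new partition's virtual nodes move, which should be roughly a fifth of them; the exact count depends on the hash function.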

NOTE: If you want to make the design even more extensible, separate the hash algorithm from the partitioning strategy. This lets you change the hashing algorithm without modifying the core implementation.
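
One way to realize the separation described in this note, shown as a minimal sketch: the ring receives its hash function through the constructor. HashFunction, Crc32Hash and PluggableHashRing are hypothetical names introduced here, and the ring is simplified to one position per partition to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical abstraction over the hash algorithm.
interface HashFunction {
    int hash(String value);
}

// One possible implementation, based on java.util.zip.CRC32.
class Crc32Hash implements HashFunction {
    @Override
    public int hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        // CRC32 yields an unsigned 32-bit value; drop the top bit to stay non-negative.
        return (int) (crc.getValue() & 0x7fffffff);
    }
}

// Simplified ring (no virtual nodes) that delegates hashing to the injected function.
class PluggableHashRing {

    private final TreeMap<Integer, Integer> ring = new TreeMap<>();
    private final HashFunction hashFunction;

    PluggableHashRing(HashFunction hashFunction) {
        this.hashFunction = hashFunction;
    }

    void addPartition(int partitionId) {
        ring.put(hashFunction.hash("partition-" + partitionId), partitionId);
    }

    int getPartitionId(int key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("No partitions registered");
        }
        Map.Entry<Integer, Integer> entry =
                ring.ceilingEntry(hashFunction.hash(String.valueOf(key)));
        // Wrap around when the key hashes past the last position.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

Because HashFunction has a single method, a lambda such as `value -> value.hashCode() & 0x7fffffff` can be passed instead of `new Crc32Hash()`, which also makes the ring easy to test with a deterministic stub.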

Partition

A partition represents one shard of the cache.

import java.util.List;

public class Partition {

    private final int partitionId;

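    // Not final: PartitionManager.promoteLeader() reassigns it on failover.
    // volatile makes the change visible to request threads immediately.
    private volatile ReplicaNode primary;

    private final List<ReplicaNode> replicas;

    public Partition(int partitionId,
                     ReplicaNode primary,
                     List<ReplicaNode> replicas) {

        this.partitionId = partitionId;
        this.primary = primary;
        this.replicas = replicas;
    }

    public int getPartitionId() {
        return partitionId;
    }

    public ReplicaNode getPrimary() {
        return primary;
    }

    public void setPrimary(ReplicaNode primary) {
        this.primary = primary;
    }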

    public List<ReplicaNode> getReplicas() {
        return replicas;
    }
}

ReplicaNode

A replica is simply a wrapper around an LRU cache.

public class ReplicaNode {

    private final String nodeId;

    private final LRUCache cache;

    private volatile boolean alive = true;

    public ReplicaNode(String nodeId, int capacity) {
        this.nodeId = nodeId;
        this.cache = new LRUCache(capacity);
    }

    /**
     * Store key-value pair in this node.
     */
    public void put(int key, int value) {

        if (!alive) {
            throw new RuntimeException(nodeId + " is DOWN");
        }

        cache.put(key, value);
    }

    /**
     * Read value from this node.
     */
    public int get(int key) {

        if (!alive) {
            throw new RuntimeException(nodeId + " is DOWN");
        }

        return cache.get(key);
    }

    public String getNodeId() {
        return nodeId;
    }

    /**
     * Simulate node failure.
     */
    public void shutdown() {
        alive = false;
    }

    /**
     * Simulate node recovery.
     */
    public void recover() {
        alive = true;
    }

    /**
     * Used by heartbeat service.
     */
    public boolean heartbeat() {
        return alive;
    }
}
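
ReplicaNode depends on an LRUCache class that is not shown. A minimal sketch of what it could look like follows, built on LinkedHashMap in access order so that get and put stay O(1) on average. Returning -1 on a miss and synchronizing every method are assumptions made here, not choices stated in the original.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the LRUCache used by ReplicaNode; the original does not define it.
class LRUCache {

    private static final int MISSING = -1; // assumed sentinel for a cache miss

    private final Map<Integer, Integer> entries;

    LRUCache(final int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        // accessOrder = true moves an entry to the tail on every get or put,
        // so the head of the map is always the least recently used entry.
        this.entries = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    // Synchronized because request threads and the replication worker may
    // touch the same node, and LinkedHashMap reorders entries even on reads.
    synchronized int get(int key) {
        Integer value = entries.get(key);
        return value == null ? MISSING : value;
    }

    synchronized void put(int key, int value) {
        entries.put(key, value);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a capacity of 2, inserting keys 1 and 2, reading key 1, and then inserting key 3 evicts key 2, because key 1 was touched more recently.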

How to Handle Data Replication?

This service is responsible for replicating data to all replicas of a partition.


public class ReplicationEvent {
    private final Partition partition;
    private final int key;
    private final int value;
    private final long version;
    ...
}

public interface ReplicationService {

    void replicate(ReplicationEvent event);
}

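import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncReplicationService
        implements ReplicationService {

    // Events waiting to be applied to replicas.
    private final BlockingQueue<ReplicationEvent> replicationQueue =
            new LinkedBlockingQueue<>();

    // Replica writes that failed and should be attempted again.
    private final BlockingQueue<RetryTask> retryQueue =
            new LinkedBlockingQueue<>();

    @Override
    public void replicate(ReplicationEvent event) {
        // Enqueue only; the queue is unbounded, so offer() never blocks the caller.
        replicationQueue.offer(event);
    }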

    public BlockingQueue<ReplicationEvent> getReplicationQueue() {
        return replicationQueue;
    }

    public BlockingQueue<RetryTask> getRetryQueue() {
        return retryQueue;
    }
}

NOTE: The service itself doesn't replicate. It only accepts events and queues them; a ReplicationWorker drains the queue and writes to the replicas.

ReplicationWorker

public class ReplicationWorker implements Runnable {

    private final AsyncReplicationService replicationService;

    public ReplicationWorker(
            AsyncReplicationService replicationService) {

        this.replicationService = replicationService;
    }

    @Override
    public void run() {

        while (!Thread.currentThread().isInterrupted()) {

            try {

                ReplicationEvent event =
                        replicationService
                                .getReplicationQueue()
                                .take();

                replicate(event);

            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    private void replicate(ReplicationEvent event) {

        for (ReplicaNode replica :
                event.getPartition().getReplicas()) {

            try {

                // ReplicationEvent stores the payload in its `value` field.
                replica.put(
                        event.getKey(),
                        event.getValue());

            } catch (Exception ex) {

                replicationService
                        .getRetryQueue()
                        .offer(new RetryTask(
                                replica,
                                event));
            }
        }
    }
}
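
The worker above pushes failed writes onto a retry queue, but nothing in the listing consumes it. The sketch below shows one possible consumer with a bounded number of attempts. It is deliberately simplified: SimpleRetryTask wraps the failed write as a Runnable instead of holding a ReplicaNode and a ReplicationEvent, and both SimpleRetryTask and RetryProcessor are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified stand-in for RetryTask: the failed write plus an attempt counter.
class SimpleRetryTask {
    final Runnable write;
    int attempts;

    SimpleRetryTask(Runnable write) {
        this.write = write;
    }
}

// Hypothetical consumer that retries each task up to maxAttempts times.
class RetryProcessor {

    private final BlockingQueue<SimpleRetryTask> retryQueue;
    private final int maxAttempts;

    RetryProcessor(BlockingQueue<SimpleRetryTask> retryQueue, int maxAttempts) {
        this.retryQueue = retryQueue;
        this.maxAttempts = maxAttempts;
    }

    /**
     * Runs every task that is queued right now exactly once.
     * Failed tasks are re-queued until they reach maxAttempts.
     * Returns the number of tasks dropped in this pass.
     */
    int drainOnce() {
        int dropped = 0;
        int pending = retryQueue.size();
        for (int i = 0; i < pending; i++) {
            SimpleRetryTask task = retryQueue.poll();
            if (task == null) {
                break; // another consumer emptied the queue
            }
            try {
                task.write.run();
            } catch (RuntimeException ex) {
                task.attempts++;
                if (task.attempts < maxAttempts) {
                    retryQueue.offer(task); // try again on a later pass
                } else {
                    dropped++; // give up; a real system might log or alert here
                }
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        BlockingQueue<SimpleRetryTask> queue = new LinkedBlockingQueue<>();
        RetryProcessor processor = new RetryProcessor(queue, 3);
        queue.offer(new SimpleRetryTask(() -> {
            throw new IllegalStateException("replica is DOWN");
        }));
        for (int pass = 1; pass <= 3; pass++) {
            System.out.println("pass " + pass + " dropped " + processor.drainOnce());
        }
    }
}
```

Calling drainOnce from a scheduled task spaces the attempts out, which gives a recovering node time to come back. Capping the attempts keeps a permanently failed replica from growing the queue without bound.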
