
The Banker's Algorithm for Distributed Systems

January 24, 2026 • by Bobby Mathews

TL;DR Classical OS theory offers a powerful lens for distributed systems design. The Banker's Algorithm—Dijkstra's 1965 deadlock avoidance strategy—provides the foundation for what I call Observability-Driven Development: systems that observe their entire state space before acting, never entering states they cannot safely exit. This post catalogs the production patterns that emerge from this principle.

The Insight: OS Theory Scales Up

In 1965, Edsger Dijkstra introduced the Banker's Algorithm for deadlock avoidance in operating systems. The core insight was deceptively simple: before granting any resource request, simulate the allocation and verify the resulting state is safe.

A "safe state" means all processes can eventually complete. If granting a request would leave the system in an unsafe state—where deadlock becomes possible—the request is denied, and the process waits.

The Banker's Algorithm (OS Theory, 1965)
1. Process requests resources
2. Algorithm SIMULATES granting the request
3. Checks if resulting state is SAFE
   (safe = exists a sequence where all processes complete)
4. IF safe → Grant request
5. ELSE → Deny request, process waits

Invariant: Never enter a state you cannot safely exit.
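
For concreteness, here is a minimal sketch of the classical safety check in Rust. It is illustrative only: the matrix names (available, max_demand, allocation) follow the textbook formulation, not any Synkti code.

// Classical Banker's safety check (illustrative sketch)
fn is_safe(available: &[u32], max_demand: &[Vec<u32>], allocation: &[Vec<u32>]) -> bool {
    let mut work = available.to_vec();               // resources currently free
    let mut finished = vec![false; allocation.len()];

    loop {
        // Find a process whose remaining need fits in what is free right now.
        let candidate = (0..allocation.len()).find(|&p| {
            !finished[p]
                && (0..work.len()).all(|r| max_demand[p][r] - allocation[p][r] <= work[r])
        });

        match candidate {
            Some(p) => {
                // Simulate p running to completion and releasing its allocation.
                for r in 0..work.len() {
                    work[r] += allocation[p][r];
                }
                finished[p] = true;
            }
            // Safe if and only if every process could be ordered to completion.
            None => return finished.iter().all(|&f| f),
        }
    }
}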

Sixty years later, I found myself applying the same principle to distributed systems running on volatile cloud infrastructure. The domains are different—processes and memory vs. containers and cloud APIs—but the invariant is identical.

The Core Insight: Dijkstra's invariant—"never enter a state you cannot safely exit"—translates directly to distributed systems. Observe the entire state space before traversing it. Don't launch instances until you've verified the infrastructure exists. Don't start failover until you've confirmed a replacement is available. The system should never fail blindly.

Observability-Driven Development (ODD)

Traditional distributed systems fail cryptically. A deployment times out. A health check fails. An instance terminates unexpectedly. The operator sees Error: deadline exceeded and begins the debugging ritual: check logs, check metrics, check configs, guess.

Observability-Driven Development inverts this. Before executing any operation, the system observes its complete state space and verifies the path ahead is clear. If preconditions aren't met, the system doesn't fail—it guides the operator to resolution.

Synkti: Banker's Algorithm Applied to Distributed Systems
User invokes: synkti --project-name my-app

1. System observes entire state space:
   ├── Where am I running? (EC2 instance or local machine?)
   ├── Does infrastructure exist? (S3 buckets, IAM roles, security groups)
   ├── Is the orchestrator binary in S3?
   ├── Are model weights uploaded?
   └── Are peer instances discoverable?

2. Checks if state is SAFE (all dependencies present)

3. IF safe → Execute (start orchestrator or monitoring)
4. ELSE → Don't proceed. Guide user to safe state:

   "Missing: orchestrator binary in S3
    Run: ./scripts/upload-binary.sh --project-name my-app"

Invariant: Never enter a state you cannot safely complete.

The mapping to Dijkstra's algorithm is direct:

Banker's Algorithm            Synkti (Distributed Systems)
Available resources matrix    S3 buckets, IAM roles, running instances
Maximum demand matrix         Required dependencies (binary, model weights)
Allocation matrix             Current state (infrastructure created? deps uploaded?)
Safe state check              is_safe_to_proceed() verification
Grant/Deny request            Execute operation / Guide user to fix

The result: no blind failures. Every error message tells you exactly what's wrong and how to fix it. The system self-diagnoses before attempting operations that would fail.
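To make the grant/deny row concrete, here is a hedged sketch of what such a gate can look like. The Precondition struct and the sample probe are illustrations of the pattern, not Synkti's actual is_safe_to_proceed().

// Observe-then-act gate (hypothetical sketch of the pattern)
struct Precondition {
    description: &'static str, // what is missing if the probe fails
    fix: &'static str,         // the exact command that repairs it
    satisfied: bool,           // result of the probe
}

fn is_safe_to_proceed(checks: &[Precondition]) -> Result<(), String> {
    // Unsafe state: refuse to proceed and guide the operator instead of failing blindly.
    if let Some(c) = checks.iter().find(|c| !c.satisfied) {
        return Err(format!("Missing: {}\n Run: {}", c.description, c.fix));
    }
    Ok(())
}

fn main() {
    let checks = [Precondition {
        description: "orchestrator binary in S3",
        fix: "./scripts/upload-binary.sh --project-name my-app",
        satisfied: false, // pretend the S3 probe failed
    }];

    match is_safe_to_proceed(&checks) {
        Ok(()) => println!("State is safe: executing."),
        Err(guidance) => eprintln!("{guidance}"),
    }
}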

P2P Choreography vs. Centralized Orchestration

Kubernetes solved container orchestration with a centralized control plane: an API server, etcd for state, a scheduler, and various controllers. This architecture has a fundamental flaw: state drift.

The control plane maintains a model of the cluster in etcd—a representation of what it believes is true. But the map is not the territory. Reality exists at the nodes. The model is always an approximation, always slightly stale, always drifting from truth.

The State Drift Problem
┌─────────────────────────────────────────────────────────────┐
│  Control Plane (API Server + etcd)                          │
│  "Desired state: 5 replicas, all healthy"                   │
└─────────────────────────────────────────────────────────────┘
                    │ reconciliation loop
                    ▼
┌─────────────────────────────────────────────────────────────┐
│  Reality (Nodes)                                            │
│  "Actual state: 3 running, 1 pending, 1 OOMKilled"          │
└─────────────────────────────────────────────────────────────┘

The control plane's model diverges from reality.
Reconciliation is perpetually playing catch-up.
Network partitions turn drift into divergence.

This creates the reconciliation tax: continuous CPU cycles diffing desired vs. actual state, network bandwidth syncing state to the center, exponential complexity handling edge cases where the model has diverged from reality.

The Alternative: Truth at the Edge

In P2P choreography, there is no central model to drift. Each node IS the authoritative source of its own state. When you need to know a node's status, you ask the node. No stale cache. No fiction.

P2P Choreography Architecture
┌───────────────────────────────────────────────────────────┐
│                 Kubernetes (Centralized)                  │
│                                                           │
│  ┌─────────────┐                                          │
│  │ Control     │◄─── reconciliation loop ───►             │
│  │ Plane       │     (continuous overhead)                │
│  └──────┬──────┘                                          │
│         │ commands                                        │
│  ┌──────┴──────┬──────────┬──────────┐                    │
│  ▼             ▼          ▼          ▼                    │
│  Node A      Node B     Node C     Node D                 │
│  (passive)   (passive)  (passive)  (passive)              │
│                                                           │
│  Single source of truth: Control Plane                    │
│  Failure mode: SPOF, state drift, split-brain             │
└───────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────┐
│                 Synkti (P2P Choreography)                 │
│                                                           │
│  ┌────────────┐    ┌────────────┐    ┌────────────┐       │
│  │  Node A    │◄──►│  Node B    │◄──►│  Node C    │       │
│  │  (self-    │    │  (self-    │    │  (self-    │       │
│  │  governing)│    │  governing)│    │  governing)│       │
│  └────────────┘    └────────────┘    └────────────┘       │
│        │                 │                 │              │
│        └─────────────────┴─────────────────┘              │
│                          │                                │
│                EC2 Tags (discovery)                       │
│                SynktiCluster=my-app                       │
│                SynktiRole=worker                          │
│                                                           │
│  Source of truth: Each node (distributed)                 │
│  Failure mode: None (no central point of failure)         │
└───────────────────────────────────────────────────────────┘

Each Synkti node:

  1. Tags itself as a cluster member on startup
  2. Discovers peers by querying cloud provider APIs
  3. Makes autonomous decisions about failover and load balancing
  4. Untags itself on graceful shutdown

No central coordinator means no single point of failure, no single point of control, and no state to drift. The "cluster state" is simply the set of tagged instances—always consistent with reality by construction.
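
A discovery sketch, assuming the aws-sdk-ec2 crate (accessor shapes vary slightly across SDK releases); the SynktiCluster tag key comes from the diagram above, and the rest is illustrative rather than Synkti source:

// Tag-based peer discovery (sketch; assumes the aws-sdk-ec2 crate)
use aws_sdk_ec2::types::Filter;

async fn discover_peers(cluster: &str) -> Result<Vec<String>, aws_sdk_ec2::Error> {
    let config = aws_config::load_from_env().await;
    let client = aws_sdk_ec2::Client::new(&config);

    // Ask EC2 itself, not a central store: the tagged instances ARE the cluster state.
    let resp = client
        .describe_instances()
        .filters(Filter::builder().name("tag:SynktiCluster").values(cluster).build())
        .filters(Filter::builder().name("instance-state-name").values("running").build())
        .send()
        .await?;

    // Collect peer addresses straight from the API response.
    let peers = resp
        .reservations()
        .iter()
        .flat_map(|r| r.instances())
        .filter_map(|i| i.private_ip_address().map(str::to_string))
        .collect();

    Ok(peers)
}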

RAII for Cloud Infrastructure

In C++, RAII (Resource Acquisition Is Initialization) ties resource lifetime to object scope. When an object goes out of scope, its destructor runs and resources are freed automatically. No manual cleanup. No resource leaks.

// C++ RAII
{
    std::vector<int> v(1000);  // Acquire memory
    // Use v
}  // Memory automatically freed (destructor runs)

The same principle applies to cloud infrastructure:

// Rust RAII for cloud infrastructure
{
    let infra = Infrastructure::new("my-experiment").await?;
    // Use infra (S3 buckets, IAM roles, instances)
}  // Infrastructure automatically destroyed (Drop runs)

This inverts the traditional mental model. Infrastructure is not real estate you accumulate—it's a library dependency you borrow. When you're done, you return it.
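
Here is a slightly fuller sketch of what such a guard can look like, with a hypothetical Infrastructure type. Note that Drop in Rust is synchronous, so a real implementation would either block on a runtime handle during teardown or expose an explicit async destroy() and use Drop mainly as a leak detector.

// RAII guard for cloud resources (hypothetical sketch)
struct Infrastructure {
    project: String,
}

impl Infrastructure {
    fn new(project: &str) -> Self {
        // Acquire: create buckets, roles, and instances here (omitted).
        println!("provisioning infrastructure for {project}");
        Infrastructure { project: project.to_string() }
    }
}

impl Drop for Infrastructure {
    fn drop(&mut self) {
        // Release: tear down everything this scope created.
        // Drop cannot be async, so real teardown would block on a runtime
        // handle or be queued behind an explicit async destroy().
        println!("destroying infrastructure for {}", self.project);
    }
}

fn main() {
    {
        let _infra = Infrastructure::new("my-experiment");
        // use the infrastructure ...
    } // Drop runs here: the borrowed resources are returned
}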

Runtime Compilation: The Cluster That Refuses to Run Broken Code

When a Synkti worker panics or exits abnormally, it immediately terminates its own instance. This isn't a bug—it's a feature. The cluster is "in compilation stage," refusing to run in a degraded state. Like Rust's compiler enforcing memory safety at compile time, Synkti enforces system health at runtime. Each failure uploads logs to S3 for diagnosis before self-terminating. The system returns borrowed resources (spot instances) rather than persisting in a broken state. This "fail fast" philosophy means bugs are discovered and fixed early, not ignored until they cascade into production outages.
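
A sketch of that fail-fast path: a panic hook that records the failure and powers the machine off. It assumes the instance was launched with instance-initiated-shutdown-behavior set to terminate; the S3 log upload is left as a placeholder comment.

// Fail-fast panic hook (sketch; assumes shutdown-behavior = terminate)
use std::process::Command;

fn install_fail_fast_hook() {
    std::panic::set_hook(Box::new(|info| {
        eprintln!("fatal: {info}; returning this instance to the pool");
        // Upload logs to S3 for post-mortem diagnosis here (omitted).
        // Powering off ends the lease on the borrowed spot instance.
        let _ = Command::new("shutdown").args(["-h", "now"]).status();
    }));
}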
Resource Type           Lifetime    Management          Rationale
S3 orchestrator binary  Permanent   Manual              Intelligence—versioned
S3 model weights        Permanent   Manual              Intelligence—expensive to re-download
S3 checkpoints          Ephemeral   Automatic           Runtime state—auto-expires
IAM roles               Permanent   Manual (one-time)   Configuration—propagation delay
Instance profile        Permanent   Manual (one-time)   Configuration—tied to IAM
Security groups         Permanent   Manual (one-time)   Configuration—$0 cost
EC2 instances           Ephemeral   Automatic           Dumb compute—managed by AWS SDK

The pattern: intelligence is permanent, compute is ephemeral. Model weights, orchestrator logic, IAM permissions—these are carefully managed assets that persist across deployments. EC2 instances are disposable workers, abstracted away; all that outlives them is the work they produce.

The result: no resource leaks, no accumulating cloud bills from forgotten test infrastructure, and a clean slate for each deployment.

The Fungible Compute Principle

Traditional operations treats servers as pets. You name them, configure them carefully, nurse them when sick. When one dies, it's a crisis requiring human intervention.

Fungible compute inverts this. Instances are disposable. Intelligence persists elsewhere.

Intelligence Persists, Compute is Disposable
┌───────────────────────────────────────────────────────────┐
│               PERMANENT INTELLIGENCE (S3)                 │
│  s3://my-project-models/                                  │
│    ├── bin/synkti           ← Orchestrator logic          │
│    └── qwen2.5-7b/          ← Model weights               │
└───────────────────────────────────────────────────────────┘
                            │
                            │ Downloaded on boot
                            ▼
┌───────────────────────────────────────────────────────────┐
│                    FUNGIBLE COMPUTE                       │
│  ┌────────────┐   ┌────────────┐   ┌────────────┐         │
│  │ EC2 Spot   │   │ EC2 Spot   │   │ EC2 Spot   │         │
│  │ Instance   │   │ Instance   │   │ Instance   │         │
│  │            │   │            │   │            │         │
│  │ Disposable │   │ Disposable │   │ Disposable │         │
│  └────────────┘   └────────────┘   └────────────┘         │
│                                                           │
│  Any instance can be deleted/replaced at any time.        │
│  The intelligence (orchestration logic, models) persists. │
└───────────────────────────────────────────────────────────┘

Each instance boots with a simple user-data script:

# 1. Download orchestrator from S3
aws s3 cp s3://${project}-models/bin/synkti /usr/local/bin/synkti

# 2. Download model weights
aws s3 sync s3://${project}-models/qwen2.5-7b/ /models/

# 3. Start orchestrator (which observes context and acts accordingly)
synkti --project-name ${project}

When a spot instance terminates, nothing of value is lost. A replacement boots, downloads the same intelligence from S3, discovers its peers, and joins the cluster. No configuration drift. No special recovery procedures. No operator intervention.

Stateless Failover

Stateful failover is operationally expensive. When a node fails, you must: detect the failure, find a healthy replacement, replicate state (potentially gigabytes of data), switch traffic, and hope nothing was lost in transit.

For GPU workloads, the third step, state replication, is often impossible. Docker checkpoint (CRIU) cannot serialize GPU memory. The KV cache for a 7B parameter model is ~14GB. Replicating that within a 2-minute spot termination notice is... optimistic.

Stateless failover sidesteps the problem entirely. Don't replicate state—spawn a fresh replacement.

Stateless Failover Timeline
Spot Termination Notice (120 seconds)
              │
              ▼
    ┌─────────────────┐
    │ 1. DRAIN (~5s)  │  Stop accepting new requests
    │                 │  Let in-flight requests complete
    └────────┬────────┘
              │
              ▼
    ┌─────────────────┐
    │ 2. SELECT (~1s) │  Query peers, pick replacement AZ
    └────────┬────────┘
              │
              ▼
    ┌─────────────────┐
    │ 3. SPAWN (~35s) │  Launch new spot instance
    │                 │  Download binary + model from S3
    └────────┬────────┘
              │
              ▼
    ┌─────────────────┐
    │ 4. HEALTH (~2s) │  Verify vLLM is ready
    │                 │  Register with load balancer
    └────────┬────────┘
              │
              ▼
         Traffic restored

Total: ~45 seconds (well within 120s grace period)
No state replication. Fresh instance, fresh start.

This works because intelligence persists outside the instance: the orchestrator binary and model weights live in S3, membership is just EC2 tags rather than replicated state, and any fresh instance can download the same intelligence and take over. There is nothing on the dying node worth copying.
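
Detecting the notice itself is cheap. Here is a sketch of the probe, assuming the reqwest crate; the URL is AWS's documented spot instance-action metadata path, and IMDSv2 token handling is omitted for brevity.

// Spot termination probe (sketch; assumes the reqwest crate, IMDSv2 token omitted)
async fn termination_notice_pending(client: &reqwest::Client) -> bool {
    // The endpoint returns 404 until AWS schedules a termination,
    // then a small JSON body with the action and its deadline.
    client
        .get("http://169.254.169.254/latest/meta-data/spot/instance-action")
        .send()
        .await
        .map(|resp| resp.status().is_success())
        .unwrap_or(false)
}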

The Responsible Intelligence Binary

Traditional tooling fragments responsibility across multiple binaries: kubectl for cluster management, kubelet for node agents, helm for package management, terraform for infrastructure. The operator must know which tool to use, where to run it, and in what sequence.

A responsible intelligence binary knows where it is and what it should do. It observes its context, infers its role, and acts appropriately—without configuration files telling it which mode to operate in.

One Binary, Two Roles
$ synkti --project-name my-app

┌───────────────────────────────────────────────────────────┐
│  Binary observes: "Where am I?"                           │
│                                                           │
│  ┌────────────────────┐    ┌────────────────────┐         │
│  │   LOCAL MACHINE    │    │    EC2 INSTANCE    │         │
│  │                    │    │                    │         │
│  │  Role: Deployer    │    │  Role: Worker      │         │
│  │                    │    │                    │         │
│  │  • Manage infra    │    │  • Join cluster    │         │
│  │  • Run terraform   │    │  • Discover peers  │         │
│  │  • Launch workers  │    │  • Run orchestrator│         │
│  │  • Show dashboard  │    │  • Self-terminate  │         │
│  │                    │    │    on failure      │         │
│  └────────────────────┘    └────────────────────┘         │
│                                                           │
│  No configuration. No mode flags. The binary just knows.  │
└───────────────────────────────────────────────────────────┘

The EC2 worker doesn't need Terraform because the infrastructure was already created when you ran the binary locally. The worker's responsibilities are minimal and self-contained:

  1. Discover peers via EC2 tags (P2P)
  2. Run its workload (vLLM inference)
  3. Self-terminate if it crashes (RAII)

The binary detects its context through layered observation—the same Banker's Algorithm principle, now applied at the application level:

async fn is_running_on_ec2() -> bool {
    // Layer 1: IMDSv2 token endpoint
    if check_imdsv2_token().await { return true; }

    // Layer 2: Instance identity document
    if check_instance_identity().await { return true; }

    // Layer 3: System UUID pattern
    if check_system_uuid() { return true; }

    false
}

// The binary observes its state space before deciding its role
let on_ec2 = is_running_on_ec2().await;

if on_ec2 {
    // Worker mode: minimal responsibilities, self-managing
    run_orchestrator(project).await
} else {
    // Deployer mode: infrastructure management available
    deploy_and_monitor(project).await
}

This is responsible intelligence in two senses: the binary is responsible for knowing its own role, and it behaves responsibly within that role—workers don't try to manage infrastructure they shouldn't touch, deployers don't try to join clusters they're not part of.

The design is also extensible. The same binary can become a CLI tool, a worker, a monitoring agent, or anything else—determined by context observation, not hardcoded roles. Future capabilities are added by extending the observation logic, not by creating new binaries.

The Narrative Arc: Notice how Dijkstra's invariant applies at every layer. The system observes before acting (ODD). The infrastructure scopes its lifetime (RAII). The compute is fungible while intelligence persists. Failover is stateless because state is the enemy of fungibility. And the binary knows its role by observing its context (Responsible Intelligence). Same principle—never enter a state you cannot safely exit—applied at different scales. The Banker's Algorithm is fractal.

The Philosophy: Five Principles

These patterns share an underlying philosophy:

  1. Simplicity over complexity—P2P is simpler than control planes
  2. Deletion over reconciliation—Replace broken nodes, don't fix them
  3. Observation over assumption—Know the state space before acting
  4. Primitives over abstractions—S3 + EC2 tags, not custom databases
  5. Scoped lifetimes over manual cleanup—RAII for infrastructure

The result is a system whose behavior can be summed up in one sentence:

"The system observes itself before acting. It never fails blindly—it knows what it needs and tells you exactly what's missing."

This is what Dijkstra understood in 1965, applied to the cloud era: never enter a state you cannot safely exit. The Banker's Algorithm scales from processes to distributed systems. The invariant holds.
