#11

Distributed Rate Limiter

January 30, 2026

Rust · Redis · GCP · Terraform · Token Bucket

High-performance distributed rate limiter built in Rust + Redis using the Token Bucket algorithm. Sub-millisecond latency, millions of checks per second. Deployed on GCP with Terraform + Memorystore.

What is it?

A high-performance distributed rate limiter that handles millions of checks per second with sub-millisecond latency. Built in Rust for the HTTP hot path, with Redis (GCP Memorystore) as the shared atomic state, so limits are enforced consistently across multiple distributed service instances.

How it works

A Rust HTTP service exposes POST /check with a client identifier. It reads the client's token bucket from Redis atomically, checks token availability, decrements if allowed, and returns allow/deny in under 1ms. Deployed on GCP with Terraform + Memorystore (managed Redis with automatic failover).
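For illustration, the request/response exchange might look like the following. The field names (`client_id`, `allowed`, `remaining`) are assumptions for the sketch, not a documented API:

```
POST /check
{"client_id": "acme-prod"}

HTTP/1.1 200 OK
{"allowed": true, "remaining": 42}
```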

Token bucket vs the alternatives

Token Bucket: each client gets a bucket of N tokens, refilled at rate R tokens/sec; each request consumes one token. Burst-friendly: up to N simultaneous requests are allowed when the bucket is full.

Leaky Bucket: requests drain at a fixed rate. No bursting. Adds queuing latency.

Fixed Window Counter: count requests per fixed window (e.g. 1s). Simple, but a burst straddling a window boundary (end of one window plus start of the next) can pass up to 2× the limit.

Sliding Window Log: exact but stores every request timestamp — memory-heavy at scale.

Token Bucket is ideal for API rate limiting: it permits controlled bursting while enforcing a long-term average rate.
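The algorithm above fits in a few lines. Here is a minimal single-process sketch in Rust; the field names and the lazy refill-on-read strategy are illustrative, not the project's actual code (the distributed version moves this state into Redis):

```rust
/// Minimal token bucket: capacity N, refilled at R tokens/sec.
struct TokenBucket {
    capacity: f64,    // N: maximum burst size
    tokens: f64,      // current token count
    refill_rate: f64, // R: tokens added per second
    last_refill: f64, // timestamp of the last refill, in seconds
}

impl TokenBucket {
    fn new(capacity: f64, refill_rate: f64, now: f64) -> Self {
        // A new bucket starts full, so a fresh client can burst immediately.
        TokenBucket { capacity, tokens: capacity, refill_rate, last_refill: now }
    }

    /// Lazily refill based on elapsed time, then try to take one token.
    fn check(&mut self, now: f64) -> bool {
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_rate).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // A 5-token bucket refilled at 1 token/sec: a burst of 5 passes, the 6th is denied.
    let mut bucket = TokenBucket::new(5.0, 1.0, 0.0);
    let burst: Vec<bool> = (0..6).map(|_| bucket.check(0.0)).collect();
    println!("{:?}", burst); // [true, true, true, true, true, false]

    // After 2 simulated seconds, ~2 tokens have refilled: two allows, then a deny.
    assert!(bucket.check(2.0));
    assert!(bucket.check(2.0));
    assert!(!bucket.check(2.0));
}
```

Note the lazy refill: nothing runs on a timer. Each check computes how many tokens accrued since the last one, which is also what makes the approach cheap to port into a Redis script.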

Redis atomicity with Lua scripts

The race condition: two Rust instances read the same bucket simultaneously, both see 1 token, both allow, both write 0. Two requests served with one token — classic TOCTOU bug.

The fix: a Redis Lua script, which runs atomically on the Redis server. No other command executes between its read and its write. The script reads the bucket, checks tokens, updates the value, and returns allow/deny as a single atomic operation. WATCH/MULTI/EXEC transactions can achieve the same effect, but they require optimistic retries under contention; a Lua script handles the conditional logic in one round-trip.
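A sketch of what such a script can look like. The key layout, argument names, millisecond clock source, and 60s idle expiry are assumptions for illustration, not the project's actual script:

```lua
-- KEYS[1]: per-client bucket hash, e.g. "bucket:{client_id}"
-- ARGV[1]: capacity N, ARGV[2]: refill rate R (tokens/sec), ARGV[3]: now (ms)
local state    = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])
local now      = tonumber(ARGV[3])
local tokens   = tonumber(state[1]) or capacity  -- unseen client starts full
local ts       = tonumber(state[2]) or now

-- Lazy refill: add R tokens per elapsed second, capped at capacity.
tokens = math.min(capacity, tokens + (now - ts) / 1000 * rate)

local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end

-- Read, check, and write all happen inside one atomic script invocation.
redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
redis.call('PEXPIRE', KEYS[1], 60000)  -- drop idle buckets after 60s
return allowed
```

Because the whole read-check-write runs server-side in one invocation, two Rust instances calling it concurrently can never both spend the same last token.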

Why Rust?

Rate limit checks are on the hot path — every API request goes through the limiter. The HTTP overhead needs to be near zero. Rust's async runtime (Tokio) handles tens of thousands of concurrent connections without GC pauses. The combination of Rust + Redis Lua means the bottleneck is always the Redis round-trip (~0.1ms on Memorystore), not the application logic.

Key takeaways

  • Token bucket algorithm: burst capacity, refill rate, when to prefer it over alternatives
  • Redis Lua scripts: atomicity, why TOCTOU races happen and how Lua prevents them
  • GCP Memorystore: managed Redis with VPC peering, automatic failover
  • Rust + Tokio: async HTTP serving with zero GC pauses
  • Terraform for GCP: provisioning Memorystore, Cloud Run, and VPC networking together