Guided scenarios to build real intuition for distributed systems.
Beginner
URL Shortener
Design a URL shortener that handles 10,000 read requests per second with sub-10ms latency.
Order Notification Pipeline
When a customer places an order, send an email AND update inventory — reliably, even if one service is down.
Async Image Processor
Users upload images to S3. A pipeline resizes each image to three sizes without blocking the upload response.
Event-Driven Inventory Sync
A warehouse uploads a CSV to S3 every hour. Three downstream systems (pricing, search, analytics) each need the changes — at their own pace, without blocking each other.
GCP Webhook Pipeline
A SaaS app receives webhooks that must be accepted instantly and processed asynchronously on Google Cloud.
Azure Serverless Order Processor
Accept orders at any rate and process them reliably on Azure, even during traffic spikes.
Edge API with Workers + D1
Serve a low-latency JSON API from the edge, backed by a serverless SQLite database.
Intermediate
Fraud Detection System
Flag suspicious orders in real time without blocking the checkout flow. Traffic peaks at 500 orders/s.
Real-Time Leaderboard
A gaming leaderboard that serves 50,000 reads/s but only ~100 writes/s. Cost must be minimised.
E-Commerce Checkout
Your checkout service hammers RDS with Lambda connections and crashes on Black Friday. Fix it without touching the schema.
Auto-Scaling Web Tier
A traditional web app on EC2 fails under 10× traffic spikes. Redesign it to scale automatically without over-provisioning.
GCP Pub/Sub Dead-Letter Handling
A poison message is being redelivered forever and stalling your Pub/Sub subscriber. Make the pipeline resilient.
Azure Service Bus Dead-Lettering
Repeatedly failing messages are clogging your Service Bus queue. Quarantine them and stop double-processing.
Workers + R2 + Queues Pipeline
Accept uploads at the edge, queue background processing, and store results durably with zero egress fees.
Real-Time Edge Coordination (KV + DO)
Run a live, strongly-consistent counter (or chat room) at the edge while serving config from a fast, eventually-consistent cache.
Advanced
Multi-Region Disaster Recovery
A region-wide AWS outage would take your API completely offline. Design for RTO < 1 minute and RPO < 5 seconds.
Cost-Optimised Data Pipeline
Your nightly pipeline processes 50 GB of events and costs $800/month. Cut it to under $200 using the right AWS options — without changing the output.
GCP Containerised API Platform
Migrate a monolith to a containerised API on GCP that autoscales and keeps read-path latency under 100 ms.
Azure Media Processing Pipeline
Media uploaded to Blob Storage must be transcoded asynchronously and indexed — and survive a transcoder crash.