Interviewer: design Stripe’s payments ledger.
I could answer, but only after pinning scope. Is this a general ledger or an internal payments ledger? Do we need double-entry? Multi-currency? Pending vs posted? Chargebacks, disputes, refunds? Idempotency? Audit requirements like immutable history and replay?
What we agreed on: double-entry, append-only, supports auth/capture/refund/chargeback, multi-tenant (per merchant), multi-currency, strong auditability.
APIs I sketched:
POST /transfers (or /ledger/entries) with idempotency_key, merchant_id, amount, currency, event_type, external_ref
GET /balances?merchant_id=…
GET /entries?account_id=…&cursor=…
POST /reconcile/bank_statement (optional)
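To make the write request concrete, here is a rough sketch of the POST /transfers body and the checks I would run before touching the DB. The field names come from the list above; the TransferRequest and validate names, the integer minor-unit amount, and the event type set are my assumptions, not anything the interviewer specified.

```python
from dataclasses import dataclass

# Assumed event types, taken from the agreed scope (auth/capture/refund/chargeback).
EVENT_TYPES = {"auth", "capture", "refund", "chargeback"}


@dataclass(frozen=True)
class TransferRequest:
    idempotency_key: str
    merchant_id: str
    amount: int        # assumption: integer minor units (cents), never floats
    currency: str      # assumption: 3-letter uppercase code such as "USD"
    event_type: str
    external_ref: str


def validate(req: TransferRequest) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request is acceptable."""
    errors = []
    if not req.idempotency_key:
        errors.append("idempotency_key is required")
    if req.amount <= 0:
        errors.append("amount must be a positive integer in minor units")
    if len(req.currency) != 3 or not req.currency.isalpha() or not req.currency.isupper():
        errors.append("currency must be a 3-letter uppercase code")
    if req.event_type not in EVENT_TYPES:
        errors.append("unknown event_type")
    return errors
```

Rejecting bad input here keeps malformed requests from ever consuming an idempotency key.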
Data model:
accounts(id, merchant_id, type, currency)
entries(id, tx_id, account_id, direction, amount, currency, created_at, metadata, event_id)
transactions(tx_id, merchant_id, state, idempotency_key, external_ref)
Every business event writes a balanced set of entries that sums to zero per currency. No updates, only new reversing entries.
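The balance invariant and the "reverse, never update" rule are easy to state in code. This is a minimal sketch, assuming positive integer amounts in minor units and a debit-positive sign convention; Entry, is_balanced and reversal are illustrative names, not a real API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account_id: str
    direction: str   # "debit" or "credit"
    amount: int      # assumption: positive integer in minor units
    currency: str


def is_balanced(entries: list[Entry]) -> bool:
    """True if the set is non-empty and debits equal credits in every currency."""
    totals = defaultdict(int)
    for e in entries:
        if e.amount <= 0 or e.direction not in ("debit", "credit"):
            return False
        # Assumed convention: debits count positive, credits negative.
        totals[e.currency] += e.amount if e.direction == "debit" else -e.amount
    return bool(entries) and all(total == 0 for total in totals.values())


def reversal(entries: list[Entry]) -> list[Entry]:
    """Build the entries that undo a transaction by flipping each direction."""
    flip = {"debit": "credit", "credit": "debit"}
    return [Entry(e.account_id, flip[e.direction], e.amount, e.currency) for e in entries]
```

A reversal of a balanced set is itself balanced, so a refund or chargeback can be posted through the same write path as the original event.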
Architecture:
Write path is synchronous to a primary DB for correctness. I assumed Postgres with serializable isolation, or at least constraints enforced inside each transaction, plus unique(idempotency_key, merchant_id) to stop duplicates. Async consumers build read models: balance tables, per-merchant statement views, exports. An event log (Kafka/PubSub) feeds downstream systems, but the DB is the source of truth.
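Here is a sketch of that write path: one DB transaction inserts the transaction row and all of its entries, and the unique constraint turns a retried request into a no-op that returns the original tx_id. I assumed Postgres in the interview; this uses Python's built-in sqlite3 only so the example runs anywhere. The schema is trimmed, and post_transaction is a hypothetical helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE transactions (
    tx_id INTEGER PRIMARY KEY,
    merchant_id TEXT NOT NULL,
    idempotency_key TEXT NOT NULL,
    external_ref TEXT,
    state TEXT NOT NULL,
    UNIQUE (idempotency_key, merchant_id)
);
CREATE TABLE entries (
    id INTEGER PRIMARY KEY,
    tx_id INTEGER NOT NULL REFERENCES transactions(tx_id),
    account_id TEXT NOT NULL,
    direction TEXT NOT NULL CHECK (direction IN ('debit', 'credit')),
    amount INTEGER NOT NULL CHECK (amount > 0),
    currency TEXT NOT NULL
);
"""


def post_transaction(conn, merchant_id, idempotency_key, external_ref, entries):
    """entries: list of (account_id, direction, amount, currency).

    Returns (tx_id, created). created is False when the key was already used.
    """
    totals = {}
    for _, direction, amount, currency in entries:
        signed = amount if direction == "debit" else -amount
        totals[currency] = totals.get(currency, 0) + signed
    if not entries or any(totals.values()):
        raise ValueError("entries must balance to zero per currency")
    try:
        with conn:  # commits on success, rolls back everything on an exception
            cur = conn.execute(
                "INSERT INTO transactions (merchant_id, idempotency_key, external_ref, state) "
                "VALUES (?, ?, ?, 'posted')",
                (merchant_id, idempotency_key, external_ref),
            )
            tx_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO entries (tx_id, account_id, direction, amount, currency) "
                "VALUES (?, ?, ?, ?, ?)",
                [(tx_id, *entry) for entry in entries],
            )
        return tx_id, True
    except sqlite3.IntegrityError:
        # Either a duplicate key (return the original) or a CHECK failure (re-raise).
        row = conn.execute(
            "SELECT tx_id FROM transactions WHERE idempotency_key = ? AND merchant_id = ?",
            (idempotency_key, merchant_id),
        ).fetchone()
        if row is None:
            raise
        return row[0], False
```

The balance check runs in application code here; in a real Postgres setup I would also want it enforced closer to the data, for example with a deferred constraint trigger, but that is beyond this sketch.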
Scaling:
Partition by merchant_id and keep hot merchants isolated. The entries table is append-only, so it also partitions well by time. Balance reads hit a materialized table, updated either in the same transaction as the entries or by a consumer with exactly-once-ish semantics plus idempotent updates. Keep pagination cursor-based, indexed by (account_id, created_at, id).
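Cursor pagination over (account_id, created_at, id) could look roughly like this. It is a sketch using sqlite3 for runnability, with a trimmed entries table; fetch_page and the tuple cursor are my own illustrative choices, and a real API would encode the cursor as an opaque string.

```python
import sqlite3

# Index matching the cursor order, so each page is a short range scan.
INDEX_DDL = "CREATE INDEX entries_cursor ON entries (account_id, created_at, id)"


def fetch_page(conn, account_id, cursor=None, limit=2):
    """Return (rows, next_cursor).

    cursor is the (created_at, id) of the last row already seen, or None for
    the first page. Assumes created_at is stored as sortable ISO-8601 text.
    """
    created_at, last_id = cursor if cursor else ("", 0)
    rows = conn.execute(
        """
        SELECT created_at, id, amount FROM entries
        WHERE account_id = ?
          AND (created_at > ? OR (created_at = ? AND id > ?))
        ORDER BY created_at, id
        LIMIT ?
        """,
        (account_id, created_at, created_at, last_id, limit),
    ).fetchall()
    # A full page may have more rows behind it; a short page is the last one.
    next_cursor = (rows[-1][0], rows[-1][1]) if len(rows) == limit else None
    return rows, next_cursor
```

The id tiebreaker is what keeps pages stable when several entries share a timestamp; with created_at alone, rows at a page boundary could be skipped or repeated.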
Tradeoffs:
If you compute balance by summing entries at read time, it’s simple, but every read scans the account's full history, so it dies at scale. If you maintain balances, you need careful reconciliation jobs and invariants. Serializable transactions help correctness but reduce throughput; per-merchant locks are often enough.
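The reconciliation job mentioned above boils down to recomputing balances from entries and diffing them against the maintained table. A minimal in-memory sketch, assuming a debit-positive sign convention; balances_from_entries and find_drift are hypothetical names.

```python
from collections import defaultdict


def balances_from_entries(entries):
    """entries: iterable of (account_id, direction, amount). Slow but authoritative."""
    totals = defaultdict(int)
    for account_id, direction, amount in entries:
        # Assumed convention: debits increase the balance, credits decrease it.
        totals[account_id] += amount if direction == "debit" else -amount
    return dict(totals)


def find_drift(entries, materialized):
    """Return {account_id: (expected, stored)} for every account that disagrees."""
    expected = balances_from_entries(entries)
    drift = {}
    for account_id in sorted(expected.keys() | materialized.keys()):
        want = expected.get(account_id, 0)
        have = materialized.get(account_id, 0)
        if want != have:
            drift[account_id] = (want, have)
    return drift
```

An empty result means the fast path and the source of truth agree; anything else should page someone, since the entries are right by definition and the balance table is not.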
Failure cases I called out:
Payment processors deliver at-least-once, so idempotency is not optional. Partial writes violate double-entry, so constraints and transaction boundaries matter. Clock skew breaks ordering assumptions, so order by a DB sequence, not by timestamps. Backfills need deterministic replays. Corrupt consumer state means you rebuild read models from entries, which is why append-only is the hill to die on.
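Rebuilding a read model from entries, with redelivery handled by sequence number, might look like this. It is a sketch under two assumptions of mine: each entry carries a unique, monotonically increasing DB sequence, and a consumer sees its entries in sequence order. BalanceProjection and rebuild are illustrative names.

```python
class BalanceProjection:
    """A read model that can be thrown away and rebuilt from the entries log."""

    def __init__(self):
        self.balances = {}
        self.last_seq = 0  # highest DB sequence applied so far

    def apply(self, seq, account_id, signed_amount):
        """Apply one entry; return False if it was a redelivery and was skipped."""
        if seq <= self.last_seq:
            return False  # at-least-once delivery: this entry was already applied
        self.balances[account_id] = self.balances.get(account_id, 0) + signed_amount
        self.last_seq = seq
        return True


def rebuild(entries):
    """Deterministic replay: order by DB sequence, never by timestamp."""
    projection = BalanceProjection()
    for seq, account_id, signed_amount in sorted(entries):
        projection.apply(seq, account_id, signed_amount)
    return projection
```

Because the replay sorts by sequence, the same entries always produce the same balances regardless of the order they were read in, which is the property a backfill needs.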