We're coming to the @aiDotEngineer World Fair 👀
Something massive is dropping, and AIE is where it goes live first. Come see it for yourself at booth LG3.
Football fans: we've got you too. ⚽
We'll get into the real stuff:
→ Why metering and decisioning are different problems
→ How the decision waterfall works under the hood
→ The trade-offs behind provably correct billing at scale
→ What AI companies building usage-based products should know now
This isn't a polished talk. It's a live AMA.
Start thinking about your questions. This will be an honest conversation, not a presentation.
Register here: luma.com/gaj3eim4
You're mid-flow in Codex. It's working. Then: rate limit hit. Hard stop. Come back later.
That experience is what pushed OpenAI to rethink access control entirely. We got the engineer who built it to come talk to our community. 🧵
luma.com/gaj3eim4
Jonah Cohen, Tech Lead for Financial Engineering at OpenAI, designed and built this system. He published a rare honest write-up about it in February.
And now he's joining HTTP 402 for a live AMA.
And they built it entirely in-house.
Not because they couldn't afford a metering platform. Because no third-party tool could give them real-time correctness AND full auditability at the same time.
When both matter, you own it.
They built a real-time access engine that makes a single per-request decision:
How much usage is allowed? From where?
Rate limits, credits, entitlements, evaluated together, synchronously, in milliseconds. They called it the decision waterfall.
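A rough sketch of what a single per-request decision like this could look like. This is not OpenAI's implementation: the evaluation order (entitlement, then rate limit, then credits), the all-or-nothing grant, and every name here (`Decision`, `decide`, the parameters) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: int  # units of usage granted for this request
    source: str   # which layer covered the usage, or "denied"


def decide(requested: int, rate_limit_remaining: int,
           credits: int, entitled: bool) -> Decision:
    """Hypothetical waterfall: answer 'how much?' and 'from where?' in one pass."""
    # Assumed step 1: no entitlement means no access at all.
    if not entitled:
        return Decision(0, "denied")
    # Assumed step 2: included usage under the rate limit is tried first.
    if rate_limit_remaining >= requested:
        return Decision(requested, "rate_limit")
    # Assumed step 3: fall through to purchased credits instead of a hard stop.
    if credits >= requested:
        return Decision(requested, "credits")
    # Nothing covers the request.
    return Decision(0, "denied")
```

The point of the sketch is that all three checks run in one synchronous call and the answer carries both the amount and the source, so a request that exceeds the rate limit can fall through to credits rather than stopping.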
The problem wasn't just UX.
Raise rate limits → lose fairness and capacity control.
Switch to usage billing → introduce lag, overages, reconciliation debt.
Neither worked for products people were actually using and loving. So they did something harder.
At Stripe Sessions, Miro's session drew more people than the room could fit. Engineers on the floor, others crowded at the entrance.
Here's what they shared 🧵
So they brought in Stigg as a single entitlement layer, and shipped in six weeks.
✅ Hybrid seat + usage + AI model live in production
✅ 5,000 engineering hours saved
✅ Pricing changes: months → days
✅ 4 major monetization changes shipped in a single 4-month window
All monetization logic - entitlements, credits, metering, enforcement, plan management - moved into one place.
Instead of being scattered across the codebase, it became a single layer that every system could rely on.
Clean. Auditable. Easy to change.
stigg.io/blog-posts/how-miro…
Outreach just moved from seats to credits. One of the first established software companies to actually do it.
@robbylit has been tracking this shift across hundreds of companies. Here's what the data is showing right now 🧵
"People don't talk about your pricing page anymore. They talk about pricing.md." - @Dor
If an AI agent evaluates your product and your pricing isn't machine-readable, it routes to a competitor whose is.
Visibility isn't a nice-to-have. It's infrastructure.
The real bottleneck when shipping AI monetization isn't the strategy. It's that the systems can't keep up with it.
Rob sees a new role emerging to bridge that gap. Like GTM engineering, but for AI usage and control.
Full recap: stigg.io/blog-posts/the-grea…
Same question kept coming up everywhere: we're shipping agents fast. But who consumed what, at what cost, attributed to which team or user, nobody has a clean answer. And nobody wants to add 200ms to every inference call to get it.
That's exactly what we build for. Real-time usage control across every entity in your stack. Zero latency. Your cloud. Audit trail included.
Good few days.