AMDAHL'S ARGUMENT FOR AI
The productivity speedup AI apps can provide is limited by how much human-in-the-loop work they require. Humans produce ~1-3 tokens per second. They can't really be sped up, unless you're @neuralink.
So if your application requires a human completion for every LLM completion (e.g. ChatGPT or AI copilots), then your maximum speedup is ~2x, even when LLMs become 10x faster: the human half of every round trip stays exactly as slow as before.
@cognition_labs' Devin is better, because it needs a human completion only every ~10 iterations. At current speeds, this works out to about 2.9x better than raw ChatGPT, which is nice, but not mind-blowing. But because it's frugal with human tokens, it can get to a ~10x productivity speedup just by waiting for models to get faster!
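The arithmetic behind the ~2x, 2.9x and ~10x figures can be sketched roughly like this. It is a reconstruction, not the exact model behind the chart: it assumes every completion (human or LLM) has the same length, that humans write at 1 token per second, and that "current GPT-4 speed" is about 9x human speed. That last number is not stated anywhere in the thread; it is back-solved from the 1.8x and 2.9x figures. The function and constant names are made up for illustration.

```python
def speedup(iterations_per_human_turn: int,
            llm_tokens_per_s: float,
            human_tokens_per_s: float = 1.0) -> float:
    """Speedup versus a human writing every completion alone.

    Assumption: all completions are the same length, so the unit of
    work is one completion and speeds are in completions per second.
    """
    n = iterations_per_human_turn
    work = 1 + n  # one human completion plus n LLM completions
    time_alone = work / human_tokens_per_s
    time_with_ai = 1 / human_tokens_per_s + n / llm_tokens_per_s
    return time_alone / time_with_ai


# Assumption: inferred from the 1.8x / 2.9x figures, not given in the thread.
GPT4_SPEED = 9.0

chat_now = speedup(1, GPT4_SPEED)          # human turn every iteration
devin_now = speedup(10, GPT4_SPEED)        # human turn every ~10 iterations
chat_fast = speedup(1, 10 * GPT4_SPEED)    # same apps, models 10x faster
devin_fast = speedup(10, 10 * GPT4_SPEED)

print(f"ChatGPT-style today:   {chat_now:.2f}x")
print(f"Devin-style today:     {devin_now:.2f}x ({devin_now / chat_now:.1f}x better)")
print(f"ChatGPT-style, 10x LLM: {chat_fast:.2f}x")
print(f"Devin-style, 10x LLM:   {devin_fast:.2f}x")
```

Under these assumptions the formula collapses to (1 + n) / (1 + n / s), which is Amdahl's law with the human turn as the serial part: no matter how large s gets, the speedup cannot exceed n + 1.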
The fun stuff starts when AI agents get to the 100-1000x range, i.e. only require human input every 100-1000 iterations. It's a long way there, but I'm excited every time I see something that will get us closer, like code execution from @e2b_dev, browsing from @browserbase and a context engine from @sid_ai.
Many copilots & current ChatGPTs will seem silly in hindsight, like doing a 1-on-1 with your intern every 15 minutes when you could be managing a team that makes a month's worth of progress between meetings.
Today, developers are frugal with LLM tokens (I know: they're expensive), so we've built tools to use them wisely: @PareaAI, @humanloop, @langfuse, @langchain. But the most important thing to be frugal with is human tokens (both input and output). They will define the overall productivity speedup your application can provide. Humans are insanely slow.
AI agents don't yet work well – but it won't be a competition once they do. If you can think of one that does or you're working on one, please post it below!
Naturally, there are many caveats here. Iterations are gameable, and reducing human tokens has been an important trend outside of agents, too: Google let you find information with fewer keystrokes and less reading than anyone else, and the same holds for @perplexity_ai today. Button presses can count as tokens too (depending on the action they trigger), etc.
Some chart explanations:
0. I pin human completions at 1 token per second in all calculations. That is realistic for high-quality human tokens, although some people are faster or slower.
1. @BCG put this number at 1.4x. I'm fine disagreeing. The 1.8x is at current GPT-4 speeds.
2. Devin doesn't fully realize its potential yet.
3. Let's free that y axis! "Future agent" only needs a human completion every 100 iterations.
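For anyone who wants to play with the chart's inputs, here is a rough sketch that tabulates the three kinds of app from the notes at a few model speeds. It is not the original chart code and may not match the chart exactly. Assumptions: humans at 1 token per second (note 0), equal-length completions, and a current GPT-4 speed of about 9x human, which is inferred from the 1.8x in note 1 rather than stated. All names are hypothetical.

```python
def speedup(n, llm_speed, human_speed=1.0):
    """(1 + n) equal-length completions, only one of them written by a human."""
    return ((1 + n) / human_speed) / (1 / human_speed + n / llm_speed)


GPT4_SPEED = 9.0  # assumption: back-solved from the 1.8x figure

# Iterations per human completion, as described in the thread.
apps = {"ChatGPT / copilot": 1, "Devin": 10, "Future agent": 100}
multipliers = [1, 10, 100]  # model speed relative to the assumed GPT-4 speed

rows = {}
for name, n in apps.items():
    at_speeds = [speedup(n, GPT4_SPEED * m) for m in multipliers]
    ceiling = n + 1  # limit as LLM speed goes to infinity
    rows[name] = (at_speeds, ceiling)
    cells = "  ".join(f"{s:6.1f}x" for s in at_speeds)
    print(f"{name:<18}{cells}  ceiling {ceiling}x")
```

One thing this makes visible, at least under these assumptions: at today's speed the future agent lands at only about 8x, so it takes both fewer human completions and faster models to free that y axis.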