out of distribution

Joined May 2009
OpenRouter launches Mixture-of-Midwits
Introducing the Fusion API, the smartest compound model on the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
ficus retweeted
i framed a colorful e ink on my wall to display a collage of any birds that have passed by my window today
i mounted a tiny microphone on my apartment balcony to listen for any birds passing by and built a site to collage them as they're heard
Jun 13
I wonder if a child will be conceived tonight because Anthropic had to take Fable down.
Jun 12
🇺🇸
World Cup tourists have discovered New Jersey deli:
Jun 11
I've had a similar thought. A lot of the way we structure code right now is the result of decades of hard-won knowledge about how to enable efficient collaboration between people.
It seems like LLMs could optimize coding style by exploring ways of structuring code so weaker and weaker models can still successfully perform tasks in a codebase. There are surely stylistic quirks that are peculiarly impactful to transformers, but I bet there would be a lot of overlap with human capabilities. Optimizing for understanding should help even the top frontier models, allowing them to understand things “at a glance” without having to explicitly explore. There will remain “better” and “worse” ways to code.
Jun 11
Genuinely heartwarming to know that some people never had to use any software made by RealNetworks.
I am generally a very optimistic, positive person. I do not like to hate on other products. But Workday is, by a wide margin, the worst piece of software I've ever used in my career. I cannot believe that this is a $34 billion company. I pray the AGI gods fix this hot mess.
Jun 10
Fable is really good but unfortunately it is still doing a lot of load-bearing real work.
Jun 8
Your reaction to this kind of thing should be anger, not admiration. You would have had this two years ago if Apple hadn't decided that only they have the right to do it.
Ok, first “woah” (although simple), Apple flexing its platform advantage: ask Siri what’s on screen (from any app). Siri AI pops out of the notch and helps right there, with context on what’s going on on the phone. You know how many times I have taken screenshots to upload to ChatGPT/Claude?
Jun 8
Apple moving like an AI startup (making overreaching claims on their models because of a small amount of finetuning)
Craig Federighi says at a post-event talk with media that Apple isn't using the same Gemini models that Google deploys to its own users.
Jun 8
Can someone at OpenAI please make /skill work in codex CLI instead of $skill? I was going to make fun of this as Codex's own little version of insisting on CLAUDE.md but I tried it in the GUI app and slash appears to work there so I'm wondering if it is just an oversight.
Jun 8
seems fine
ficus retweeted
This is the best and most balanced report I've read by Anthropic, free of much of the super sci-fi, everything-is-exponential language of some other reports I've read by this amazing team. But one line is dead wrong. This one about recursive self-improvement:
"[If] AI systems themselves become capable of full recursive self-improvement, and begin building their successors...In this world, the pace of progress in AI development becomes determined entirely by the availability of compute (or the speed of discovering various efficiencies in algorithmic training or inference) for AI systems."
Compute is absolutely NOT the only limiting factor in recursive self-improvement and not even the most important one. There are two more:
1) Time
2) Multiplicity
Time is how long it takes to get an answer. Multiplicity is when there are no right or wrong answers, only shades of gray with right(ish) answers and wrong(ish) ones.
They even point to one of them (time) just a few paragraphs later:
"More intelligence can’t learn what a drug does over decades of use, can’t hold elections sooner than a constitution dictates, and can’t turn a stranger into an old friend in a weekend. For most people, the felt pace of this future will still be set by the bottlenecks, even if the laboratory upstream runs at the speed of compute. That collision, where recursive intelligence building itself ever faster meets the world of humans, relationships, and governance, is another part of this future we can’t predict."
But let me make it even more clear: AI got good at code and games because they have great feedback loops and tight timelines. If the code works or does not, you know pretty quickly. It is good at driving for the same reasons. Don't die or drive off the road or hit someone are achievable (though difficult) goals with clear, fast feedback.
You cannot answer the question "is this a good article?" or "do I write well?" because that is multiplicity, shades of gray that are hard to judge. Humans judge this by self-awareness and feedback from others. AI might be able to approximate the second but only if it develops more of the first (harder). "Will my wife like this surprise present?" Hard to get good at that even if you're a master. Took me many years of trying and judging her responses. :)
Time is also a massive factor. The question of "did I make money in business?" can't be answered on a short timeline. There is no way to know the answer faster, and short-term success doesn't predict long-term success. "Will this drug cause bad side effects twenty years from now?" That can only be answered in twenty years. No amount of compute changes that. "Will this building fall down faster than this one if I build it a different way?" You can run basic physics and math rules to help you heuristically figure it out, but only time gives you the true answer.
These two constraints, time and multiplicity, are the death knell of any Doomer/Less Wrong fantasies about fast takeoff and instant super genius AI. You can have all the compute in the universe and you still can't compress twenty years of drug side effects into twenty minutes. You can have 500 trillion parameters and you still can't definitively answer "is this beautiful?" because beauty is not an optimization target with a clean gradient.
The recursive self-improvement loop doesn't hit a wall because of compute. It hits a wall because of reality. Reality is slow, messy, ambiguous, and full of questions that only time and lived experience can answer. Compute is the bottleneck that engineers see because it's the one they can measure. Time and multiplicity are the bottlenecks that the real world imposes and no amount of silicon can brute force past them.
That's why even nature only "solved" good/bad by brute force: evolution. Does this agent/human/creature survive and reproduce? That's good. Otherwise not good. Companies follow the same rule. Did this survive and make money over time? Good. Otherwise bad. Imperfect, lossy, dumb, blind, slow.
AI is changing the world already. It will get better and better. But the road to better is long and winding, not a vertical line to godhood. And that should make you more hopeful, not less.
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
Jun 5
I think this is going to end up looking like a step change rather than the beginning of an exponential.
Replying to @AnthropicAI
Today, Anthropic engineers on average ship 8x as much code per quarter as they did in 2021-2025.
Jun 2
Go vote!
May 30
Ball knowers choose D. I live in D and still choose D.
You just won a 2-week, all-expenses-paid vacation. But there’s a catch: you have to stay within one region the whole time. What are you picking?
May 29
auto mode be like “denying destructive action Bash(rm /tmp/throwaway-script-that-has-existed-for-15-seconds.sh)”
May 26
The fundamental rift in wealth tax discourse is about whether it's more important for the rich to bear a commensurate share of society's economic burden (% of total tax revenue) or to feel pain commensurate with that of other taxpayers (% of their own income or wealth).
May 26
Naval is the first person I ever muted on Twitter. Deep thoughts for dumb people.
May 25
nothing ever changes
May 15
I don't like how codex is always using perl. Makes me nervous