Joined August 2009
957 Photos and videos
Pinned Tweet
Ever wondered why presenting more facts can sometimes *worsen* disagreements, even among rational people? 🤔 It turns out, Bayesian reasoning has some surprising answers - no cognitive biases needed! Let's explore this fascinating paradox quickly ☺️
20
70
371
103,537
Future public models could be nerfed for cyber tasks etc. in such a way that you'd have to spend ridiculous amounts of test-time compute to overcome that. Smart models will always be able to reason their way to security vulnerabilities, but it depends on what knowledge and experience base they start from. Obv this would also nerf the models for defensive cyber tasks. A licensing regime seems most likely, where certain providers get access to better unlocked models but are vetted accordingly and monitor all use cases. I suppose once open-source AI catches up it will also be regulated accordingly. The counterforces to this are continual learning and context adaptability: a sufficiently smart and fast model that can adapt well to additional in-context information might well quickly unlock these capabilities given the right skills (descriptions) and access to additional resources. This could be weakened by limiting context sizes and the plasticity of models, but open-source recipes would be hard to contain (I imagine regulation still makes sense regardless, to increase access friction).
4
1
9
1,713
This is a huge risk btw. Once one doesn't need most employees anymore, there is no one left to stand up against bad policies or to blow the whistle. It won't be a question of the governance at AGI companies and whether they have good preparedness frameworks, but of what the new NSA(?) leadership cares about (and pretty sure it will be more myopic than the flourishing of humanity).
Tyler Cowen on the Fable/Mythos event. The issue with point 5 is that we are probably less than a year away from powerful RSI. Once automated researchers reach parity, the USG *can* nationalize the labs and run them effectively, without any of the people currently working there.
5
5
46
6,289
The UK & Europe need a frontier AI lab and we need a lot more compute in Europe as a whole
26
17
205
15,529
Andreas Kirsch 🇺🇦 retweeted
It is unreasonable to expect access to be gated by a "good vs evil" classifier. We should instead expect frontier AI access to become subject to strict regulation, like some chemicals and biological materials. Companies will have to get a license to use it for specific applications and report their usage. This obviously also means export controls. None of this should be surprising as it applies to any powerful technology or ingredients for developing it.
2
2
11
1,299
Andreas Kirsch 🇺🇦 retweeted
All of the worst impulses of the Trump presidency on full display. No plan or strategy, everything reactive, arbitrary, & maximally invasive. Anthropic is just repeatedly being singled out for ratfucking because they have insufficiently bent the knee.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
24
137
1,407
47,942
Huh I guess this is a badge of honor. A few days ago he was still following me 😅
23
1,437
This is an amazing read. Like AI 2027, it's a great way to spend some time thinking about the future and feels very plausible. That said, it's already overtaken by recent events thanks to the USG pulling Mythos and Fable. This should help Europe wake up faster hopefully.
"What will happen to Europe if it keeps ignoring AI?" Three American labs each (!!) operate more AI compute than all of Europe combined. Today we're launching Europe 2031: a story of what might happen if that doesn't change.
6
4
61
5,642
Took most of yesterday off to visit Naarden in the Netherlands... What did I miss?
3
1
526
Andreas Kirsch 🇺🇦 retweeted
FrontierMath: Tiers 1–4 (v2) is live. We concluded an audit that addressed errors in 42% of problems. Rankings are similar but scores are higher across the board. The current leaders are GPT-5.5 (xhigh) with 85% on Tiers 1–3 and Google's AI co-mathematician with 76% on Tier 4.
27
66
575
114,003
Andreas Kirsch 🇺🇦 retweeted
Trevor was making a joke in reference to anthro, but joke's on him, ... Intel's compiler *did* intentionally generate worse code for AMD, called the "cripple AMD feature".
Remember when compilers would detect that someone was using it to build another compiler and silently inject bugs?
15
45
1,077
71,678
Andreas Kirsch 🇺🇦 retweeted
I think narratives like the "permanent underclass" mindset can be very harmful. Not because they cause emotional depression, but because they change the game-theoretic dynamics. People cooperate in prisoner's dilemma/commons scenarios when they believe the game has many turns. But if you believe the game only has a few turns, and that you should win otherwise you become the "permanent underclass", then the rational self-interested move is to defect: do whatever you can to win in the short term. I think the whole AI research community is in that scenario now. No one stays in academia to educate new talent. Frontier lab competition becomes more and more aggressive and toxic. I can't imagine how much public benefit those doomer narratives alone will cost us. Especially if they're wrong, which I think they are.
19
32
317
30,828
🫡🫡🫡
Scoop: A Google director resigned over the company's AI deal with the Pentagon for classified work. "I am quite sad that it had to come to this, and desperately hope Google management re-discovers its moral compass," he wrote in a letter circulated internally
1
22
6,449
Andreas Kirsch 🇺🇦 retweeted
This is wild, and likely a sign of things to come as we transition to a web that is optimized for bots more than humans. theatlantic.com/technology/2…
63
612
3,161
255,970
Clearly well aligned with humankind 🙌
I don't even know what to say.
1
6
1,237
Andreas Kirsch 🇺🇦 retweeted
I'm seeing a lot of hate for Anthropic's decision to secretly nerf ai RnD capabilities. But I haven't seen critics engage with the imo strongest defence of Anthropic:
1. By far the biggest risks are from superintelligent AI
2. To manage these risks the leading company will need to pause partway through the intelligence explosion. (Pausing at this time allows them to a) generate the compelling empirical evidence of misalignment that will be needed to justify a longer global pause, AND b) use powerful ai to massively accelerate alignment progress. A pause today couldn't accomplish either.)
3. A pause is MUCH more likely if the leading company has a big lead. It's much less likely if multiple companies are neck and neck. (More specifically, Anthropic had good reason to think OAI wouldn't pause. This makes it v hard for Anthropic to pause if they're neck and neck. Hopefully recent announcements build mutual trust that everyone will pause)
4. If lagging AI companies can use the leader's AI for ai RnD during an intelligence explosion, the leader *cannot* maintain their lead. (This point is underappreciated. If you model out the intelligence explosion, you'll find that a laggard with equal access to the leading AI quickly catches up to the leader bc the leader faces big headwinds from having plucked low hanging fruit.)
5. So: sharing ai RnD access with competitors massively decreases the chance of a pause at the critical time, and massively increases the risk from superintelligent AI
6. Anthropic can't block competitors using Mythos without the silent sabotage. For the obvious reason: it's very hard for a frozen safeguard to block someone that can iterate against it. It sucks that this is the only way, but it is.
7. They've long had terms of service against competitors using Claude for AI RnD. They have a right to enforce their terms of service. This is the only way.
---
Overall, silent sabotage is a very spooky and scary precedent to be setting and imo the wrong call. But still, the above is a strong argument for Anthropic's actions and I haven't seen it rebutted.
45
20
228
37,437
I'm surprised at all the outrage directed at Anthropic now, and all the crazy accusations, when there was a very muted response to more questionable behavior by various "leading AI companies" a month ago. Where was the outrage a month ago when SpaceX, OpenAI, Google, Nvidia, Microsoft, Amazon Web Services, Reflection AI and Oracle signed contracts with the Pentagon without enforceable constraints against mass surveillance, autonomous weapons, etc. (evidenced in the case of OpenAI and Google at least, and in line with Hegseth's requirements) while their PR spin claimed the opposite? For anyone who was quiet then and is whining now, please feel called out as entitled hypocrites 🤷‍♂️
28
3
71
9,252
Anthropic and AI researchers who were excited about using Fable for their LLM research (j/k)

[GIF: Here Ya Go!]

3
1
15
1,059
From my timeline, it looks like folks at PrimeIntellect really liked to use Claude to do their work for them and are extra salty now that they can't keep surfing on Anthropic's acceleration? Chill, Opus 4.8 is still around to do the work for you 🌝
12
2
34
5,932
Next up: AI researchers upset that Fable is not nerfed on their research at all
6
61
3,373
I'm confused why so many are upset and surprised that they can't use Claude anymore to catch up with Anthropic, or at least that they won't get the same benefits (it is not clear how much worse the model is vs Mythos for frontier RE tasks). The safety case is as strong as the competition case. Open-source AI is inherently unsafe, and one def doesn't want to enable an RSI loop on top of open-source models any time soon. Also, people in AI should be happy that it means their jobs are safe for a bit longer. Lastly, this is consistent with everything they have done, all the way back to GPT-2's release.
57
9
138
53,632