Dad / Dreamer / Coder

Joined December 2007
Jake Luciani retweeted
it's 2am and i must go to bed, but... been testing fable 5 for hours and I know exactly how this sounds i'm literally the guy who calls out hype for a living but there have been so many small moments tonight that left me genuinely speechless this is not a normal model release and i see now why anthropic is so cautious the world is not the same after today
Jake Luciani retweeted
We've reset 5-hour and weekly rate limits for all users. Enjoy Fable 5!
Jake Luciani retweeted
50 minutes straight running 3 sessions on Fable 1M High and only 2% of weekly used that's surprisingly not bad, $200 plan
time to become bankrupt
Jake Luciani retweeted
Holy SHittttt Claude Fable 5 just finished Pokémon FireRed with vision alone 🤯 raw screenshots only no map / no nav / no hidden game state older Claude needed a helper harness This timelapse goes hardddddd....
🚨 Here we Go Fable 5 is state of the art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. It can run for days, and the longer the task, the larger its lead over our other models.
Jake Luciani retweeted
Six months ago, no model could crack 20% on Vibe Code Bench. This week, Claude Fable 5 hit 90.4% How did we get here?
This is the Opus 4.5 moment x10.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
Jake Luciani retweeted
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Jake Luciani retweeted
🚨 JUST IN: Anthropic’s Mythos AI reportedly found macOS flaws that could bypass Apple security, per WSJ.
Jake Luciani retweeted
A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.) Two independent evaluations this week—from XBOW and the UK AISI—confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities. The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints – Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap. XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work. Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high critical severity vulnerabilities, sometimes double what they'd normally find in a year. I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual use capabilities. Clearly, we need them supporting defenders as widely as can be done safely – and especially the least resourced ones. Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities. We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. 
We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes. Also, to be clear, compute has never been a limiter in our rollout. Expect a fuller update on our Glasswing work in the coming days. XBOW report: xbow.com/blog/mythos-offensi… UK AISI report: aisi.gov.uk/blog/how-fast-is…
Replying to @AISecurityInst
Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.
This is slick
Two weeks ago at Sequoia, Karpathy complained about having to manually deploy things to the internet. Last week I won @aitinkerers Raleigh with the answer [p2claw] You tell your agent "publish this app" and you get a working, permanent link. No DNS, no cloud, no tunnel. Runs from your machine.
Jake Luciani retweeted
Some of you ran into Opus 4.7 refusing normal code edits with "this might be malware" warnings. That was a bug on our side, not the model being cautious. Older builds applied a stale safety prompt that Opus 4.7 doesn't need. Run claude update or relaunch the app.
Jake Luciani retweeted
A lot of bugs that folks may have hit yesterday when first trying Opus 4.7 are now fixed. Thanks for bearing with us🙏
I'll give Anthropic credit for moving quickly. Opus 4.7 Adaptive Thinking now triggers thinking much more often, including for the tasks it failed at yesterday. That also means it is doing a lot more web search. So far, a large improvement in output quality on non-coding tasks.
Jake Luciani retweeted
Apr 17
Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude. Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.
Jake Luciani retweeted
We fixed a bug where rate limits on Claude subscriptions weren't properly adjusted for long context requests in Opus 4.7. We've reset 5-hour and weekly rate limits. Enjoy Opus 4.7!
Jake Luciani retweeted
Apr 7
Thank you to @AnthropicAI for sending FFmpeg patches
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing
Jake Luciani retweeted
We’re committing up to $100M in Mythos Preview usage credits for our partners and over 40 other organizations that maintain critical software, including open-source projects. Anthropic will report back what we learn.
I’m so glad Anthropic got to Mythos level abilities first. The world needs time to adjust existing software for a model like this. In the wrong hands it would be a disaster.
Jake Luciani retweeted
🚨 NEWS 🚨 The Apache Software Foundation Announces $1.5M Donation from Anthropic buff.ly/7bqBmU0 #AI #opensource #cloudnative #DataInfrastructure #ArtificialIntelligence