am.will

Pinned Tweet

am.will

@LLMJunky

May 29

Introducing Lynk, a brand new way to interact with your favorite harnesses on the go. Compatible with OpenClaw, Hermes, Codex, and local edge models. Fully featured client allowing you to easily and quickly switch between your favorite agents.
Lynk has been my absolute favorite way to kick off tasks on the go, completely replacing Telegram. Only available as a beta on Android at present, but iOS version is in development. Works via local network or Tailscale.
Features:
- Create and continue threads with your favorite harness
- OpenClaw, Hermes, Codex, and local models
- Android Phone control
- Realtime voice agent
- Speech-to-text transcription
- Codex Pets live notifications
- Draw over screen quick access chat overlay
- Open Source software
Join the beta in the comments.

2:15

128

47,255

am.will retweeted

clem 🤗

@ClementDelangue

There is no inevitability in AI. We all have agency in what comes next:
Path 1: closed-source APIs, concentration of power, and a future decided by a handful of people in Silicon Valley and DC
Path 2: open-source AI, where everyone gets to participate, own, and build together, including orgs like the city of Rio.
Pick your path anon!

SemiAnalysis

@SemiAnalysis_

SITUATION DETECTED: The city of Rio de Janeiro has post-trained a model. Based on Qwen 7/2, Rio 3.5 Open 397B adds SwiReasoning on top of the base Qwen model, a framework that dynamically switches between standard chain-of-thought and latent-space reasoning, guided by entropy-based confidence signals, so the model only "thinks out loud" when it needs to and otherwise reasons silently in hidden space for better token efficiency.

382

27,121

am.will

@LLMJunky

Congratulations Knicks fans. You earned this. Wembanyama and Castle walking off without shaking anyone's hand is extremely weak. Lost a lot of respect from me. Jalen Brunson was incredible.

1,180

am.will

@LLMJunky

Great idea. Is @karpathy single?

Alex Kehr

@alexkehr

Jun 13

I’m willing to marry any Anthropic employee who needs a US citizenship

1,906

am.will retweeted

DanT

@uyintans

Replying to @GeorgeMayer

Preventing models from finding and fixing bugs would break agentic coding

681

am.will

@LLMJunky

No, Dario cannot "just fix it"

clem 🤗

@ClementDelangue

10h

Lots of people have known for a while that guardrails for frontier model APIs are very easily jailbroken, quite shallow and impossible to fix. They’re mostly a smokescreen and distraction, in my opinion. We need a different paradigm for AI safety!

1,316

am.will retweeted

Everlier

@Everlier

Replying to @LLMJunky @DavidSacks

As they say in agentic security, there's no firewall for English.

901

am.will

@LLMJunky

11h

Oh, so the only problem is that Anthropic didn't "fix the jailbreak issue." Gee, didn't realize it was that simple. Can't believe no one has ever thought about this problem before. Shame on Anthropic for not fixing a potentially unsolvable problem inherent to all LLMs.

David Sacks

@DavidSacks

11h

I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
- As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
- Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability, big or small, it is Anthropic’s responsibility to patch.)
- A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
- In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
- In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
- In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (i.e., fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
- The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
- Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.

4,602

am.will

@LLMJunky

13h

I'd really like to see more 3D modeling benchmarks. Maybe I should just make one myself. IMO these tests really demonstrate a clear difference between SOTA models and their open source counterparts. This isn't even close to Fable 5 or GPT 5.5.

关木

@ZeroZ_JQ

15h

GLM-5.2 really does have something going for it

0:22

10,863

am.will retweeted

Alexander Smyslowski

@smyslowski

21h

Replying to @LLMJunky

1,258

am.will

@LLMJunky

23h

Tonight is a sobering reminder that, literally anytime, these labs can turn the faucet off just like that. No warning. Just the flip of a switch. Intelligence on their terms, if at all. They will continue to withhold the best models, and tell us what we're allowed to have access to. This is why open source must win.

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

364

13,001

am.will

@LLMJunky

Jun 13

We got GTA6 before we got GTA6

Adam Sabla

@AdSabla

Jun 12

I made this GTA6 clone in 1 day with Claude Fable 5. Crazy (@GPTA6_Slop_City ). You can play it now at GPTA6.com
You can run around, drive cars, shoot guns, fly planes and helicopters, run from cops and army. Play with friends, there’s even a real-time multiplayer mode with frag count (press TAB in game)!
- Coded in @claudeai with Fable 5.
- Music made in @Suno (When you sit into a car, it has a radio. Hosts talk about stuff and play 8 different tracks. All are AI related, soundtrack on Spotify)
- 3d models used from @SyntyStudios (all characters, maps, objects, animations, rigs)
- Voiceovers made with @ElevenLabs (People talking in the radio, characters, etc).
- In-game ads by SpecificResume.com made in @ChatGPTapp
- used SFX pack by imphenzia.com/universal-soun…
GPTA 6: Slop City... Game Prompted Today, ASAP!!!

2:13

2,861

am.will

@LLMJunky

Jun 13

Okay, this is wild. Unless granted some kind of exclusion, Andrej Karpathy would not be allowed to work on, or with, Mythos due to export restrictions implemented by President Trump.

Andrew Curran

@AndrewCurran_

Jun 13

According to Grok, Andrej Karpathy is an EB-1 extraordinary ability green card recipient, not a US citizen. Thus under these new restrictions he is not permitted to use, or work on, Mythos 5 or Fable 5 as of 5:21pm tonight.

14,564

am.will

@LLMJunky

Jun 13

It's June 22nd already!?

1,004

am.will retweeted

Derya Unutmaz, MD

@DeryaTR_

Jun 12

Some people are defending Anthropic despite everything that is happening, largely because Fable 5 is an amazing coding model. I can confirm this is true, and I do not think anyone would seriously argue against it. Even its high cost should not be an issue. We live under capitalism, and companies can charge what the market allows.
But, and this is a very big but, there is a much more important issue here. What these people do not understand is that Fable will be the best coding model only for a short while. Others are very close behind. In fact, even better models will arrive soon, and coding itself is clearly on the path to being largely solved. Models will then become cheaper over time.
If you do not get to build your amazing software a few months earlier, that probably will not change much in the world. At most, it may delay your personal ability to benefit from it for a while. It might even benefit you, by saving you from wasting money and time before better and cheaper versions arrive.
But a doctor cannot wait to treat patients. A scientist trying to cure cancer does not have the luxury of waiting months. Every day of delay in research and clinical applications costs lives, potentially thousands of them. Every day that scientists around the world are denied access to the best models is another day the world is delayed from becoming better.
And this is not only about model access. Anthropic has also advocated for pauses and regulatory capture. They are strongly against open models. In my view, this is not driven by some pure concern for humanity, but by the fact that such control would give them more money, more power, and more leverage over the future.
Therefore, I believe it is far more important to be principled and stand for humanity than to chase short-term personal benefit. That is the reason for my outrage against Anthropic. It is nothing personal against poor Fable 5, or against the great AI engineers at Anthropic who are building these models. I do not doubt that many of them are sincere, and I am grateful to all frontier AI engineers who are pushing this technology forward.
But I do hold those in charge of Anthropic responsible. Their founders and leadership should be held accountable for what I see as self-serving and deeply misanthropic actions. I also do not think they care. Not one of them has meaningfully responded to the outrage.
This is also a note to everyone who keeps claiming that AI itself is the threat to human existence. No. It is not AI itself. It is the humans who control AI who may become the real threat to humanity, as I have said repeatedly. We have to resist this power capture at all costs, if we truly care about the rest of humanity.

584

34,230

am.will

@LLMJunky

Jun 12

There's one extremely important chart that they didn't mention here, and it's wildly important... Cost to run the benchmark.
GPT 5.5 xhigh: $3,357
Fable 5: $9,940
Nearly 3x more expensive for a single point on the intelligence index.
Fable is a good model, but this only makes me more bullish on what GPT-5.6 will offer:
- Lower pricing
- Higher availability
- Significantly fewer refusals
Anthropic should be proud, credit to them where it's due. It is a very impressive model. But after the 22nd, I can't imagine I'll use it. They gotta find a way to get these costs down, or keep it on the plan.

Artificial Analysis

@ArtificialAnlys

Jun 12

We've updated the Artificial Analysis Coding Agent Index, replacing SWE-Bench Pro with Datacurve's DeepSWE benchmark - the swap lifts Codex with GPT-5.5 (xhigh) above Claude Code with Opus 4.8 (max), while the newly released Claude Fable 5 (max) in Claude Code debuts at the top.
DeepSWE, built by @datacurve, writes its tasks from scratch rather than adapting them from public GitHub issues or pull requests, so no model has seen the solutions during training. That matters because SWE-Bench Pro, the benchmark it replaces in our Coding Agent Index, had grown gameable, with some models recovering the fix from the repository's commit history instead of solving the task.
The swap reorders the index: Codex with GPT-5.5 (xhigh) rises from 65 to 76, overtaking Claude Code with Opus 4.8 (max) at 73. Claude Code with Fable 5 (max), which enters directly on the refreshed index, leads at 77. SWE-Bench Pro had been flattering some combinations and penalizing others. More below.

216

33,234

am.will

@LLMJunky

Jun 12

x.com/i/status/2065542865380…

Everlier

@Everlier

Jun 12

I'll just leave it here

2,251

am.will

@LLMJunky

Jun 12

I would like to apologize in advance to everyone who bought $SPCX. I just bought some. Enjoy the ride to zero.

2,606

am.will

@LLMJunky

Jun 12

Can't wait for Composer 2.7 to drop 🤪

Kimi.ai

@Kimi_Moonshot

Jun 12

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: 21.8% on Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai

830

42,018

am.will retweeted

stevibe

@stevibe

Jun 12

Would love to see if OpenAI drops a new version of gpt-oss.

257

26,722

am.will

@LLMJunky

Jun 12

As it turns out, GPT 5.5 is also pretty good at building in CAD! This is insanely impressive compared to just a few months ago. It required a handful of extra prompts, though, and roughly $45 in cost. I don't think we're that far away from models being able to build almost anything.

0:22

am.will

@LLMJunky

Jun 12

Absolutely mind blown right now. In just three prompts, I went from an empty canvas to a theoretically fully-functional nitromethane RC car. Complete with working drivetrain, suspension, and motor. All done with Claude Mythos inside of @adamdotnew's Autodesk Fusion extension. You simply could not do something like this in months past. It's not 100% perfect, but with an eye for detail, you could easily fix the issues and build a real product. Special thanks to @zachdive for letting me take a Max plan for a spin. This used roughly $35 in tokens to build. Comment below if you want me to write a blog about how I did it, and I'll do more. In fact, help me think of another challenge to build.

207

30,814