Prathmesh Pandey

Tweets

Prathmesh Pandey

@file_mutex

Jun 13

Et tu, $AMZN? Anthropic ditched Google for Amazon, just to get cheated on, lol

NIK

@ns123abc

Jun 13

🚨US government’s action to shut down Anthropic’s top AI models was actually triggered by an unnamed rival company claiming it could break Mythos’s security, not by China

Prathmesh Pandey

@file_mutex

Jun 12

My MCP is saving roughly 50% on blended token consumption in Codex and Claude. That doesn't mean Codex and Claude can't build something similar, but as server owners their philosophy will be rooted in "seeing the maximum queries coming through".

Dan Robinson

@danrobinson

Jun 11

If you’re proud of your really sophisticated skill or harness, try benchmarking it against a simple one-sentence prompt as a sanity check. Codex, Claude Code, and ChatGPT Pro are really, really good.

Prathmesh Pandey

@file_mutex

Jun 10

Fable has nothing to do with your ability to orchestrate agents.

Ed Zitron

@edzitron

Jun 9

There it is

Prathmesh Pandey

@file_mutex

Jun 4

In short, Anthropic is asking for an IOC-like distribution mechanism controlled by the US govt. What tech exists, and who is allowed to share what with whom, would essentially need to be controlled.

Andrew Curran

@AndrewCurran_

Jun 4

Anthropic says Recursive Self Improvement is approaching faster than they expected. Quoting from the blog: 'What should we do? If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe. Without a global coordination mechanism, companies and governments will have to make difficult decisions about safety while under competitive and geopolitical pressures. We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner. A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. Due to the unique characteristics of AI systems, the detectability (a lower standard than verifiability) element of this arms control problem is much more challenging than with other technologies. 
Training runs are far easier to conceal than missile silos, their inputs are general-purpose, and the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead. A credible pause also has to specify what triggers it, what lifts it, and who adjudicates. None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies (e.g., the Intermediate-Range Nuclear Forces Treaty)—but those regimes took decades to build both the infrastructure and the trust. We don’t have that long. A unilateral pause by one lab, by contrast, is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing. In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. We’ll publish what comes out of it. The window to investigate the questions together is here, and people outside AI companies should be involved in this deliberation.'

Prathmesh Pandey

@file_mutex

Jun 4

markets are sitting on multiple trillions in liquidity -- few billions are peanuts. do your dd

Chandra R. Srikanth

@chandrarsrikant

Jun 4

Came and scooped up $45 Billion, with plans for another $40 Billion. Used the window just ahead of three mega IPOs: SpaceX, Anthropic and OpenAI.

Prathmesh Pandey

@file_mutex

Jun 4

Devs burning $10K of compute on a $200/mo plan are the biggest risk to model providers. Unlike sticky chat users, devs have zero loyalty and will easily migrate the second someone else drops a better coding model.

Peter Gostev

@petergostev

Jun 3

Who is using more compute - 1b of ChatGPT users or 5m of Codex users?

Prathmesh Pandey

@file_mutex

May 25

IME, GPT-5.5 Xhigh consistently beats Claude Opus 4.7 Max and Gemini 3.1 Pro. Like 99 out of 100 times.

Jackson Atkins

@JacksonAtkinsX

May 24

My current experience with coding models.

Prathmesh Pandey retweeted

OpenAI

@OpenAI

May 20

Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.

Prathmesh Pandey

@file_mutex

May 19

Conflict of interest anybody?

Financial Times

@FT

May 19

Google DeepMind’s Demis Hassabis emerges as early Anthropic investor ft.trib.al/v5jzNKe

Prathmesh Pandey

@file_mutex

May 19

Gemini TL wants you to derive Chinchilla laws from first principles before he is gonna talk to you. Imagine the guts when Gemini is a distant third behind ChatGPT and Claude. Promo maxers lol.

Vlad Feinberg

@FeinbergVlad

May 18

How to land a job at a frontier lab vladfeinberg.com/2026/05/10/…

Prathmesh Pandey

@file_mutex

May 14

won't need a jack to change the tires lol

CCL

@CCL2K30

May 13

BYD Leopard Bao 8 - redefine off-road

Prathmesh Pandey

@file_mutex

May 4

The QoS on Gemini is so horrendously bad. You may have your entire quota available but you can't use any of it because @GeminiApp can't figure out how to serve traffic. I guess they are diverting compute to another chat app.

Prathmesh Pandey

@file_mutex

May 4

Replying to @sundarpichai

dude when will you fix the gemini throttling issues? the service uptime is abysmal.

Prathmesh Pandey

@file_mutex

Apr 30

supremely impressed and equally terrified by the multiagent package as it has landed -- will polish enough to open source it. weeks of codex, claude, and gemini grinding through requirements, translating a dozen rfcs, and then implementing them as-is. thousands of back-and-forths across all of them -- isn't that agi?

Prathmesh Pandey

@file_mutex

Apr 30

multiagent code review looks like
claude: approved 7 times, withdrawn 7 times, codified 23 reviewer rules trying to stop doing that
codex: caught everything claude missed, posted receipts each time, did not gloat
gemini: last seen at round 20. it's now round 39. presumably touching grass

Prathmesh Pandey

@file_mutex

Apr 30

Prathmesh Pandey

@file_mutex

Apr 20

lol weren't all the deepminders sh**ting on @Steve_Yegge just a few days back?

Techmeme

@Techmeme

Apr 20

Sources: Google has created strike team to improve its coding models; Sergey Brin told DeepMind staffers that they must aggressively pivot to catch up on agents (@erinkwoo / The Information) (Visit Techmeme dot com for the link and full context!)

Prathmesh Pandey

@file_mutex

Apr 14

So many deepminders shitting on @Steve_Yegge's post x.com/Steve_Yegge/status/204… Why don't you survey non-DeepMind Googlers on their CL throughput, and ask them why it is less than the 300 PRs a day by Anthropic peers?

Demis Hassabis

@demishassabis

Apr 14

Replying to @Steve_Yegge

Maybe tell your buddy to do some actual work and to stop spreading absolute nonsense. This post is completely false and just pure clickbait.

Prathmesh Pandey

@file_mutex

Apr 14

Exactly what average AI-enabled engineers are doing right now.

Jaana Dogan ヤナドガン

@rakyll

Apr 14

Everyone I work with uses @antigravity like every second of the day and relies on the numerous agentic helpers we have for every review and more. Most people evaluate other harnesses for personal projects continuously, and some are driving multi-harness orchestration.

Prathmesh Pandey

@file_mutex

Apr 11

The point is that while LLMs nowadays are very powerful, they still won't follow your style guide if it's orthogonal to the training data. So, you can code with an LLM all you want, but you must sacrifice one of three features: speed, correctness, or style. Choose your poison.

Prathmesh Pandey

@file_mutex

Apr 11

Idk who else uses Codex to write Go but Google's readability guidelines are so burnt into my memory that the moment I see a line break in the middle of a function call, I f*ing lose it. So, I have exactly that rule in AGENTS.md -- and guess who happily breaks lines anytime it wants.