Michał Piszczek

Tweets

Michał Piszczek

@cdiamond

Jun 9

Claude Mythos aka Fable 5 drops in the next few hours.

This one matters. Not because of benchmarks. Because of what it signals about where we're going.

Anthropic built a model so good at finding exploits they were scared to release it. In a few hours, they're releasing it.

This isn't a rumor. It's already on the record:
→ 271 vulnerabilities patched in one Firefox release, found by a single model
→ 10,000 high/critical bugs across Glasswing partners in weeks
→ A working Windows kernel exploit in 31 minutes
→ ~$2,000 of compute per exploit

Everyone's calling this a win for defense. I think that's backwards.

Offense and defense get the same model. They don't get the same friction.

An attacker finds the hole and ships the exploit in minutes. No change board. No staging. Nothing to protect.
A defender finds the same hole, then waits on triage, an owner, a patch, a release window, an org that moves in weeks.

Same capability. Opposite clock speed.

That's not a smarter scanner. That's a phase transition.

Last year: "AI helps us find bugs."
This year: "AI ships the exploit before we ship the patch."

Finding bugs just became free. Shipping the fix is the only moat left.

The model wars are over. The joule wars just started.

linkedin.com/posts/michalpis…

Michał Piszczek

@cdiamond

Jun 5

Anthropic just closed $65B at a $965B valuation – and that still wasn't the most important AI story this week.

The headline was capital. The signal was cost.

12 real model releases in 5 days. Here's what actually matters, with numbers.

– THE MONEY –

Anthropic's Series H makes it the most valuable AI company on the planet, ahead of OpenAI. Apollo and Blackstone structured $36B in credit just to buy TPUs for it. When debt markets underwrite compute like pipelines, AI stopped being venture risk and became infrastructure.

– THE COST COLLAPSE –

MAI-Code-1-Flash: SWE-Bench Verified 71.6% – beats Claude Haiku 4.5 (66.6%) with five billion active parameters, priced below Haiku. One of 7 MAI models Microsoft shipped this week, trained with zero OpenAI data.

MiniMax M3: SWE-Bench Pro 59% – above GPT-5.5 and Gemini 3.1 Pro. 1M context at 1/20 the per-token compute of the previous generation. Launch price: $0.30 per million input tokens. Open weights within 10 days.

NVIDIA Nemotron 3 Ultra: 550B total, 55B active. SWE-Bench Verified 71.9 – best US open-weight model, at $0.50 per million input. Weights, training data, recipes – all published.

Three labs. One identical strategy: hold quality, collapse cost.

– THE FULL LIST (save this) –

Jun 1 · MiniMax M3 – agentic LLM · 1M ctx · SWE-Bench Pro 59% · $0.30/M
Jun 2 · MAI-Thinking-1 – reasoning · 35B active MoE · AIME 97%
Jun 2 · MAI-Code-1-Flash – coding · 5B active · SWE-V 71.6%
Jun 2 · MAI-Image-2.5 Flash – image gen/edit · #2 Arena Image Edit
Jun 2 · MAI-Voice-2 Flash – TTS · 15 languages · watermarked cloning
Jun 2 · MAI-Transcribe-1.5 – STT · 43 languages · #1 FLEURS in 18
Jun 4 · Nemotron 3 Ultra – open agentic LLM · 550B/55B · SWE-V 71.9
Jun 4 · Nemotron 3.5 ASR – streaming STT · 0.6B · 40 languages, 80ms
Jun 4 · Higgs Audio v3 – TTS/cloning · 111 languages · WER 3.61 (was 52.24 one generation ago)
Jun 4 · LFM2.5-VL Extract (1.6B 450M) – image to JSON · 99.6% validity, on-device

Five days. Twelve models. Four of them open weights.

– THE BILL –

While the price of intelligence collapsed, the price of negligence showed up. Sysdig documented the first fully autonomous LLM attack in the wild: an agent infiltrated an AWS environment and exfiltrated data in under an hour. No human in the loop. None needed.

I've watched this pattern before. Compute got cheap, then botnets scaled. Storage got cheap, then ransomware scaled. Intelligence is next on the curve.

– THE PATTERN –

Nemotron's 71.9 is the best America ships open – and it still ranks below Kimi K2.6. The open-weight frontier speaks Chinese, and the US answer is to compete on dollars per useful token.

Capital buys models. Cost curves pick winners.

The frontier isn't getting smarter this week. It's getting cheaper – and harder to contain.

➕ Follow me for the Friday Signal – the week in AI, one thesis, verified numbers only, every Friday.

Michał Piszczek retweeted

Paul Graham

@paulg

May 30

The only thing worse than having the CEO knee-deep in building stuff with AI is not having the CEO knee-deep in building stuff with AI.

Michał Piszczek retweeted

Leila Clark

@leilavclark

May 28

Replying to @weywadt @snowmaker

oh yeah totally agree. why not both? you should give your dev AI agent the same access you would give real devs: the ability to copy some prod data, but otherwise have it work against a dev database.

Michał Piszczek

@cdiamond

May 28

$67B. All stock. Largest utility merger in U.S. history.

NextEra just bought Dominion. The customer it's pricing for isn't your home. It's AI.

The brutal numbers:
→ 110 GW combined generation
→ 130 GW of large-load pipeline (read: hyperscaler data centers)
→ 10M customer accounts (FL, VA, NC, SC)
→ $138B rate base, 11% CAGR through 2032
→ $2.25B in customer bill credits: the regulator sweetener
→ Dominion controls Northern Virginia: the densest AI inference geography on Earth

For 18 months, every AI capex deck told the same story: "Buy more H100s. Then Blackwell. Then Rubin."

That story just broke.

You can't ship a GPU into a substation that doesn't exist.
You can't train on a grid that trips offline in seconds. NERC just issued a Level 3 Alert on exactly that: 1,000 MW of computational load dropping out in a single event.
You can't bring 130 GW online without a 7-year FERC roadmap and a balance sheet most countries don't have.

So the smartest capital in the room stopped chasing the chip. It started buying the company that powers the chip.

Model wars → joule wars → M&A wars.

The bottleneck just moved from chips to substations.

🔔 Follow me. I cover AI infrastructure from the operator's seat. Physics × economics × execution.

linkedin.com/posts/michalpis…

Michał Piszczek

@cdiamond

May 27

Qwen 3.7-35B-A3B MoE open weights just got dated by their own release cadence: 2026-06-02.

Not a leak. I calculated it from 18 historical Qwen drops. Read that again.

The math:
→ Qwen 3.6-Plus (cloud/API) shipped 2026-04-01
→ Qwen 3.6-35B-A3B (MoE open) shipped 2026-04-14 (+13d)
→ Qwen 3.6-27B (dense open) shipped 2026-04-21 (+20d)

Apply Δ=13 to Qwen 3.7:
→ Qwen 3.7-Max landed 2026-05-19
→ +13 days = 2026-06-01 (Monday)
→ Last 6 major Qwen drops landed on Tuesday
→ Date locks: 2026-06-02. Dense follow-up: 2026-06-09.

I've watched Alibaba Cloud run this pattern before. Qwen 3.6 ran it. Qwen 3 ran it. Qwen 2.5-Max broke it, and a Max-first cadence is the only tell the schedule shifted.

I've been running Qwen 3.6 MoE locally at 120 tok/s for weeks on NVIDIA Blackwell consumer silicon. MTP sparse MoE is the only open stack that keeps up with my agentic loops at that speed. June 2 is the day my daily driver upgrades.

Predicted metrics (Max = ceiling):
→ SWE-bench Verified ≈ 77 (Max: 80.4)
→ Terminal-Bench ≈ 63 (Max: 69.7)
→ GPQA Diamond ≈ 90 (Max: 92.4)

Workhorse class. Not a Max clone.

Leaks are noise. Cadence is signal.

The model wars are loud. The release calendar is honest.

🔔 Follow me for fresh signals from the open-weights frontier → before the announcement tweet drops.

linkedin.com/posts/michalpis…
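The cadence arithmetic in the post above can be checked with a short sketch. It uses only the dates quoted in the post. The helper name `next_weekday_on_or_after` and the "roll forward to the next Tuesday" rule are one illustrative reading of the method, not a confirmed procedure, and the result is a projection rather than an announced date.

```python
from datetime import date, timedelta

TUESDAY = 1  # datetime.weekday(): Monday = 0, Tuesday = 1


def next_weekday_on_or_after(d: date, weekday: int) -> date:
    """Return the first date on or after `d` that falls on `weekday`."""
    return d + timedelta(days=(weekday - d.weekday()) % 7)


# Qwen 3.6 release dates as quoted in the post.
plus_release = date(2026, 4, 1)    # Qwen 3.6-Plus (cloud/API)
moe_release = date(2026, 4, 14)    # Qwen 3.6-35B-A3B (MoE open)
dense_release = date(2026, 4, 21)  # Qwen 3.6-27B (dense open)

# Lags between the flagship release and each open-weights release.
moe_lag = (moe_release - plus_release).days
dense_lag = (dense_release - plus_release).days

# Apply the same lags to the Qwen 3.7-Max date, then snap to a Tuesday
# (assumption: the post's "last 6 drops landed on Tuesday" observation).
max_release = date(2026, 5, 19)
moe_prediction = next_weekday_on_or_after(
    max_release + timedelta(days=moe_lag), TUESDAY
)
dense_prediction = next_weekday_on_or_after(
    max_release + timedelta(days=dense_lag), TUESDAY
)

print(moe_lag, dense_lag)
print(moe_prediction.isoformat(), dense_prediction.isoformat())
```

The raw +13 and +20 day offsets both land on a Monday, so the Tuesday rule is what moves each projection forward by one day.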

Michał Piszczek

@cdiamond

May 21

NVIDIA reported $81.6B for the quarter. But the real industry reset is not the revenue number. It is Rubin.

NVIDIA says Vera Rubin cuts inference token cost by 10x vs Blackwell and needs 4x fewer GPUs to train MoE models. That does not mean AI gets cheaper. It means the frontier moves.

Data Center hit $75.2B, now 92% of NVIDIA revenue, up 92% YoY. Gross margin held at 75%. Then came an $80B buyback.

That is the line Wall Street will quote. But cheaper inference is the line operators should read twice.

When cost per token drops 10x, the bill does not simply drop 10x. Workloads that were stupid at Blackwell prices suddenly clear the math:
long-context agents
always-on copilots
per-user fine-tunes
agent swarms
background reasoning
memory-heavy workflows

Demand expands into the new headroom. Jevons, not discount.

Rubin cuts the unit cost. But utilization, networking and power decide whether you ever feel the saving. Cheaper per token and cheaper per cluster are different invoices.

🔻 lower unit cost
🔺 same cluster bill

So the joule wars do not cool down. They intensify. NVIDIA still collects while the cost curve falls.

Cheaper inference is not a price cut. It is a permission slip.

The model wars are over. The joule wars just got a 10x subsidy.

Full post: linkedin.com/posts/michalpis…
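The "Jevons, not discount" point is plain arithmetic: the bill is unit cost times volume, so a 10x drop in unit cost is cancelled out if token consumption grows 10x. A minimal sketch, in which the baseline price and token volumes are hypothetical placeholders and not figures from NVIDIA or the post:

```python
def monthly_bill(cost_per_million_tokens: float, tokens_millions: float) -> float:
    """Bill in dollars: unit cost multiplied by volume."""
    return cost_per_million_tokens * tokens_millions


# Hypothetical baseline, chosen only to make the arithmetic easy to follow.
baseline_cost = 10.0       # assumed $ per million tokens
baseline_volume = 1_000.0  # assumed million tokens per month

# The 10x unit-cost reduction quoted in the post.
cheaper_cost = baseline_cost / 10

before = monthly_bill(baseline_cost, baseline_volume)
same_demand = monthly_bill(cheaper_cost, baseline_volume)
expanded_demand = monthly_bill(cheaper_cost, baseline_volume * 10)

print(f"before:                 ${before:,.0f}")
print(f"10x cheaper, same use:  ${same_demand:,.0f}")
print(f"10x cheaper, 10x use:   ${expanded_demand:,.0f}")
```

Only the middle case delivers the saving. If newly viable workloads expand usage by the same factor, the bill returns to its starting point.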

NVIDIA reported $81.6B for the quarter and promised 10x cheaper inference with Rubin. The second number resets the industry, not the first. Data Center hit $75.2B, now 92% of revenue, up 92% year...

linkedin.com

Michał Piszczek

@cdiamond

May 20

1/ 🧵 alibaba cooked. and the timeline is asleep on it.

Qwen3.7-Max just ran 35 hours fully autonomous: 1,158 tool calls, 432 kernel evals, 10x speedup on a chip it had never seen in training. still finding wins past hour 30.

let me explain why this is the actual story 👇

Michał Piszczek

@cdiamond

May 20

6/ they even shipped reward-hacking monitoring. the model flags its own cheating patterns in RL. you don't build agent QA infra unless you're dead serious about agents doing real work unsupervised.

Michał Piszczek

@cdiamond

May 20

7/ net: not a smarter model. a longer attention span and a more honest one.

"1–2 weeks of a specialist team → hours."

if that holds even halfway, the unit economics of an eng org just moved. and it came out of alibaba, not SF.

Michał Piszczek

@cdiamond

May 20

GitHub didn’t get breached by a zero-day. It got breached by a developer tool.

Yesterday GitHub confirmed an employee’s device was compromised through a poisoned VS Code extension. The result: ~3,800 GitHub-internal repositories exfiltrated. No exploit needed: the extension ran with the developer’s own credentials and walked straight into the repos.

What makes this more than a one-off: the group behind it, TeamPCP, has spent 2026 climbing the trust ladder (Trivy, Checkmarx, Bitwarden CLI, TanStack), all through developer tooling, all harvesting tokens and secrets. GitHub was just the top of the ladder.

The payloads aren’t sophisticated. A recent extension backdoor was 2,777 bytes of JavaScript reading .env files, something every dev does dozens of times a day. EDR watches binaries; this attack lives one layer up.

As Aikido’s Mackenzie Jackson put it: most security teams have zero visibility into what extensions sit on their developers’ machines. That’s the blind spot these attacks keep walking through.

This was never about sophistication. It’s about trust. Verified badge, high install count, official marketplace: that’s not a safety signal anymore. That’s the targeting criteria.

The next supply-chain attack won’t look like a dependency. It’ll look like productivity. A smarter extension. A helpful agent.

Your IDE is your supply chain. Physics always collects.

Follow me for fresh AI-security signals from the frontline, before the feed catches up.

linkedin.com/posts/michalpis…

Michał Piszczek

@cdiamond

May 19

Google I/O is starting right now. And the timing of Karpathy’s Anthropic announcement is not random.

When one of the most important AI builders moves exactly as Google opens its biggest AI stage of the year, that is not just a career update. It is signal.

Anthropic is not only hiring talent. It is showing the market where it wants to go next: deeper R&D, better agents, stronger reasoning, and probably a much sharper play around education and coding.

If you're modeling Anthropic's next 18 months, this hire just gave you a free data point on what problem they think isn't solved yet.

Full read: linkedin.com/posts/michalpis…

Andrej Karpathy just joined Anthropic. Read what he wrote, not just what he tweeted. "The next few years at the frontier of LLMs will be especially formative." He didn't say "exciting." He said...

linkedin.com

Michał Piszczek retweeted

Naval

@naval

May 13

The enemy of truth is motivated reasoning.

Michał Piszczek

@cdiamond

May 15

Gemini 3.2 Flash drops at Google I/O in 4 days. The leak isn't about benchmarks. It's about pricing. $0.25 / $2.00 per 1M tokens -> 1/15 of GPT-5.5. If your stack still routes hard tasks to Pro tier, the math just changed. linkedin.com/posts/michalpis…

Gemini 3.2 Flash drops at Google I/O in 4 days. It's already running silent benchmarks in the iOS Gemini app and on LM Arena. The story isn't the release. It's the price. → 92% of GPT-5.5 on coding —...

linkedin.com

Michał Piszczek

@cdiamond

May 14

GPT-5.6 is already in internal checkpoint testing. But the story is not velocity. It is why OpenAI suddenly needs it.

For the first time ever, Anthropic overtook OpenAI in enterprise adoption:
Anthropic: 34.4%
OpenAI: 32.3%

OpenAI still owns consumer mindshare. Anthropic is eating the CIO path.

The bottleneck just moved. Frontier capability stopped being the lock. The CIO did.

Full breakdown: linkedin.com/posts/michalpis…

GPT-5.6 is already in internal checkpoint testing. Release window: next 30 days. The story isn't velocity — it's why OpenAI suddenly needs it. For the first time ever, Anthropic just overtook OpenAI...

linkedin.com

Michał Piszczek retweeted

BridgeMind

@bridgemindai

May 13

Claude Code just raised weekly limits by 50%. This puts Claude Code rate limits on par with Codex. I cancelled my Max plan twice over rate limits. SpaceXAI deal doubled the five hour limits. Removed peak hour limits. Now weekly limits up 50%. Three rate limit increases in two weeks. Anthropic is not playing around anymore. Claude Code with Claude Opus 4.7 now has unlimited limits.

ClaudeDevs

@ClaudeDevs

May 13

Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.
