Performance @AnthropicAI

Joined March 2023
It's a crazy good model - the first one to mostly replace handwritten code for me. That's how good it is.
Replying to @claudeai
Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.
Opus 4.7 haters in shambles
Many (many) months later, Claude is finally challenging the Elite Four and will likely become a Pokemon Champion tonight 🥲 twitch.tv/claudeplayspokemon
This is the part of today’s announcement I’m most excited about 🙂
May 6
Replying to @xai
SpaceXAI and @AnthropicAI have also expressed interest in partnering to develop multiple gigawatts of orbital AI compute capacity
OK last post for the night: I tried all the fancy stuff they recommended in their GEMM doc: Z-curve, static extents, accumulation group synchronization. None of it seemed to make any performance improvement - I seem to be stuck at 40 TFLOPs in bf16 across a variety of shapes.
I got my M5 MacBook over the weekend and had some time to mess around with Metal 4 and the Neural Accelerators! Wanted to document some of my first impressions below:
I was also expecting a much more dramatic speedup from the Neural Accelerator. It seemed that with my original tile size of 32x32, I was only getting 244 GB/s of memory bandwidth. Bumping it up to 64x64 gave me 740 GB/s, dropping the time to 3.36ms!
Overall had a fun time! To close off with some criticisms:
- It took me a long time to figure out how to enable Metal 4. I wish this were better-documented.
- MPP seems a little boiler-platey. I wish there were a slightly more convenient syntax for this stuff, but not a dealbreaker.

Hope this was interesting!
Sasha Krassovsky retweeted
Something we learned through the creation of TigerBeetle: Static allocation is a forcing function for good design, good taste (and throughout your code base) as your team grows. The 2nd order value of good design, with realistic limits, trumps even the (many) performance gains.
Dynamic allocation / heap allocation is enemy number one. If your computer program is well designed, you should know how many resources it is going to take up before you run it. If you don't, then it isn't a good program. Allocate everything on the stack.
Sasha Krassovsky retweeted
Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.
Strong agree. Current data analytics systems are strongly overprovisioned on compute and waste a lot of time shuffling useless data over the network. Next generation systems will push filters down into storage, where they will be evaluated with a right-sized CPU, and the smaller filtered results will eat less network bandwidth.
if you’re a CS/EE student write your thesis on JIT compilation of eBPF for NVMe controllers there’s huge career alpha in computational storage; the standards are *just* starting to exist (TP4091)
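A rough sketch of the pushdown idea, in case it helps: the same filter runs either after every row has crossed the "network" or on the storage side before anything is shipped. Everything here is made up for illustration (`storage_rows`, the row shape, JSON as the wire format); it is not how any real computational-storage device or TP4091 works, only a way to count bytes moved in each case.

```python
import json
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def storage_rows() -> List[Row]:
    # Hypothetical data sitting on the storage node: 1000 rows, 1 in 10 is "eu".
    return [
        {"id": i, "region": "eu" if i % 10 == 0 else "us", "payload": "x" * 100}
        for i in range(1000)
    ]


def scan_without_pushdown(predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    # Every row is serialized and shipped; the filter runs on the compute side.
    shipped = [json.dumps(row) for row in storage_rows()]
    bytes_moved = sum(len(s) for s in shipped)
    rows = [json.loads(s) for s in shipped]
    return [row for row in rows if predicate(row)], bytes_moved


def scan_with_pushdown(predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    # The filter runs next to the data; only matching rows are shipped.
    shipped = [json.dumps(row) for row in storage_rows() if predicate(row)]
    bytes_moved = sum(len(s) for s in shipped)
    return [json.loads(s) for s in shipped], bytes_moved


def is_eu(row: Row) -> bool:
    return row["region"] == "eu"


if __name__ == "__main__":
    _, moved_all = scan_without_pushdown(is_eu)
    _, moved_filtered = scan_with_pushdown(is_eu)
    print(f"no pushdown: {moved_all} bytes, pushdown: {moved_filtered} bytes")
```

Both scans return the same rows; the only difference is how many bytes had to move to get them.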
Sasha Krassovsky retweeted
Simplicity over speed. Correctness over features.
Doing god's work
snowstorm hack, zerobrew is a drop-in brew replacement. borrowing principles from uv (concurrent downloads, content-addressable store), it’s ~5x faster cold and ~20x faster than homebrew. try it out! github.com/lucasgelfond/zero…
The idiomatic way of programming Rust and C does way too many heap allocations! I've had this same experience dozens of times - RAII adds tremendous overhead. This is one reason why I'm not that excited by Rust - it doesn't solve this fundamental problem.
That moment when you profile your Rust or C and literally 99% of your time is spent in d'tors and `Drop` fns, just recursively freeing stuff that you never need to free. The dark side of RAII for sure; batch/group de-allocation is so much better
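For anyone who hasn't seen batch de-allocation, here is a minimal sketch of a bump arena: one buffer reserved up front with a hard limit, allocations handed out as slices of it, and everything released with a single `reset()`. The `Arena` class and its method names are hypothetical, and Python is only used to show the shape of the idea; the interpreter still allocates small objects for each `memoryview` slice, so this is not a performance demonstration.

```python
class Arena:
    """Fixed-capacity bump allocator over one preallocated buffer (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        # Backing storage is reserved once, so the limit is known before any work starts.
        self._buf = bytearray(capacity)
        self._view = memoryview(self._buf)
        self._used = 0

    @property
    def used(self) -> int:
        return self._used

    def alloc(self, size: int) -> memoryview:
        # Bump the offset and hand back a writable slice of the shared buffer.
        if size < 0 or self._used + size > len(self._buf):
            raise MemoryError("arena capacity exceeded")
        start = self._used
        self._used += size
        return self._view[start:start + size]

    def reset(self) -> None:
        # Batch de-allocation: every allocation is released by rewinding one offset.
        # Slices handed out earlier must not be used after this point, and the
        # buffer is not zeroed.
        self._used = 0


if __name__ == "__main__":
    arena = Arena(1024)
    header = arena.alloc(16)
    body = arena.alloc(256)
    header[:5] = b"hello"
    print(arena.used)
    arena.reset()  # no per-object destructors, just one rewind
    print(arena.used)
```

The design choice worth noticing is that there is no per-allocation free at all: lifetimes are grouped (per request, per frame, per query), and the group is dropped in one step.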
Unfortunate that an article about performance has a total of 6 lines dedicated to arenas, and no examples, suggesting they don't ever actually use them. There's no discussion of designing your program in an efficient way. This is just another example of Casey's Laundry List.
19 Dec 2025
Performance Hints Over the years, my colleague Sanjay Ghemawat and I have done a fair bit of diving into performance tuning of various pieces of code. We wrote an internal Performance Hints document a couple of years ago as a way of identifying some general principles and we've recently published a version of it externally. We'd love any feedback you might have! Read the full doc at: abseil.io/fast/hints.html
Proud to announce that I am a top Dwarkesh listener