Joined March 2021
Pinned Tweet
7 Oct 2022
Everybody and their mum is building the infra. So who's building the damn apps?
Incredibly weird take. Just customise your harness to fit the model as much as possible?
Jun 13
life lesson: never bet on a custom harness like Pi
been loving my custom Pi setup for the last few weeks, the fact that I can build any extension, use any model
but things are moving too fast today
huge teams behind Claude / Codex change the way we develop almost every month, so by building and maintaining a custom agent you're more likely to get left behind
most models perform better in their native harnesses anyway, and using external ones is likely to get you banned
so I recommend betting your workflow on portable primitives, like prompts, skills or scripts, instead of custom agents
Hashline is almost never the cheapest edit tool to use. I benchmarked 3 different edit tools across 5 different models to find out.
1) Replace: plain old string replacement
2) Patch: OpenAI's V4A patch format
3) Hashline: references lines via content hash anchors
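For anyone unfamiliar with the difference, here is a minimal Python sketch of the replace and hashline ideas. It is an illustration only: the `lineno:hash|content` read format, the 4-character SHA-1 prefix, and all function names are assumptions, not the actual Hashline implementation or the tools that were benchmarked. The V4A patch format is left out.

```python
import hashlib


def replace_edit(text: str, old: str, new: str) -> str:
    """Plain string replacement: `old` must occur exactly once."""
    if text.count(old) != 1:
        raise ValueError("old string must occur exactly once")
    return text.replace(old, new)


def line_hash(line: str) -> str:
    """Short content hash used as an anchor (4 hex chars is an assumption)."""
    return hashlib.sha1(line.encode("utf-8")).hexdigest()[:4]


def read_with_anchors(text: str) -> str:
    """Render a file the way the model would read it: lineno:hash|content."""
    return "\n".join(
        f"{i}:{line_hash(line)}|{line}"
        for i, line in enumerate(text.split("\n"), start=1)
    )


def hashline_edit(text: str, lineno: int, anchor: str, new_line: str) -> str:
    """Replace one line, but only if its hash still matches the anchor."""
    lines = text.split("\n")
    if not 1 <= lineno <= len(lines):
        raise ValueError("line number out of range")
    if line_hash(lines[lineno - 1]) != anchor:
        raise ValueError("stale anchor: line changed since it was read")
    lines[lineno - 1] = new_line
    return "\n".join(lines)


source = "def add(a, b):\n    return a - b\n"

# Replace: the model has to repeat the old text in its output.
fixed_by_replace = replace_edit(source, "a - b", "a + b")

# Hashline: the model only emits a line number and a short anchor,
# but every read carries the extra `lineno:hash|` prefix per line.
anchor = line_hash("    return a - b")
fixed_by_hashline = hashline_edit(source, 2, anchor, "    return a + b")
```

The trade-off is visible even here: the hashline edit call is shorter, while `read_with_anchors` makes every line of every read longer.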
Hashline was the one I most wanted to like given the hype. But across the full benchmark, it failed to beat replace on dollar cost. The hash anchor references do reduce some output tokens, but they add too much input overhead during reads to be worth it.
The best edit tool depends on your model and what you're optimising for:
- Replace: the sensible default
- Patch: if the model was trained on it
- Hashline: when edit density is high enough to amortise the anchor tax
Full writeup and charts: aaroncql.com/writings/harnes…
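A rough way to think about "amortising the anchor tax", as a Python sketch. Every number below is hypothetical and chosen only for illustration (none of it comes from the benchmark), and the model ignores prompt caching and repeated reads.

```python
def break_even_edits(anchor_overhead_tokens, output_saved_per_edit,
                     in_price_per_mtok, out_price_per_mtok):
    """Smallest number of edits whose output savings cover the anchor overhead.

    Prices are in the same currency unit per million tokens, so the
    million cancels out and integer arithmetic is enough.
    Returns None if an edit saves nothing.
    """
    overhead_cost = anchor_overhead_tokens * in_price_per_mtok
    saving_per_edit = output_saved_per_edit * out_price_per_mtok
    if saving_per_edit <= 0:
        return None
    # Ceiling division without floats.
    return -(-overhead_cost // saving_per_edit)


# Hypothetical scenario: a 500-line read with 3 extra anchor tokens per
# line (1500 input tokens), each edit saving 40 output tokens, at 3 per
# million input tokens and 15 per million output tokens.
edits_needed = break_even_edits(1500, 40, 3, 15)
```

Under these made-up inputs, the anchors only pay for themselves once a single read is followed by at least `edits_needed` edits; with sparse edits, replace stays cheaper.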
Interesting, starting from 4.7, Opus seems to have been trained on the apply_patch tool call that GPT uses.
Can it run doom?
Introducing Monako Glass 👓 The world's first wearable Linux computer in glasses form. Run Claude Code, Codex, and any coding agent — anywhere.
An open source tool getting acquired by a larger corporation usually doesn't bode well. But Cloudflare has a good reputation amongst open source projects, so I'm cautiously optimistic!
VoidZero is joining Cloudflare. Our mission stays the same: to make JavaScript developers more productive than ever before. Vite, Vitest, Rolldown, Oxc, and Vite remain MIT-licensed. Evan and the VoidZero team will continue leading them. Cloudflare shares our commitment to open source. Together, we can keep investing in the tooling developers rely on every day, while bringing the Vite ecosystem and Cloudflare’s platform even closer together.
Any terminal benchmark scores out yet for Gemma 4 12B?
Pretty pretty please open source 35B and 27B next!!! 🥹
👏👏 Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation. ✅ Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks ✅ Versatile coding agent & productivity assistant with full-modality input ✅ Visual Agent: perception, reasoning, grounding, and search-augmented QA ✅ Cross-harness generalization across diverse agent frameworks One model. Sees, thinks, codes, acts.🙌🙌 Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.😎 🔗🔗⬇️⬇️ Blog:qwen.ai/blog?id=qwen3.7-plus Qwen Studio:chat.qwen.ai/?models=qwen3.7… API:modelstudio.console.alibabac…
If you work with multiple different models, you can almost certainly map them to people you've known.
GPT: principled, meticulous, never wants to be wrong, but replies are slightly autistic.
Opus: gets your vague brief immediately, but occasionally goes rogue and destroys half the work.
Gemini: the creative one who just wants to rest and vest.
Qwen/Deepseek: the intern grinding 80hr weeks who never quite hits the mark.
People of Pi! Made a Bun-native extension pack, with web access, subagents, revamped core tools, ANSI-compatible themes, fzf-style completions, Telegram mode, and more. Called it Pim (Pi IMproved) 😏.
Its goal is to improve the out-of-the-box experience for both users and agents, without sacrificing composability with other Pi extensions.
Pim with Qwen3.6-35B averaged 37.8% (peak of 41.6%!) on Terminal-Bench 2.0 over 3 full runs. Pretty damn amazing that a local model running on my MacBook can rival the performance of Claude Code Sonnet 4.5 (40.1%) and outperform Codex GPT-5-Mini (31.9%)...
Open source and MIT. Try it out and tell me what you think: github.com/AaronCQL/pim-agen…!
PS: huge thanks to @badlogicgames @mitsuhiko for creating Pi and making it so easy to customise, it was an absolute joy to work with!
TIL Jira MCP costs 12K tokens on startup, even when you don't use it... Audit your installed MCPs, folks.
We finally solved a pet peeve of ours: simulated balance changes on Jupiter Wallet for Jito bundles show accurately across all txs now! Thanks @PierreArowana for the PR at github.com/jito-foundation/j…, and @0xTsathir for helping out.
Who approved this epileptic animation?
Ok, Claude legit feels kinda lobotomised now
One of the few things I'm extremely proud of in Jupiter is RTSE. Working with the team day in, day out to fine-tune how we best estimate expected output and slippage, scouring tons of data to find the optimal calculation: the magical point that maximises output amount while minimising swap failures.
This article is a great read and sums up quite succinctly the battles we've had to fight over the past few years.
Special shoutout to @melvinzzy, @gn_dnomsed (and many more behind the scenes), the brains behind it all. And special shoutout to all of our users, for always giving us the feedback we need to improve the system. Even now, we're still iterating and making it better.
If figuring out slippage is still a chore to you, come speak to us!
Route A quotes 100 tokens, executes 95
Route B quotes 150, executes 50
Metis v8 picks Route A.
It routes on what each path will actually execute at, not what it quotes.
The best quote isn't the best swap.
Read the full article by @melvinzzy here: developers.jup.ag/blog/why-y…
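The example above in a few lines of Python. This only shows the selection rule with the two routes from the post; the `Route` type and function names are made up for illustration, and how Metis v8 actually estimates the executed amount is not shown here (it is simply taken as given).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    quoted_out: int             # tokens the route quotes
    expected_executed_out: int  # tokens it is expected to actually deliver


def pick_by_quote(routes):
    """Naive choice: take the highest quote."""
    return max(routes, key=lambda r: r.quoted_out)


def pick_by_execution(routes):
    """Choice described in the post: rank by expected executed output."""
    return max(routes, key=lambda r: r.expected_executed_out)


routes = [Route("A", 100, 95), Route("B", 150, 50)]

best_quote = pick_by_quote(routes)      # Route B (quotes 150)
best_swap = pick_by_execution(routes)   # Route A (executes 95)
```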