Alejandro AO 🤗

Alejandro AO 🤗

144 Photos and videos

Tweets

Pinned Tweet

Alejandro AO 🤗

@_alejandroao

Jun 4

I just recorded a deep dive into Pi's architecture. It's a minimalist AI coding agent with a beautiful design. Here's how it actually works 👇

33:00

667

78,236

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Reminder that you can serve Kimi K2.7, Minimax M3, DeepSeek V4 on Inference Providers 🤗 👉 Single API endpoint 👉 Use the same credits for all models 👉 No markup price (we don't make money from this) 👉 OpenAI API compatible 👉 Largest selection of open models

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

15m

learn to improve your skills without breaking them (testing for skills) 🤓

Noé Flandre @NoeFlandre

16h

Finally catching up on some blog posts I saved for later. This one is VERY interesting. I never really dedicated that much of time to skills, and that was a mistake. Easy to read, well thought out and directly actionable in your everyday tasks! huggingface.co/blog/upskill

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

18h

I just built a quick harness that evaluates different combinations of Kimi K2.7, GPT 5.5 and Opus 4.8 Each one as architect or implementer. 👉 Ran each combination on a recent GH issue 👉 ⁠⁠measured cost and quality on each of the runs. 🧵 Takeaways and cost breakdown below

1,311

more replies

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

18h

Key takeaways: 👉 The top combinations mostly use Kimi as coder 👉 The best quality comes from using GPT-5.5 or Opus-4.6 as planner and Kimi as the implementer 👉 Combining the absolute SOTA models with open models shrinks your bill without sacrificing quality! 🤗

231

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

18h

Full write up: alejandro-ao.com/llm-planner…

DuoBench: Best LLM pairs for coding?

I used DuoBench to compare Kimi K2.7, Kimi K2.6, GPT-5.5, and Claude Opus 4.8 on a recent CPython bug fix by quality per dollar.

alejandro-ao.com

141

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

18h

Using multiple models in the same workflow can reduce your costs 7x without losing quality. My favorite team so far: 👉 GPT-5.5 planner Kimi K2.7 Coder Here's how you can measure the best planner-coder pairs for your workspace 👇

17:45

2,882

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

20h

x.com/i/article/206651604494…

2,694

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

20h

follow @ben_burtenshaw for the best agent material 🤯

Ben Burtenshaw

@ben_burtenshaw

21h

we need to get real and move fast about training our own agents. orgs, teams, and individuals all need to be improving their agent. not waiting for the next API to be released (or not). this is a practical hands-on session where I'll show you how to train (and improve) your own agent based on traces. it's supported by tutorials and a repo, starts ofs simple, and works today. over the coming weeks, I'll add more advanced classes working up to RL environments and harnesses. If you're building around training agents, jump on stream, and demo your work.

269

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 14

guys we need to use telegram more to talk to agents

Pavel Durov

@durov

Jun 13

We now support rich formatting for all chatbots. Tables, nested lists, inline media, formulas, headers and more — right in Telegram messages. 🔨 Start building! Docs: core.telegram.org/bots/api#r…

0:15

186

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 10

fable 5 feels about as good to me as gpt 5.5... which is, like, pretty damn good 👍

151

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 10

While everyone is busy with Fable 5, Cohere just dropped a local coding model that can run on your laptop. Enters North Mini Code — an open-source coding model that runs on a single H100. 🔹 30B params, only 3B active (MoE) 🔹 Apache 2.0 licensed 🔹 256K context window 🔹 On-prem deployment ready

1,672

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 10

Some highlights: 🤖 33.4 on the Artificial Analysis Coding Index, outperforming Qwen3.5 (35B), Gemma 4 (26B), and Devstral 🚀 2.8× faster than Devstral Small 2 🎖️ Ranked #8 out of 127 open-weight models for output speed 👾 Designed for agentic workflows: sub-agents, code reviews, and architecture mapping 👨‍💻 MoE architecture: 128 experts with just 8 activated per token. Try it on your favorite coding agent: huggingface.co/CohereLabs/No…

CohereLabs/North-Mini-Code-1.0 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

184

Alejandro AO 🤗

Alejandro AO 🤗 retweeted

Alejandro AO 🤗

@_alejandroao

Jun 4

I just recorded a deep dive into Pi's architecture. It's a minimalist AI coding agent with a beautiful design. Here's how it actually works 👇

33:00

667

78,236

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 8

opened claude code for the first time in a while today. is this normal?? i didn't even resize the terminal window and it keeps getting all messed up. is this a common issue?

426

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 8

i got many new followers recently. hey there! 🤗 how can i help you build better AI apps? what tutorials/courses would you like to see?

111

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 8

what's your main LLM for coding?

40% gpt-5.5

40% claude (opus,sonnet,etc)

20% open source (kimi,m3,etc)

0% other

20 votes • Final results

305

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 7

great question! in Pi, sessions are stored JSONL files that form trees. each line is a new message. each message is a node. when you do `/tree` and fork the conversation. you append a new message to the JSONL with a `parent` property that points to a previous message. this lets you fork a conversation and work across different branches. and all is stored in the same session JSONL file. great design.

Thiezn

@thiezn_

Jun 5

Replying to @_alejandroao

@grok help me better understand the pi concept of trees in its session management. What happens when you fork a new tree branch from an earlier message? Will that basically create a new session, with all the messages so far in its context? Can you then work of both tree branches independently? Or is it more like a rollback thing where it will undo any tool calls that might have happened before? Also, could you explain what the effect on kv cache would be when creating a new tree branch? Would the model need to keep two separate caches or can the first common part of the cache be reused (is that perhaps prefill cache purpose?) note, im an amateur, be gentle with me

2,015

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 5

Just posted this video on YouTube with chapters, and an extra section on skills parsing at the end. Link here 👇

Alejandro AO 🤗

@_alejandroao

Jun 4

I just recorded a deep dive into Pi's architecture. It's a minimalist AI coding agent with a beautiful design. Here's how it actually works 👇

33:00

1,431

Alejandro AO 🤗

Alejandro AO 🤗

@_alejandroao

Jun 5

Please subscribe 🤓 youtu.be/gTeujlv8qK0

PI Architecture EXPLAINED | Agent Loop, Tools, TUI and More

How Pi Works: Agent Architecture, Tools, TUI, and SkillsIn this v...

youtube.com

407