Argentinian | AI Engineer | Full Stack Developer | Building products @ Persiscal.com | Side projects @ module0.xyz | e/acc

Joined January 2022
Evals are part of the moat.
Loop engineering is REAL. It’s not about picking the best model. It’s the loop you build on top that compounds. Private Evals are about to be HUGE. Score models on your outcomes and your traces, not a public leaderboard. Your EVALS are your moat.
WHOA
Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
Matias Lapolla retweeted
Never. Give. Up.

Mastra 🚀📈
mastra has hit 1 million weekly downloads 📈📈📈
I bundled this into a set of CLIs and skills for spinning up projects incredibly fast. Next on the list are the GTM, marketing, outreach, and user-facing agents (will do @mastra). If you are not building your production OS, why?
We are in a new world...
Matias Lapolla retweeted
COME AND TAKE IT @realDonaldTrump
Very interesting take on agent harnesses by Sam. Lots of key points to learn from.
Everything we've learned in 2026 over thousands of hours building our agent harness:
This is awesome for signal-based triggers, awesome
A 100% must-read for diving into building AI agents
IT'S COMING
As AI agents begin to act, payments move into the background — at machine speed and massive scale. Today we’re introducing Mastercard Agent Pay for Machines — bringing structure, governance, and trust to this new class of payments. Launching with 30 partners to bring this to life from day one. This isn’t just more payments. It’s a new operating model for commerce. 👉 Learn more: mastercard.com/us/en/news-an…
Matias Lapolla retweeted
Go #MessiMode Upload a photo of yourself and try this prompt: “Make my hair the colors of my country flag but keep it natural-looking. If no country or image is provided, ask.”
Matias Lapolla retweeted
the new world order
I built the following commands for my workflow in Claude Code: /standup and /brain-log. Combined with a CLI, it makes a complete second-brain setup. It turned out really well.
Damn
Jun 8
It's finally out!!! @METR_Evals found that more than half of SWEBench results are unmergeable slop. FrontierCode represents over 1000 hours of maintainer-validated software engineering work that most frontier models cannot yet solve, much less solve with high quality.
Cog had IOI Gold medalists and top code maintainers Look At The Data: FrontierCode includes 3000 rubrics covering code quality and anti-cheat against the reward hacking plaguing other benchmarks. FC Diamond is so hard that Opus 4.8 scores 13.8%.
Three eras of AI coding : Three eras of benchmarks
2021 • Autocomplete: HumanEval
2023 • Passing Tests: SWEBench, TerminalBench
2026 • Maintainable Code: FrontierCode
To me the most beautiful chart: when I requested a special historical run across all extant old models, the data showed that the easiest third of FC tasks (in FC Extended) were rapidly and suddenly solved over late 2025. Opus almost doubled from a 41% pass rate to 74% in 4 months.
This describes the "WTF happened in Dec 2025" vibe shift that a lot of folks from @dhh to @karpathy have called out: it is the difference between getting 95% success in 2 rerolls vs 6, making it finally feasible to go up the next layer of abstraction in agentic coding, e.g. @GeoffreyHuntley's ralph loops or @bcherny's /goals or @steipete's "loops that prompt your agents", without fearing too much that things go off the rails.
My guess: as AI accelerates from here, each FrontierCode tier will saturate in sequence, hopefully ~annually. I've already asked the team to prepare FrontierCode 2027....
The old mountains will be destroyed. Their rubble becomes regolith. And from that regolith, the next model forest grows. Circle of life.
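A rough sketch of the reroll arithmetic behind the "95% success in 2 rerolls vs 6" remark. It assumes each attempt succeeds independently with the quoted pass rate (41% or 74%), which is a simplification and not something the post states; the function names are hypothetical. Under that assumption, the chance of at least one success in k attempts is 1 - (1 - p)^k.

```python
import math


def success_within(pass_rate: float, attempts: int) -> float:
    """Probability of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - pass_rate) ** attempts


def rerolls_needed(pass_rate: float, target: float = 0.95) -> int:
    """Smallest number of independent tries whose combined success reaches `target`."""
    if not 0.0 < pass_rate < 1.0:
        raise ValueError("pass_rate must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^k >= target for k, then round up to a whole attempt.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - pass_rate))


if __name__ == "__main__":
    for p in (0.41, 0.74):
        k = rerolls_needed(p)
        print(f"pass rate {p:.0%}: {k} attempts for 95%, "
              f"2 attempts give {success_within(p, 2):.1%}")
```

With these assumptions, a 41% pass rate needs 6 attempts to clear 95%, while a 74% pass rate reaches roughly 93% in 2 attempts and clears 95% on the third, so the post's "2 vs 6" reads as an approximation.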
What a beast Elon and his teams are, more than deserved
🚨 SpaceX is worth more than the entire Western aerospace industry combined. One company against twelve giants with decades of history. 🔴 GE Aerospace, RTX, Boeing, Airbus, Safran, Honeywell, Rolls-Royce, Lockheed Martin, BAE, Northrop and company: 1.74 trillion dollars
Imagine building customer-facing agents, what a lovely mess
I must say the stochastic nature of working with agents is by far the worst. Yesterday, I could describe my problem and it would implement it with near-minimal changes. Today I describe my problem and it rampages through my whole codebase...
Matias Lapolla retweeted
It's not FAANG anymore. It's MANGO.