Joined April 2014
88 Photos and videos
tony retweeted
Our open source observability platform Arize Phoenix just crossed 10,000 stars on @github. ✨ That number belongs to the people who tested it, broke it, filed issues, opened PRs, asked better questions, and helped turn AI observability into an engineering workflow.
1
2
13
214
tony retweeted
the four horsemen of the apocalypse
307
1,302
21,445
2,957,069
tony retweeted
TanStack Virtual now has first-class chat support: end anchoring, append-follow, stable prepends, and streaming messages that stay pinned when they should. The modern web is now a lot of streaming UI on top of lists, so this needed to feel boring 😉 tanstack.com/blog/tanstack-v…
24
55
1,245
259,223
tony retweeted
I was talking to @ritakozlov about this a few weeks ago, and she stated my thoughts perfectly: the kids are all right. I think we (seniors) have been mapping our old ways of doing things onto AI too much. Juniors don’t have this baggage.
Tactical vs Strategic Programming, and why I'm nervous for juniors:
Good programming involves a mix of tactical and strategic decision-making:
- Tactical: on the ground, short-term. The soldier doing the fighting.
- Strategic: high-view, long-term. The general planning the war.
You need to be a tactician to write good code. To choose the right syntax. To figure out the file structure. To figure out how best to test your changes.
But you need to be a strategist to build code that lasts. To design the architecture. To automate away problems. To think beyond today.
Agents have eaten the tactical part of programming. When you can pay below minimum wage for code, there's no point going into the trenches yourself.
But AI cannot code strategically. Agents need someone at the top of the pyramid to tell them what to do. They need oversight.
So, a developer's day-to-day job has become 100% strategy. Long-term thinking, all the time. (maybe this is why I'm so tired all the time now)
If you identify as a tactical programmer - a code monkey - then you are out of luck. The job has changed.
Personally, I like it. I always preferred thinking strategically about code. If you asked me what my job was about, I'd say 'building apps', not 'writing code'.
But what makes me nervous is that we've pulled down the only bridge that brought juniors into the industry. We used to train juniors like this:
1. Give them only tactical tasks
2. Let them build up their strategic experience slowly
Eventually, they are a good enough strategist that they are no longer a junior.
But what happens when all tactical code is written by AI? What is the point of a junior?
We obviously need juniors. We need new lifeblood coming into the industry. We need to leave paths open for extraordinary hires to enrich our companies.
But how do we train them? How do you train strategic thinking?
These are the questions I'm thinking about. I'd love to know your thoughts.
1
2
7
2,491
tony retweeted
I can’t take your opinion on taste in software seriously if you don’t watch movies, listen to music, read books, get brunch with friends, enjoy baths and long drives, kiss someone under the moonlight, eat a flaky croissant with a bitter coffee… this is not a joke post.
62
86
938
45,424
high performance web computing needs to be the standard; this is the quality bar
diffshub[dot]com Take any public diff from GitHub and virtualize it nearly instantly, no matter how large, with DiffsHub. Built to show off our brand new CodeView component. To try it out, replace `github` with `diffshub` in your address bar.
1
53
I am often asked "what's the best local model right now for my project idea" and I never know what to say because there are so many options and so many levers to pull. Going to just start sending this article as my response in the future
1
3
254
tony retweeted
A comprehensive 2-hour evaluations workshop, for free! At AI Engineer: Europe, head of DevRel Laurie Voss gave this workshop that covers:
- What is an eval?
- Why are they important?
- How and why to manually examine the data
- Using built-in Phoenix evals
- Writing custom evals
youtube.com/watch?v=Xfl50508…
1
2
10
1,259
tony retweeted
opening my laptop with a gun in one hand ready to end its life if it attempts to either install from or publish to npm today #securitytips #techtuesday
11
18
294
10,095
More OTel 🙌
Your AI agents are running blind in prod? One line of otelMiddleware() and every chat, iteration, and tool call lands in your OTel backend with full GenAI semconv attributes. Vendor-neutral. Optional peer dep. Already shipped on @tan_stack ai. tanstack.com/ai/latest/docs/…
1
1
2
88
tony retweeted
We (well @rachelnabors) spent time evaluating different agent harness finish conditions across GPT-4o and Claude. The evals surfaced a sneaky failure mode: agents exiting after describing the next action instead of taking it. But in the transcript, those runs still looked plausible.
3
1
3
585
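The failure mode described above (a run that ends by announcing the next action rather than performing it) can be illustrated with a small check. This is a minimal sketch, not the evaluation the authors ran: the `Turn` shape, the `INTENT` phrase list, and the `ended_on_unexecuted_intent` helper are all hypothetical, and a phrase heuristic like this would miss many real cases.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    """One assistant turn: the text it produced and any tool calls it made (hypothetical shape)."""
    text: str
    tool_calls: List[str] = field(default_factory=list)


# Hypothetical phrases that announce a next step; a real eval would need a richer signal.
INTENT = re.compile(
    r"\b(i will|i'll|next,? i|let me|i am going to|i'm going to)\b",
    re.IGNORECASE,
)


def ended_on_unexecuted_intent(transcript: List[Turn]) -> bool:
    """Return True if the final turn describes an action but makes no tool call."""
    if not transcript:
        return False
    last = transcript[-1]
    return not last.tool_calls and bool(INTENT.search(last.text))
```

A run flagged this way can still read as plausible in the transcript, which is why a check on the final turn is separate from reading the text.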
tony retweeted
Three years ago we started OpenInference because there was no standard way to trace an LLM call, let alone an agent. OpenTelemetry was always the right home — same primitives as the rest of the stack, no parallel system bolted on for AI. At Google Cloud Next, Arize and Google Cloud put a public marker on the convergence: OpenInference OTel as the standard for agent observability. GCP telemetry flows directly into Arize. No translation layer. The arc that made traditional software debuggable, finally bending toward agents. youtube.com/watch?v=nLH0IqHL… @jason_lopatecki (Arize) · Ameer Abbas & Rami Shalom (Google Cloud)
1
4
7
363
tony retweeted
Apr 22
trees.software has landed! Sadly, no full IDE—yet? 🫣 ✅ Always virtualized ✅ Git status ✅ Context menus ✅ Drag and drop ✅ Search w/ options ✅ Shiki themes & CSS variables ✅ Density control ✅ Keyboard shortcuts ✅ Custom icons

Apr 21
👀
12
17
356
31,843
This is why Claude Code bothers me vs something like Opencode. Too much magic on top of LLM magic.
regarding agent memory, I'm realizing:
I never want anything loaded automatically. no loading yesterday's memories etc (AGENTS.md is different, not memory)
I don't want chronological memories. I want topical memories grouped based on what I'm doing
I want to be explicit about it generally, especially saving but also reading. some automatic reading based on me switching "into" a topic might make sense
most things I see don't really fit what I want
68
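The memory preferences in the quoted post (nothing loaded automatically, notes grouped by topic rather than by date, explicit save and read) can be sketched as a tiny store. This is an illustration only: `TopicalMemory` and its methods are hypothetical names and do not correspond to any tool mentioned in the posts.

```python
from collections import defaultdict
from typing import Dict, List


class TopicalMemory:
    """Notes grouped by topic; nothing is read unless the caller asks for a topic."""

    def __init__(self) -> None:
        # Starts empty on every run: no automatic loading of earlier sessions.
        self._notes: Dict[str, List[str]] = defaultdict(list)

    def save(self, topic: str, note: str) -> None:
        """Explicitly store a note under a topic."""
        self._notes[topic].append(note)

    def read(self, topic: str) -> List[str]:
        """Explicitly fetch the notes for one topic; unknown topics return an empty list."""
        return list(self._notes.get(topic, []))
```

Reads return a copy, so a caller cannot change stored notes without going through `save`.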
I have been hoping to see a bit more motion from otel on its js sdk offerings since ESM is now so prevalent across web and baseline LTS Node.js versions. Thinking I may have to roll up my sleeves on this soon
why is it so hard to set up OTEL in javascript apps? in dotnet I add one line and it works with dependencies and literally everything. in javascript you add 7 dependencies and 50 lines? it's so much friction idk, i must be doing something wrong?
2
71
tony retweeted
You've heard of @Cloudflare Durable Objects... but what are they? What do they enable you to build? What are the platform guarantees? What new patterns do they open up? Good news! I just released a completely free series on Durable Objects! databaseschool.com/series/du…
38
99
912
141,110
tony retweeted
My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow
1,335
8,197
64,989
24,003,285
tony retweeted
TLDR; GH actions, but for agents. ~0ms cache, retry-on-failure, insanely fast.
Agents need validation. CI is the last defense. They shouldn't bother you unless everything is green!
GH Actions is usually in the top-5 expenses for dev-teams. Add agents to that mix? It'll easily double. It's the wrong tool for the right job: Slow boot, slow cache, retrieving logs is token expensive for agents, the list goes on...
So I built a tool with one amazing feature: live-reload for failures.
Agent-CI is a local CI runner. I tweaked the control plane and mounts to provide 0ms caching, insanely fast boots. When a step fails it pauses, provides the agent with the failure, and waits for the agent to fix and retry just that step.
It uses the standard GH Actions image (via Docker), but emulates the control plane via a local HTTP server. You don't have to change any of your existing GH workflows.
Tighter loops. Greener builds. Less babysitting. (Demo below.)
16
19
142
15,245
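The pause-and-retry behavior described in the post above (when a step fails, hand the failure to the agent, then retry only that step) can be sketched in a few lines. This is not Agent-CI's implementation: `run_pipeline`, the `on_failure` callback, and the `max_retries` limit are hypothetical names chosen for illustration, and steps are plain Python callables rather than GH Actions steps.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_pipeline(
    steps: List[Step],
    on_failure: Callable[[str, Exception], None],
    max_retries: int = 3,
) -> List[str]:
    """Run steps in order; on failure, call on_failure and retry only the failed step."""
    log: List[str] = []
    for name, step in steps:
        attempts = 0
        while True:
            try:
                step()
                log.append(f"{name}: ok")
                break  # move on; earlier steps are never rerun
            except Exception as err:
                attempts += 1
                log.append(f"{name}: failed")
                if attempts > max_retries:
                    raise  # give up after the retry budget is spent
                # Pause point: the agent sees the failure and applies a fix.
                on_failure(name, err)
    return log
```

In this sketch a fix for a failing test step does not trigger the build step again, which is the loop-tightening idea the post describes.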