Arize AI

Tweets

tony retweeted

Arize AI

@arizeai

Jun 8

Our open source observability platform Arize Phoenix just crossed 10,000 stars on @github. ✨ That number belongs to the people who tested it, broke it, filed issues, opened PRs, asked better questions, and helped turn AI observability into an engineering workflow.

Wes Bos

tony retweeted

Wes Bos

@wesbos

Jun 3

the four horsemen of the apocalypse

TANSTACK

tony retweeted

TANSTACK

@tan_stack

May 25

TanStack Virtual now has first-class chat support: end anchoring, append-follow, stable prepends, and streaming messages that stay pinned when they should. The modern web is now a lot of streaming UI on top of lists, so this needed to feel boring 😉 tanstack.com/blog/tanstack-v…

Chat UIs Are Lists Until They Aren't | TanStack Blog

Chat, AI streams, and logs don't behave like ordinary lists. TanStack Virtual now supports end-anchored virtualization for prepend-stable history, append-follow, and streaming output that stays...

tanstack.com
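
The announcement names two behaviors, prepend-stable history and append-follow, without showing the scroll arithmetic behind them. A minimal sketch in TypeScript, assuming a plain pixel-offset model; this is not TanStack Virtual's API, and every name here (ScrollState, isAtEnd, afterPrepend, afterAppend) is hypothetical:

```typescript
// Hypothetical model of a scroll container; not a TanStack Virtual type.
interface ScrollState {
  scrollTop: number;      // current scroll offset in px
  viewportHeight: number; // visible height in px
  contentHeight: number;  // total height of the list in px
}

// True when the viewport is pinned to the end of the list (within a tolerance).
function isAtEnd(s: ScrollState, tolerance: number = 1): boolean {
  return s.contentHeight - (s.scrollTop + s.viewportHeight) <= tolerance;
}

// Prepend-stable: older items are added above, so shift the offset by the
// added height and the content on screen does not move.
function afterPrepend(s: ScrollState, addedHeight: number): ScrollState {
  return {
    scrollTop: s.scrollTop + addedHeight,
    viewportHeight: s.viewportHeight,
    contentHeight: s.contentHeight + addedHeight,
  };
}

// Append-follow: new items are added below. Follow them only if the viewer
// was already at the end; otherwise leave the offset alone.
function afterAppend(s: ScrollState, addedHeight: number): ScrollState {
  const contentHeight = s.contentHeight + addedHeight;
  const scrollTop = isAtEnd(s)
    ? Math.max(0, contentHeight - s.viewportHeight)
    : s.scrollTop;
  return { scrollTop, viewportHeight: s.viewportHeight, contentHeight };
}

const pinned: ScrollState = { scrollTop: 600, viewportHeight: 400, contentHeight: 1000 };
const scrolledUp: ScrollState = { scrollTop: 200, viewportHeight: 400, contentHeight: 1000 };

const followed = afterAppend(pinned, 50);     // viewer was at the end
const stayed = afterAppend(scrolledUp, 50);   // viewer was reading history
const stable = afterPrepend(scrolledUp, 300); // older messages loaded above
```

A streaming message that grows in place can be treated as repeated appends: each growth step calls afterAppend with the height delta, so a pinned viewer stays pinned and a viewer reading history is not dragged down.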

Joe Sadoski

tony retweeted

Joe Sadoski @joesadoski

May 21

I was talking to @ritakozlov about this a few weeks ago, and she stated my thoughts perfectly: the kids are all right. I think we (seniors) have been mapping our old ways of doing things onto AI too much. Juniors don’t have this baggage.

Matt Pocock

@mattpocockuk

May 20

Tactical vs Strategic Programming, and why I'm nervous for juniors:
Good programming involves a mix of tactical and strategic decision-making:
- Tactical: on the ground, short-term. The soldier doing the fighting.
- Strategic: high-view, long-term. The general planning the war.
You need to be a tactician to write good code. To choose the right syntax. To figure out the file structure. To figure out how best to test your changes.
But you need to be a strategist to build code that lasts. To design the architecture. To automate away problems. To think beyond today.
Agents have eaten the tactical part of programming. When you can pay below minimum wage for code, there's no point going into the trenches yourself.
But AI cannot code strategically. Agents need someone at the top of the pyramid to tell them what to do. They need oversight.
So, a developer's day-to-day job has become 100% strategy. Long-term thinking, all the time.
(maybe this is why I'm so tired all the time now)
If you identify as a tactical programmer - a code monkey - then you are out of luck. The job has changed.
Personally, I like it. I always preferred thinking strategically about code. If you asked me what my job was about, I'd say 'building apps', not 'writing code'.
But what makes me nervous is that we've pulled down the only bridge that brought juniors into the industry.
We used to train juniors like this:
1. Give them only tactical tasks
2. Let them build up their strategic experience slowly
Eventually, they are a good enough strategist that they are no longer a junior.
But what happens when all tactical code is written by AI? What is the point of a junior?
We obviously need juniors. We need new lifeblood coming into the industry. We need to leave paths open for extraordinary hires to enrich our companies.
But how do we train them? How do you train strategic thinking?
These are the questions I'm thinking about. I'd love to know your thoughts.

sunil pai

tony retweeted

sunil pai

@threepointone

May 20

I can’t take your opinion on taste in software seriously if you don’t watch movies, listen to music, read books, get brunch with friends, enjoy baths and long drives, kiss someone under the moonlight, eat a flaky croissant with a bitter coffee… this is not a joke post.

tony

@Cephalization

May 20

high performance web computing needs to be the standard; this is the quality bar

Pierre

@pierrecomputer

May 20

diffshub[dot]com Take any public diff from GitHub and virtualize it nearly instantly, no matter how large, with DiffsHub. Built to show off our brand new CodeView component. To try it out, replace `github` with `diffshub` in your address bar.

tony

@Cephalization

May 20

I am often asked "what's the best local model right now for my project idea" and I never know what to say because there are so many options and so many levers to pull. Going to just start sending this article as my response in the future

R 'Nearest' Nabors

@rachelnabors

May 20

x.com/i/article/205710760566…

arize-phoenix

tony retweeted

arize-phoenix

@ArizePhoenix

May 14

A comprehensive 2-hour evaluations workshop, for free! At AI Engineer: Europe, head of DevRel Laurie Voss gave this workshop that covers:
- What is an eval?
- Why are they important?
- How and why to manually examine the data
- Using built-in Phoenix evals
- Writing custom evals
youtube.com/watch?v=Xfl50508…

Ship Real Agents: Hands-On Evals for Agentic Applications — Laurie...

Most agents get tested by running a few queries and checking if it ...

youtube.com

sunil pai

tony retweeted

sunil pai

@threepointone

May 12

opening my laptop with a gun in one hand ready to end its life if it attempts to either install from or publish to npm today #securitytips #techtuesday

tony

@Cephalization

May 8

More OTel 🙌

TANSTACK

@tan_stack

May 8

Your AI agents are running blind in prod? One line of otelMiddleware() and every chat, iteration, and tool call lands in your OTel backend with full GenAI semconv attributes. Vendor-neutral. Optional peer dep. Already shipped on @tan_stack ai. tanstack.com/ai/latest/docs/…

Arize AI

tony retweeted

Arize AI

@arizeai

May 8

We (well @rachelnabors) spent time evaluating different agent harness finish conditions across GPT-4o and Claude. The evals surfaced a sneaky failure mode: agents exiting after describing the next action instead of taking it. But in the transcript, those runs still looked plausible.

Addy Osmani

tony retweeted

Addy Osmani

@addyosmani

May 6

x.com/i/article/205211986040…

arize-phoenix

tony retweeted

arize-phoenix

@ArizePhoenix

Apr 29

Three years ago we started OpenInference because there was no standard way to trace an LLM call, let alone an agent. OpenTelemetry was always the right home — same primitives as the rest of the stack, no parallel system bolted on for AI. At Google Cloud Next, Arize and Google Cloud put a public marker on the convergence: OpenInference OTel as the standard for agent observability. GCP telemetry flows directly into Arize. No translation layer. The arc that made traditional software debuggable, finally bending toward agents. youtube.com/watch?v=nLH0IqHL… @jason_lopatecki (Arize) · Ameer Abbas & Rami Shalom (Google Cloud)

Defining the standard: Google Cloud and Arize unify agent observabi...

As enterprises scale multi-agent systems, observing and evaluating ...

youtube.com

Mark Otto

tony retweeted

Mark Otto

@mdo

Apr 22

trees.software has landed! Sadly, no full IDE—yet? 🫣 ✅ Always virtualized ✅ Git status ✅ Context menus ✅ Drag and drop ✅ Search w/ options ✅ Shiki themes & CSS variables ✅ Density control ✅ Keyboard shortcuts ✅ Custom icons

Mark Otto

@mdo

Apr 21

👀

tony

@Cephalization

Apr 6

This is why Claude Code bothers me vs something like Opencode. Too much magic on top of LLM magic.

James Long

@jlongster

Apr 6

regarding agent memory, I'm realizing:
I never want anything loaded automatically. no loading yesterday's memories etc (AGENTS.md is different, not memory)
I don't want chronological memories. I want topical memories grouped based on what I'm doing
I want to be explicit about it generally, especially saving but also reading. some automatic reading based on me switching "into" a topic might make sense
most things I see don't really fit what I want

tony

@Cephalization

Apr 3

I have been hoping to see a bit more motion from otel on its JS SDK offerings since ESM is now so prevalent across web and baseline LTS Node.js versions. Thinking I may have to roll up my sleeves on this soon

Luke Parker

@LukeParkerDev

Apr 3

why is it so hard to set up OTEL in javascript apps?
in dotnet I add one line and it works with dependencies and literally everything
in javascript you add 7 dependencies and 50 lines?
it's so much friction
idk, i must be doing something wrong?

Aaron Francis

tony retweeted

Aaron Francis

@aarondfrancis

Apr 1

You've heard of @Cloudflare Durable Objects... but what are they? What do they enable you to build? What are the platform guarantees? What new patterns do they open up? Good news! I just released a completely free series on Durable Objects! databaseschool.com/series/du…

Cloudflare

tony retweeted

Cloudflare

@Cloudflare

Apr 1

Introducing EmDash — the spiritual successor to WordPress. cfl.re/3NPVfev

Introducing EmDash — the spiritual successor to WordPress that solves plugin security

Today we are launching the beta of EmDash, a full-stack serverless JavaScript CMS built on Astro 6.0. It combines the features of a traditional CMS with modern security, running plugins in sandboxed...

blog.cloudflare.com

Cheng Lou

tony retweeted

Cheng Lou

@_chenglou

Mar 28

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

Peter Pistorius

tony retweeted

Peter Pistorius

@appfactory

Mar 26

TLDR; GH actions, but for agents. ~0ms cache, retry-on-failure, insanely fast.
Agents need validation. CI is the last defense. They shouldn't bother you unless everything is green!
GH Actions is usually in the top-5 expenses for dev-teams. Add agents to that mix? It'll easily double.
It's the wrong tool for the right job: Slow boot, slow cache, retrieving logs is token expensive for agents, the list goes on...
So I built a tool with one amazing feature: live-reload for failures.
Agent-CI is a local CI runner. I tweaked the control plane and mounts to provide 0ms caching, insanely fast boots. When a step fails it pauses, provides the agent with the failure, and waits for the agent to fix and retry just that step.
It uses the standard GH Actions image (via Docker), but emulates the control plane via a local HTTP server. You don't have to change any of your existing GH workflows.
Tighter loops. Greener builds. Less babysitting.
(Demo below.)
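
The pause-and-retry behavior described in this post (hand the failure to the agent, then re-run only the failed step) can be sketched in TypeScript. This is an illustration under stated assumptions, not Agent-CI's code or its demo: Step, StepResult, OnFailure, runPipeline, and the maxAttempts cap are all hypothetical, and steps are synchronous to keep the example short.

```typescript
// Hypothetical types; none of these names come from Agent-CI.
type StepResult = { ok: true } | { ok: false; log: string };

interface Step {
  name: string;
  run: () => StepResult;
}

// Stands in for the agent: receives the failure log and returns true
// once it has attempted a fix and wants the step retried.
type OnFailure = (step: Step, log: string, attempt: number) => boolean;

interface RunSummary {
  passed: boolean;
  attempts: Record<string, number>; // how many times each step ran
}

function runPipeline(steps: Step[], onFailure: OnFailure, maxAttempts: number = 3): RunSummary {
  const attempts: Record<string, number> = {};
  for (const step of steps) {
    let attempt = 0;
    for (;;) {
      attempt += 1;
      attempts[step.name] = attempt;
      const result = step.run();
      if (result.ok) break; // continue to the next step; earlier steps are never re-run
      // "Pause": give the failure to the agent and wait for its decision.
      const retry = attempt < maxAttempts && onFailure(step, result.log, attempt);
      if (!retry) return { passed: false, attempts };
    }
  }
  return { passed: true, attempts };
}

// Usage: the lint step fails once, the stand-in agent fixes it, and only lint re-runs.
let lintFixed = false;
const summary = runPipeline(
  [
    { name: "install", run: () => ({ ok: true }) },
    { name: "lint", run: () => (lintFixed ? { ok: true } : { ok: false, log: "unused variable" }) },
    { name: "test", run: () => ({ ok: true }) },
  ],
  () => {
    lintFixed = true; // pretend the agent edited the code
    return true;
  },
);
```

The attempt cap is a design choice added for the sketch so a step that can never be fixed does not loop forever; the post does not say how Agent-CI bounds retries.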
