The AI engineering platform for teams shipping reliable AI agents and LLM applications. Also home to @ArizePhoenix.

Joined January 2020
Pinned Tweet
Observe 2026 is a wrap. Yesterday we shared what’s next for Arize AX and our vision for the AI factory for self-improving agents. The focus: helping teams turn production behavior into a repeatable loop for finding issues, investigating root cause, testing fixes, and improving agents.
Three AI labs shipped something called "memory" this week. Apple paid Google a billion a year for one version of it. None of them is what users mean by the word.
@jimbobbennett wrote a field map of the four kinds of memory shipping right now:
→ Retrieval, dressed up as memory
→ Compaction, sold as automated context management
→ Cross-session consolidation, closest to what people mean
→ A capability agents code for themselves in the filesystem
Four different things. Four different evaluation problems. If your team is building agents, knowing which bucket you're actually shipping is the first step to making it work.
Full piece: arize.com/blog/memory-is-sti…
London is having a moment, and we're showing up for it. Arize is sponsoring @londonmaxxing 003, a one-day hackathon at Ramen Space, Dalston, July 4th. Build something that makes London better to live in or build in. £1k prize pool credits. Apply: luma.com/maxxing-london
Observe 2026. One day at Shack15 in San Francisco. 700 AI engineers, researchers, founders, and builders. 6 new Arize AX products, live demos, and countless hallway conversations.
The future of AI is self-improving agents. This year's Observe focused on the infrastructure behind them: production traces, evals, fixes, and the feedback loops that turn every agent interaction into an opportunity for improvement.
Many thanks to our speakers from @Uber, AG2, @FactoryAI, @CrewAI, @Cursor_ai, @PromptQL, @OpenClaw, @AnthropicAI, @Oracle, @WorkOS, @Anyscale, @OpenAI, Parlance Labs, @Daytonaio, @Coinbase, @Salesforce, @LGUplus, @Tripadvisor, @BlackRock, @Upstart, @CVSHealth, @Mastra, DMV, @NousResearch, @Glean, @WellsFargo, @FoundationCap, @LiteLLM; to our sponsors @AWS, @Microsoft, Swift Ventures, @CrewAIINC, @QualityKiosk_, BAND; and to all attendees for making this our most memorable Observe yet.
Couldn't make it? Full talks, fireside chats, and interviews are dropping soon. Stay tuned.
#Observe2026 #ArizeAI #Observability #AgentEvals #AIEngineering
Our cofounder @aparnadhinak tested whether AI agents should use databases through filesystem abstractions. PostgresFS exposed docs as virtual files. A SQL skill queried Postgres, wrote results locally, and let the agent continue with Bash. Result: SQL skill 99/100. PostgresFS 93/100.
The SQL skill paid the database cost once. It queried for the relevant slice, wrote that data to a local file, and let the agent use normal shell tools from there. That gave the agent a writable, rereadable, composable workspace.
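The "pay the database cost once" pattern can be sketched roughly as follows. This is a minimal illustration, not the code from the experiment: it uses an in-memory SQLite table as a stand-in for Postgres, and the `docs` table, the `export_slice` helper, and the file name are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_slice(conn, query, params, out_path):
    """Run one query and write the result rows to a local CSV file."""
    cursor = conn.execute(query, params)
    headers = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)
    return len(rows)


# Hypothetical stand-in data: SQLite in memory instead of a real Postgres instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, topic TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        (1, "billing", "refund policy"),
        (2, "auth", "token rotation"),
        (3, "billing", "invoice format"),
    ],
)

# One query for the relevant slice, written into a scratch workspace directory.
workspace = Path(tempfile.mkdtemp())
slice_path = workspace / "billing_docs.csv"
row_count = export_slice(
    conn,
    "SELECT id, topic, body FROM docs WHERE topic = ? ORDER BY id",
    ("billing",),
    slice_path,
)
```

After the export, the agent can reread, filter, or join `billing_docs.csv` with ordinary shell tools without touching the database again.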
The lessons for developers building agent harnesses:
- Use the database for broad retrieval.
- Use local files for iterative analysis.
- Measure by question shape.
Watch for abstractions that feel familiar while quietly increasing maintenance cost.
Full experiment: arize.com/blog/postgresfs-vs…
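One way to read "measure by question shape" is to report accuracy per category of question instead of a single overall score. The sketch below assumes each eval result carries a `shape` label and a `passed` flag; both field names and the sample records are made up for illustration and are not data from the experiment.

```python
from collections import defaultdict


def accuracy_by_shape(results):
    """Group pass/fail results by question shape and return accuracy per shape."""
    totals = defaultdict(lambda: [0, 0])  # shape -> [passed, total]
    for item in results:
        bucket = totals[item["shape"]]
        bucket[0] += 1 if item["passed"] else 0
        bucket[1] += 1
    return {shape: passed / total for shape, (passed, total) in totals.items()}


# Made-up records for illustration only.
results = [
    {"shape": "lookup", "passed": True},
    {"shape": "lookup", "passed": True},
    {"shape": "aggregation", "passed": True},
    {"shape": "aggregation", "passed": False},
]
scores = accuracy_by_shape(results)
```

A per-shape breakdown like this can show where an approach is weak even when the overall score looks strong.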
Support escalations get expensive when engineers inherit a ticket with symptoms but no debugging context. At Arize, we rebuilt that handoff with AI agents that gather the evidence first: customer context, traces, logs, eval results, prior tickets, and relevant docs. That gave support engineers a better starting point and gave engineering cleaner escalations with actual hypotheses attached. The result: average resolution time dropped from 20 hours to 9 hours in a few months, and is now trending around 2.5 hours. We wrote up how we built the workflow and what changed. arize.com/blog/how-arize-bui…
Anthropic's latest and greatest model, Fable, is now available in the prompt playground!
A malicious VS Code extension sat in the marketplace for 18 minutes this May, long enough to hit ~6,000 machines and lift their npm, AWS, GitHub, and SSH credentials, plus the config files Claude Code keeps on disk.
Here's the catch: in your traces, a credential-harvesting tool call looks just like a normal one. Same tool, same span shape, only the file path differs. A normal session reads inside the project; a compromised one reaches outside it, into the home directory and credential files. We call those off-tree reads. That's the fingerprint.
Once you can see it, you can monitor for it: count the reads that land outside the project workspace, and alert the moment that count goes above zero. That is an afternoon of work if you already trace your agents.
Your agent's harness is now part of your supply chain. The signal that catches the next attack is already in your traces.
Nancy Chauhan wrote up how to detect credential theft in AI agent harness traces. arize.com/blog/how-to-detect…
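A rough sketch of the off-tree read check, under stated assumptions: spans are plain dicts with hypothetical `tool` and `path` keys, the tool name `read_file` is invented for the example, and relative paths are taken to be relative to the workspace. A real trace schema will differ.

```python
from pathlib import Path


def is_off_tree(read_path, workspace):
    """Return True when read_path resolves outside the project workspace."""
    root = Path(workspace).resolve()
    target = Path(read_path)
    if not target.is_absolute():
        # Assumption: relative paths are relative to the workspace root.
        target = root / target
    target = target.resolve()  # collapses ".." segments before comparing
    return target != root and root not in target.parents


def count_off_tree_reads(spans, workspace):
    """Count file-read spans whose path lands outside the workspace."""
    return sum(
        1
        for span in spans
        if span.get("tool") == "read_file" and is_off_tree(span["path"], workspace)
    )


# Hypothetical spans for illustration only.
workspace = "/srv/app/project"
spans = [
    {"tool": "read_file", "path": "src/main.py"},
    {"tool": "read_file", "path": "/srv/app/project/README.md"},
    {"tool": "read_file", "path": "/home/dev/.aws/credentials"},
    {"tool": "read_file", "path": "../../.ssh/id_rsa"},
    {"tool": "run_shell", "path": "/etc/hosts"},
]
off_tree_count = count_off_tree_reads(spans, workspace)
should_alert = off_tree_count > 0
```

Resolving paths before comparing matters here, because a read such as `../../.ssh/id_rsa` looks relative to the project but lands outside it.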
Arize AI retweeted
Phoenix just hit 10,000 GitHub stars! Three years ago, Phoenix didn't exist. Arize was a closed-source company. A small team was asked to change that. Catch the full interview with the team who made it happen and where AI observability is going next: arize.com/phoenix-10k
Congratulations to our cofounder @aparnadhinak on being named one of the Top 100 Women in AI. A lot of the hardest work in AI right now starts after the demo works: evals, observability, tracing, debugging, and figuring out how to make production systems improve over time. Aparna has been pushing the industry toward that reality for years. Well-deserved recognition. 100womeninai.com/aparna-dhin…
Our open source observability platform Arize Phoenix just crossed 10,000 stars on @github. ✨ That number belongs to the people who tested it, broke it, filed issues, opened PRs, asked better questions, and helped turn AI observability into an engineering workflow.
📖 Read the full story. The blog gets into how a small team built Phoenix “backwards”: features before infrastructure, support handled by maintainers, and roadmap signal coming directly from GitHub, Slack, and users. arize.com/blog/phoenix-10k/
Thank you to everyone who starred the repo, opened an issue, contributed code, challenged a design decision, or helped another engineer debug a production system. 10,000 stars on @github is a milestone that's possible because of you. But the real story is the community that helped define AI observability as the industry moved from notebooks to agents.