just a place to distill my thoughts | it's all just monopoly money | @arizeai & @arizephoenix

Joined October 2013
122 Photos and videos
Hardest worker in SF
Crunching hard at Claude Build Day (@cerebral_valley @ClaudeDevs 🔥) Model’s down… but the vibe is still UP!!! Hope Fable comes back to me soon @mochipomsky
1
1
51
Dat Ngo retweeted
Observe 2026. 1 day at San Francisco, Shack15. 700 AI engineers, researchers, founders, and builders. 6 new Arize AX products, live demos, and countless hallway conversations.
The future of AI is self-improving agents. This year's Observe focused on the infrastructure behind them: production traces, evals, fixes, and the feedback loops that turn every agent interaction into an opportunity for improvement.
Many thanks to our speakers from @Uber, AG2, @FactoryAI, @CrewAI, @Cursor_ai, @PromptQL, @OpenClaw, @AnthropicAI, @Oracle, @WorkOS, @Anyscale, @OpenAI, Parlance Labs, @Daytonaio, @Coinbase, @Salesforce, @LGUplus, @Tripadvisor, @BlackRock, @Upstart, @CVSHealth, @Mastra, DMV, @NousResearch, @Glean, @WellsFargo, @FoundationCap, @LiteLLM, our sponsors @AWS, @Microsoft, Swift Ventures, @CrewAIINC, @QualityKiosk_, BAND, and all attendees for making this our most memorable Observe yet.
Couldn't make it? Full talks, fireside chats, and interviews dropping soon. Stay tuned.
#Observe2026 #ArizeAI #Observability #AgentEvals #AIEngineering
3
10
432
while rooted in truths, anth's motivation is to control the shape of ai governance before someone else defines it
this is my takeaway here, and a larger valuation ofc
smart move
Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: darioamodei.com/post/policy-…
51
Dat Ngo retweeted
works 100% of the time
I just open sourced my "Is this slop?" simple test
1
1
58
Dat Ngo retweeted
The sandbox market has become so noisy that someone needs to define what a sandbox actually is.
When we started, nobody was talking about sandboxes. Now, 40-odd companies are calling themselves sandbox providers or claiming to be in the space. Most of them are not really sandboxes in any meaningful sense.
- some are Kubernetes pods with a different name
- some are stateless functions that can't pause or resume
- some are just EC2 with better marketing
The problem with undefined markets is that the noise drowns out the signal. Buyers can't tell what they actually need, and competitors who aren't really competing get lumped in with those who are.
Maybe it's time to get all the players in a room and clarify what a sandbox really encompasses.
21
5
61
13,887
Day Zero, happy hunting children! Don't spend all your money!
Anthropic's latest and greatest model, Fable, is now available in the prompt playground!
2
71
some people made fun of the ralph loop earlier this year for being too simple, a meme and dumb engineering
they missed the point
do work, verify it, feed the result back in, repeat
simple doesn't mean not useful
2
2
739
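The loop in the ralph loop post above (do work, verify it, feed the result back in, repeat) can be sketched roughly as follows. This is a generic illustration, not the original ralph loop implementation: `ralph_loop`, `do_work`, `verify`, and `max_iters` are hypothetical names, and the feedback prompt format is an assumption.

```python
from typing import Callable, Optional, Tuple


def ralph_loop(
    do_work: Callable[[str], str],
    verify: Callable[[str], Tuple[bool, str]],
    task: str,
    max_iters: int = 5,
) -> Optional[str]:
    """Do work, verify it, feed the result back in, repeat.

    do_work: hypothetical worker (e.g. an agent call) mapping a prompt to a result.
    verify:  hypothetical checker returning (passed, feedback) for a result.
    Returns the first result that passes, or None if the budget runs out.
    """
    prompt = task
    for _ in range(max_iters):
        result = do_work(prompt)
        passed, feedback = verify(result)
        if passed:
            return result
        # Feed the failed attempt and the verifier's feedback into the next try.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{result}"
            f"\n\nVerifier feedback:\n{feedback}"
        )
    return None
```

The iteration cap is there so a worker that never passes verification cannot loop forever.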
Going to be at @aiDotEngineer SF? Just come up the street to @awscloud startup loft on market st! Gonna be there with my homie @Shashikant86 to talk all things harnesses!
🧰 Harness Engineering is quickly becoming one of the hottest topics in Agentic AI. A few companies are just talking about Harness Engineering at conferences. However, some have already started building, evaluating, and optimising their own harnesses. We are covering the latter at this event in San Francisco 🌉 on 29th June at AWS Builder Loft.
💡 Harness Engineering: State of the Art in Agent Harnesses
We are covering Harness Engineering from a very different angle, including the hidden parts of Harness Engineering that you probably haven't heard about before.
🦾 We are covering:
🧪 Harness evaluation by @dat_attacked (@arizeai)
🎛️ HyperParallel Experimentation of Harnesses by @TweetAtAKK (@RapidFireAIHQ)
⏲️ Harness Optimization by Myeongsoo Kim (@awscloud, Kiro)
🥥 Fresh Data for Coding Agent by @LinghuaJ (@cocoindex_io)
✍️ We've already crossed 75 registrations, and seats are filling up quickly. If you're building AI agents and want to learn what production-grade Harness Engineering actually looks like, reserve your spot now.
👉 luma.com/rtd0f6ka
Hope to see you at the AWS Builders Loft in San Francisco 🌉
#HarnessEngineering #AgentEngineering #AgenticAI
1
1
3
305
realizing the pros and cons of being on either end of the spectrum of ship it and fix it - vs - product love and taste
there is this beautiful world where bugs are okay, but product slop and weak product judgement are never acceptable
it takes a special team to be here with both speed and taste, humility and arrogance in harmony
32
Dat Ngo retweeted
Can coding agents stay coherent over a 1 billion token budget? Can they build Slack from scratch? Rewrite a JAX codebase in PyTorch? Build a C compiler in Rust? Enter SWE-Marathon: a benchmark for autonomous long-horizon software work.
49
65
680
794,806
if i ever needed a resume, this isn't far from it
nice to see the math / classic ml on here, as some foundational parts
my biggest weakness is model training and deep understanding of transformer architectures
time to explore weaker areas of my understanding
As an AI Engineer. Please learn:
Harness engineering, not just prompt engineering
Context engineering, not just long prompts
Prompt caching vs. semantic caching tradeoffs
KV cache management, eviction, reuse, and memory pressure at scale
Prefill vs. decode latency and why they optimize differently
Continuous batching, paged attention, and throughput optimization
Speculative decoding vs. quantization vs. distillation tradeoffs
INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
Structured output failures, schema validation, repair loops, and fallback chains
Function calling reliability, tool contracts, argument validation, and idempotency
Agent guardrails, loop budgets, tool budgets, and termination conditions
Model routing, graceful fallback logic, and degraded-mode UX
RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
Retrieval evals: recall, precision, grounding, attribution, and citation quality
Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
Cost attribution per feature, workflow, tenant, and user journey not just per model
Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
Multi-tenant isolation, cache safety, and cross-user context contamination prevention
Fine-tuning vs. in-context learning vs. RAG vs. distillation and when each is the wrong tool
Latency, quality, cost, and reliability tradeoffs across the full inference stack
Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
Shipping LLM systems as reliable infrastructure, not demos wrapped around prompts
aiengineeringfromscratch.com
1
84
Every AI event I go to I love to see this girl! One of the homies for years. Thanks for coming out to @arizeai Observe Conf. @temporalio is lucky to have ya! @MelGoesTech @belizsoyak
1
2
6
292
Gahhh, gotta say hi to the home girl @MelGoesTech and congratulate her on her marriage!!
Replying to @MelGoesTech
Keynote started by co-founders of @arizeai feat @dat_attacked as emcee
1
2
89
using @claude design, and it had the audacity to say this
46
s/o to all my short kings before spawn, spent those attribution points on entrepreneurship rather than height 😂
before you invest in a startup always ask how tall the founders are
1
87
insert dj khaled meme
2
44