I had the privilege of MC'ing at the ODSC AI East Conference this week. Didn't expect it to be this much fun.
A lot of fabulous sessions, but a few key themes kept emerging:
1. Teams and organizations are still figuring out what "evaluations" even mean in this new world.
2. No one has a clue what human-AI collaboration could actually look like.
3. Teams are having trouble figuring out what AI agents are doing to their code. (Where have I heard that one before.)
A few specific talks worth calling out in these areas:
- Julie Yaunches (@nvidia) on verification-driven agentic workflows. Interesting nugget: looks like NVIDIA is leaning on architecture diagrams to verify what's changing in their own systems. (🎶 to my ears.) She demoed live diagrams of @NemoClaww built with a Claude Code skill. We've been tackling the same problem from a different angle at JigsawML; link to our live NemoClaw architecture page in the comments.
- Stephanie Kirmer (Nebulock): "'It seems fine' is not a business strategy." Practical walkthrough of using rubrics to keep LLM judges from being arbitrary.
- André Balleyguier (@AnthropicAI) on how to actually scale agentic AI in 2026
- Susan Shu Chang (@elastic): Grounded reminder that eval isn't a new problem, it's continuous with classic attribution work, and the dataset just needs to be good enough compared to human output.
Also caught some amazing speakers across the day, including Usama Fayyad and Sadie St Lawrence.
Thanks to the speakers, @alinadovbysh, and the Open Data Science Conference (@_odsc) team for putting together a room full of practitioners sharing real lessons.
#ODSCEast #AI