New paper on Sims (2026)
A multi-context memory architecture for precision-aware personalization in language models.
We argue that better personalization needs structured context, not just longer memory.
Paper: chiragshah.org/papers/Sims_f… #LLM #AI #Personalization #MemorySystems
Something from last year's SIGIR that seems to resonate more with #AgenticAI as we go beyond #GenerativeAI:
Shah & White, From To-Do to Ta-Da: Transforming Task-Focused IR with Generative AI.
dl.acm.org/doi/pdf/10.1145/3…
Introducing SimBench (2026):
A benchmark for preference-conditioned agentic planning.
Designed to evaluate whether agents can produce correct plans for the same task under different user preferences and constraints.
Repo: github.com/VersarAI/SimBench #Benchmark #AgenticAI #Evaluation
New paper alert: SimGuide (2026)
Procedurally grounded multi-context representations for personalized agent planning.
This work focuses on making agent decisions more accurate across competing goals and constraints.
Paper: chiragshah.org/papers/SimGui… #AI #AgenticAI #Personalization
Flat user profiles are a shortcut that breaks at scale.
SimBench tests agents on users with layered, conflicting preferences — work vs. health vs. family — and scores whether the agent resolves those conflicts correctly.
Open benchmark. Runs today.
github.com/versarai/simbench
Still relying on context and memory to achieve personalization in your AI systems? Find out why that's neither effective nor efficient, and get solutions for privacy-driven personalization using Sims! Read more in our Velocity newsletter: versarai.beehiiv.com/p/versa…
We just shipped Velocity #001 — our take on why context windows and memory aren't enough for real personalization, plus SimBench 1.0.
Subscribe: versarai.beehiiv.com
AI agents don't fail because they can't plan. They fail because they plan for nobody in particular. SimBench measures whether your agent can tell Jordan from Rafael and build a different plan for each. github.com/versarai/simbench
Most AI agents treat everyone the same. Same plan, same steps, same assumptions.
SimBench tests whether that's changing. Open benchmark for preference-conditioned agentic planning: 47 tasks, 9 users, 4 domains. MIT licensed. github.com/versarai/simbench