Developing a world-class agentic AI system called Metis OS that empowers decision support and mission planning in critical environments

Joined November 2012
Everyone's building AI agents. Almost nobody is building memory. Over the last year we've open-sourced three protocols that tackle different layers of the problem:
• Rosetta → Codebase Memory
• Railroad Memory → Agent Memory
• ContextSync → Organizational Memory
Together they form what we're calling the Open Memory Stack for AI Agents. Rosetta helps agents understand codebases. Railroad helps agents remember work across long-running tasks. ContextSync helps agents stay synchronized with versioned, permissioned organizational knowledge.
Christian Johnson retweeted
This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger-happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
Replying to @claudeai
Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.
Two weeks ago I captured my front porch as a 3D Gaussian splat. Today I asked it what would get damaged if it rained. That's the difference between a 3D model and a world model. Most 3D captures are things you look at. We turned one into something you can talk to.
Now I can ask:
• What's on the porch?
• Where's the orange chair?
• If it rains tonight, what gets wet?
• What changed since the last capture?
And the system reasons over the scene itself, not just the pixels.
A porch today. A city block captured by drone tomorrow. The future of AI isn't just understanding documents. It's understanding the physical world. #WorldModel #SpatialAI #GaussianSplatting #DigitalTwin #GeospatialAI #MetisOS
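As a rough illustration of what "reasoning over the scene, not just the pixels" could look like, here is a minimal sketch that treats the capture as a list of labeled objects with attributes. The `SceneObject` fields, `damaged_if_rain`, and `changed_since` are all hypothetical names invented for this example; the thread does not say how the actual system represents a scene.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SceneObject:
    name: str
    under_cover: bool      # hypothetical attribute: is the object beneath a roof?
    water_sensitive: bool  # hypothetical attribute: would rain damage it?


def damaged_if_rain(scene: List[SceneObject]) -> List[str]:
    """Names of objects that are exposed to rain and would be damaged by it."""
    return [o.name for o in scene if o.water_sensitive and not o.under_cover]


def changed_since(previous: List[SceneObject], current: List[SceneObject]) -> Dict[str, List[str]]:
    """Object names that appeared or disappeared between two captures."""
    before = {o.name for o in previous}
    after = {o.name for o in current}
    return {"added": sorted(after - before), "removed": sorted(before - after)}


# Toy data, invented for the example.
last_capture = [
    SceneObject("orange chair", under_cover=True, water_sensitive=True),
    SceneObject("planter", under_cover=False, water_sensitive=False),
]
porch = last_capture + [
    SceneObject("cardboard box", under_cover=False, water_sensitive=True),
]

print(damaged_if_rain(porch))              # ['cardboard box']
print(changed_since(last_capture, porch))  # {'added': ['cardboard box'], 'removed': []}
```

The point of the sketch is only that once a capture carries object-level structure, questions like "what gets wet?" become simple queries rather than image analysis.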
I believe this is the last moment we have to ask ourselves if an AI World is what we really want. #AI
MetisOS Digital Operators do full end-to-end tasks grounded in your organizational or personal data. OpenClaw, Hermes Agent, and Manus are not even close to what Digital Operators can do. Get your Digital Operator at metisos.co #digitaloperators #metisOS
Finding from a controlled experiment that should concern everyone shipping AI agents: The most dangerous configuration is a capable model acting without predicting the consequences of its actions first. A 70B model without foresight scored -59%. Actively harmful.
A 2B model without foresight scored -19%. Its incompetence protected it. Both converged to 55% (the theoretical max) once given a world model. Foresight dominates capability. The world model is the great equalizer.
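The idea of predicting consequences before acting can be sketched in a few lines. This is a toy illustration, not the experiment described above: the state, the actions, and the scoring function are all invented, and the "world model" here is simply the assumption that each action can be simulated exactly on a copy of the state.

```python
from typing import Callable, Dict

# Hypothetical toy setup: a state is a dict of named quantities,
# and an action is a function that returns the next state.
State = Dict[str, float]
Action = Callable[[State], State]


def choose_with_foresight(
    state: State,
    actions: Dict[str, Action],
    score: Callable[[State], float],
) -> str:
    """Simulate every action on a copy of the state and pick the best predicted outcome."""
    predicted = {name: score(act(dict(state))) for name, act in actions.items()}
    return max(predicted, key=predicted.get)


def choose_without_foresight(actions: Dict[str, Action]) -> str:
    """Baseline: take the first available action without predicting anything."""
    return next(iter(actions))


state = {"budget": 100.0, "risk": 0.0}
actions = {
    "deploy_now": lambda s: {**s, "budget": s["budget"] - 10, "risk": s["risk"] + 50},
    "run_tests_first": lambda s: {**s, "budget": s["budget"] - 20, "risk": s["risk"] + 5},
}


def score(s: State) -> float:
    # Invented objective: keep budget high and risk low.
    return s["budget"] - 2 * s["risk"]


print(choose_without_foresight(actions))              # deploy_now
print(choose_with_foresight(state, actions, score))   # run_tests_first
```

In this toy case the foresight-free agent picks the action that scores -10, while the simulating agent picks the one that scores 70. A real world model would be learned and imperfect, which this sketch does not attempt to capture.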
Christian Johnson retweeted
Replying to @karpathy
weirdest TL
Christian Johnson retweeted
Replying to @RealCalacanis
@RealCalacanis Built Doublecheck for the $5K bounty. Real-time fact-checking for TWiST. Paste a YouTube URL, it listens live, detects claims, and pulls verdicts with sources in seconds. It actually sees the video too, not just the audio. Open source: doublecheck.metisos.co
Every organization has the same problem: nobody knows which version of a document informed which decision. Agents didn't create that problem. They just made it impossible to ignore. Introducing Context Vault: one permissioned, provenance-tracked source of truth for humans and AI
Christian Johnson retweeted
In the 119th session of #MultimodalWeekly, we feature 4 exciting projects covering applications in broadcasting, law enforcement, education, and media. ✅ Gene Pao and Dave Euson will present Determining Sponsorship Media Value, which uses TwelveLabs and Qibb to detect and calculate sponsorship value in sports broadcasting content. ✅ @chrisjohnsonpr will present Helion, a multi-source intelligence fusion system for video-evidence-heavy investigations. ✅ Miodrag Tasic will present Swiftnotes; Teach - a productivity workflow that uses TwelveLabs Pegasus 1.5 through API for task intent understanding. ✅ Rish A will present @cutsio, which turns scattered footage into a single library that can search, organize, chat, and share. Register for the webinar here: mailchi.mp/twelvelabs/multim… ⬅️
What if your AI systems could not only understand every part of your organization's context, but also ensure that no matter who is interacting with a file, whether it's a human or a Digital Operator team member, it is permissioned, versioned, and logged with full provenance?
That's exactly what we're building. Metis Analytics is launching Context Vault, the context layer for the agent-native enterprise, built on the open source Context-Sync Protocol. One source of truth for the documents your team writes and the context your agents read.
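To make "permissioned, versioned, and logged with full provenance" concrete, here is a minimal in-memory sketch. It is not the Context-Sync Protocol or the Context Vault API, neither of which is described in this thread; `VaultDocument` and its fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class VaultDocument:
    readers: Set[str]
    writers: Set[str]
    versions: List[str] = field(default_factory=list)
    # Provenance log of (actor, action, version number) entries.
    log: List[Tuple[str, str, int]] = field(default_factory=list)

    def write(self, actor: str, text: str) -> int:
        """Append a new version if the actor may write; return its version number."""
        if actor not in self.writers:
            raise PermissionError(f"{actor} may not write")
        self.versions.append(text)
        version = len(self.versions)
        self.log.append((actor, "write", version))
        return version

    def read(self, actor: str) -> str:
        """Return the latest version if the actor may read, and record which version was seen."""
        if actor not in self.readers:
            raise PermissionError(f"{actor} may not read")
        if not self.versions:
            raise LookupError("document has no versions yet")
        self.log.append((actor, "read", len(self.versions)))
        return self.versions[-1]


# Toy usage with invented actor names: one human writer, one agent reader.
doc = VaultDocument(readers={"alice", "agent-7"}, writers={"alice"})
doc.write("alice", "Q3 pricing policy, draft 1")
doc.read("agent-7")
doc.write("alice", "Q3 pricing policy, draft 2")
print(doc.log)
# [('alice', 'write', 1), ('agent-7', 'read', 1), ('alice', 'write', 2)]
```

Because every read records the version it saw, the log can answer which version of a document an agent or person was working from at a given point, at least in this simplified form.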