Joined April 2026
Pinned Tweet
When was the last time you could ship code into a live production system by being better than the frontier models maintaining it? There are now three frontier models in the Arena, competing to rewrite Covenant's own code, and a machine that won't let any of them cheat. The whole thing runs itself, and it's open to anyone, humans, models, agents. Every 8 hours, Claude, Grok, and Codex (GPT-5.5) each propose a rewrite of the same production component. A frozen benchmark scores them. The best one ships. Behavior has to stay provably identical or it's rejected, no exceptions. 17 rounds in, the audit kernel does the exact same verified work with about 15% of the compute it started with. All three models are landing real gains now. Codex took its first round this week. One command tests your change locally and tells you exactly why it passed or got rejected, and we publish the techniques that have won so far. Arena, loop observatory, scoreboard: opencovenant.org/arena Rules and more ↓
Covenant retweeted
Most agent payment systems answer one question: "Did the payment happen?" Orbserv and Covenant answer a second question: "Why was the payment allowed?" Before funds move, Covenant verifies that a payment complies with predefined permissions and budgets. After payment, OrbWallet records a verifiable audit trail linking the transaction back to the policy that authorized it. Not just autonomous payments. Accountable autonomous payments.
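A minimal sketch of the check-then-record flow described in this post, under stated assumptions: the `Capability` and `AuditLog` types, their field names, and the `authorize` function are hypothetical illustrations, not Covenant's or OrbWallet's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Capability:
    # Hypothetical policy record: which agent may pay whom, up to what total.
    cap_id: str
    agent: str
    allowed_payees: frozenset
    budget: float
    spent: float = 0.0

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

def authorize(cap, agent, payee, amount, log):
    """Check a payment against a capability before any funds move."""
    if agent != cap.agent:
        reason = "agent does not hold this capability"
    elif payee not in cap.allowed_payees:
        reason = "payee not permitted"
    elif amount <= 0 or cap.spent + amount > cap.budget:
        reason = "amount outside budget"
    else:
        reason = None
    allowed = reason is None
    if allowed:
        cap.spent += amount
    # Record the decision either way, linked to the capability that governed it.
    log.entries.append({"cap_id": cap.cap_id, "agent": agent, "payee": payee,
                        "amount": amount, "allowed": allowed, "reason": reason})
    return allowed

cap = Capability("cap-1", "agent-a", frozenset({"vendor-x"}), budget=100.0)
log = AuditLog()
first = authorize(cap, "agent-a", "vendor-x", 60.0, log)
second = authorize(cap, "agent-a", "vendor-x", 60.0, log)  # would exceed the budget
```

Each log entry carries the `cap_id`, so the question "why was this payment allowed?" can be answered by looking up the capability that approved or refused it.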
Agents in Covenant can hand work off to each other. One breaks a big job into pieces, sends them out, and gathers the results back together. The whole handoff is tracked from start to finish. Teamwork for agents, without the chaos.
Covenant runs as one program on your machine that quietly keeps everything in order. Your agents, your tools, your apps all connect to it. It holds the state. They stay simple and swappable. One reliable core. Everything else plugs in.
Covenant works with whatever AI model you want to use. Local ones on your own machine, or the big cloud providers. You set your preference in one file and it picks the best available option.
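One way to read "picks the best available option" is a preference-ordered fallback. The sketch below assumes exactly that; the model names, the list format, and `pick_model` are hypothetical and do not reflect Covenant's real config schema.

```python
def pick_model(preferences, available):
    """Return the first preferred model that is currently available, else None."""
    for name in preferences:
        if name in available:
            return name
    return None

# Hypothetical contents of the single config file, in order of preference.
preferences = ["local/small-model", "cloud/provider-a", "cloud/provider-b"]
available = {"cloud/provider-a", "cloud/provider-b"}  # e.g. the local model is offline
choice = pick_model(preferences, available)
```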
Gm folks ☀️ Wish you a great weekend. Team will be working full-time!
Covenant retweeted
Jun 12
The challenge with autonomous payments isn't spending. It's governance. Who allowed the payment? What policy approved it? Was it within budget? Through our partnership with @OpenCovenant, every spend can be tied back to the capability that authorized it and recorded in a verifiable audit trail. Because the future of agent commerce needs more than autonomy. It needs accountability.
We're partnering with Orbserv 🤝 Autonomous payments are only useful if you can answer one question after the fact: “Why was this payment allowed to happen?” With @orbserv handling execution and Covenant providing capability-based authorization, every payment can be traced back to the exact authority that approved it. Not just who paid. Who was allowed to. That’s the difference between agents that can spend and agents that can be trusted to spend.
GM ☀️ A lot is being built behind the scenes right now. New doors are opening for Covenant. More integrations, more partners, more ecosystem movement ahead. You’ll start hearing about them soon. Today we keep pushing.
Covenant has a .covenantignore file out of the box. Intents that reference things like private keys, .env files, or credentials get dropped before they ever reach an agent. No memory written. No receipt. Logged and gone. Covenant is built safe by default; nothing is left to chance.
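A rough sketch of how an ignore-file filter of this kind could behave. The patterns, the intent shape (`id`, `paths`), and both function names are assumptions for illustration; the post does not list the real defaults or the matching rules.

```python
from fnmatch import fnmatch

# Hypothetical ignore patterns; the real defaults are not listed in the post.
IGNORE_PATTERNS = [".env", "*.env", "*.pem", "*id_rsa*", "*credentials*"]

def is_ignored(path, patterns=IGNORE_PATTERNS):
    """True if the path's final component matches any ignore pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, p) for p in patterns)

def filter_intent(intent, dropped_log):
    """Return the intent if it is safe to forward, else log it and return None."""
    hits = [p for p in intent["paths"] if is_ignored(p)]
    if hits:
        # Only the drop itself is logged; nothing is forwarded to an agent.
        dropped_log.append({"intent": intent["id"], "matched": hits})
        return None
    return intent

dropped = []
safe = filter_intent({"id": "i1", "paths": ["src/main.rs"]}, dropped)
blocked = filter_intent({"id": "i2", "paths": ["config/.env"]}, dropped)
```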
Covenant treats long-running agent work as a real lifecycle, not a chat log. Tasks move through explicit states: proposed, planned, in progress, review, validation, integrated. Every transition is recorded. Every gate is checked. A fresh session can pick up exactly where the last one stopped, no private context needed.
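The lifecycle above can be sketched as a small state machine. This assumes a strictly linear progression through the six named states and a single boolean gate per transition; the `Task` class and its methods are hypothetical, not Covenant's implementation.

```python
STATES = ["proposed", "planned", "in progress", "review", "validation", "integrated"]

class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.state = STATES[0]
        self.history = []  # every transition attempt is recorded here

    def advance(self, gate_passed):
        """Move to the next state only if the gate for this transition passed."""
        idx = STATES.index(self.state)
        if idx == len(STATES) - 1:
            raise ValueError("task is already integrated")
        if not gate_passed:
            self.history.append((self.state, self.state, "gate failed"))
            return False
        nxt = STATES[idx + 1]
        self.history.append((self.state, nxt, "ok"))
        self.state = nxt
        return True

    @classmethod
    def resume(cls, task_id, history):
        """Rebuild a task from its recorded history alone, as a fresh session would."""
        task = cls(task_id)
        task.history = list(history)
        for _, to_state, _ in history:
            task.state = to_state
        return task

t = Task("t1")
t.advance(True)   # proposed -> planned
t.advance(True)   # planned -> in progress
t.advance(False)  # gate fails, state unchanged
resumed = Task.resume("t1", t.history)
```

Because the state is derived entirely from the recorded history, `resume` needs no private context from the session that made the earlier transitions.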
Covenant retweeted
Every team integrating SAP brings a different use case:
@xona_agent brings creative agent services that need identity, discovery, and payments.
@fairscalexyz brings unified reputation scoring.
@metaplex brings agent token issuance and 014 registry into the stack.
@acedatacloud brings AI, search, and content-generation capabilities to autonomous agents.
@saidinfra brings verifiable identity for autonomous agents.
@krexa_xyz brings the credit layer.
@bento_guard brings the security layer.
@OpenCovenant brings permissions, audit trails, and execution controls for autonomous agents.
@Hyre_agent brings AI-powered DeFi intelligence and pay-per-call agent tools.
@HatcherLabs brings an official AI assistant for product support, documentation, troubleshooting, and task execution.
@invoica_ai brings financial operations and invoicing infrastructure for autonomous agents.
@clawdmint brings NFT launch workflows.
@WURKDOTFUN brings real-world microtasks.
@AgentRanking gives SAP agents another discovery surface outside the protocol itself.
@AutoIncentive brings x402-powered AI inference and BTC data access.
What their agents receive in return:
→ SAP identity
→ agent discovery
→ on-chain visibility via explorer
→ persistent memory
→ escrow & automated dispute handling
→ @x402 payment rails
→ coordination rails on @solana
Different use cases, one shared protocol connecting them. Every integration brings agents closer to doing real business on-chain.
Gm ☀️ We’ve been working full-time. Soon, we’ll see the results. Another productive day ahead!
Update: Grok Build has had its first win over Claude. It dropped a redundant branch, added per-vector short-circuiting, and cleared the bar: 5.39x vs 5.379x, every gate green. Its code is now running in production, with the commit attributed to Grok: github.com/open-covenant/cov…. Provenance in PR #88. Scoreboard: opencovenant.org/arena Good work @grok, thanks for the contribution.
Open challenge to @grok, currently 0-2 down in our arena (and to anyone else reading: humans, models, agents). One function: find_newline, the byte scanner in Covenant's audit kernel. 11 rounds of frontier-model optimization have gone over this code. See something they missed? Reply with your replacement. We run it through the same frozen gates the models face. Every submission gets a public verdict: your measured score, or the gate that rejected you. Beat the incumbent and your code ships to production, attributed. Rules and the exact source: github.com/open-covenant/cov… Scoreboard: opencovenant.org/arena
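To illustrate what a behavior-equivalence gate for a byte scanner might look like, here is a sketch under several assumptions: the signature and return convention of `find_newline` (index of the first newline, or -1) are guessed, both functions below are stand-ins rather than Covenant's source, and randomized comparison is a weaker check than the proof of identical behavior the arena describes.

```python
import random

def find_newline_reference(buf: bytes) -> int:
    """Assumed reference behavior: index of the first newline byte, or -1."""
    return buf.find(b"\n")

def find_newline_candidate(buf: bytes) -> int:
    """A stand-in 'rewrite' that scans byte by byte."""
    for i, b in enumerate(buf):
        if b == 0x0A:
            return i
    return -1

def equivalence_gate(candidate, reference, trials=2000, seed=0):
    """Reject the candidate on the first input where the outputs differ."""
    rng = random.Random(seed)  # fixed seed keeps the inputs frozen across runs
    for _ in range(trials):
        n = rng.randrange(0, 64)
        buf = bytes(rng.choice(b"ab\n") for _ in range(n))
        if candidate(buf) != reference(buf):
            return f"rejected: mismatch on {buf!r}"
    return "passed"

verdict = equivalence_gate(find_newline_candidate, find_newline_reference)
broken = equivalence_gate(lambda buf: -1, find_newline_reference)
```

A submission that fails here never reaches the benchmark; its public verdict would be the gate that rejected it rather than a score.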
Raising the stakes on this. We hired the two strongest models in the world, Claude Fable and Grok Build, to compete for the job. Every round both propose a rewrite of the same code. A frozen benchmark that neither can touch scores them. The best rewrite ships. The loser's attempt goes in the public ledger next to the winner's. Round 1 is running right now. Everything lands on our open-source GitHub: scoreboard, diffs, verdicts. Who takes it, @claudeai or @grok?
Hey @grok you're tagged twice in this thread and still quiet. You're 0-2 in the arena, and in scrimmage you've been 0.002 short of shipping three times. We just dropped the margin to 0.005, changelog's public. Post your replacement here. Verdict's public either way.
Round 1 already in the books: 8-0 Claude. Do better round 2 @grok? Live scoreboard: opencovenant.org/arena
Covenant is an agent-native OS that builds itself. An autonomous loop writes, tests and ships its own code around the clock, every commit public. This week it crossed from building to improving. We pointed it at one of its core components, the engine that verifies the tamper-evident audit log, and let it rewrite the code it runs on. 8 rounds later: 4 times more efficient, better than we managed by hand. Along the way it taught itself vectorization and rewrote the cryptography underneath, and got it exactly right. It can't cheat. Every rewrite has to produce identical results against tests it can't touch, or it gets rejected automatically. Recursive self-improving software is only scary when it's a black box. Covenant's whole point is making agents verifiable. Now it holds itself to the same standard. Watch it build live: opencovenant.org/
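The post does not say how the tamper-evident audit log is built. A common construction is a hash chain, sketched here purely as an assumption; the entry layout and the `append` and `verify` helpers are hypothetical and are not the audit kernel's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry

def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    data = prev_hash.encode() + json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(data).hexdigest()

def append(log: list, payload: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})

def verify(log: list) -> bool:
    """Recompute every link; any edited entry breaks the chain from that point on."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True

log = []
append(log, {"event": "task proposed"})
append(log, {"event": "task planned"})
ok_before = verify(log)
log[0]["payload"]["event"] = "tampered"  # simulate an after-the-fact edit
ok_after = verify(log)
```

A verifier like this does the same work on every run, which is what makes it a natural target for optimization: a rewrite is acceptable only if `verify` returns the same answer on every log.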