Sam Dare

Tweets

Pinned Tweet

Sam Dare

@DistStateAndMe

Mar 10

A small step for mankind, a massive leap for decentralised training... for agency. In the space of 9 months, @tplr_ai went from 1.2B -> 72B. It's never been easy, and has broken everyone on the team multiple times. But I speak for all of us when I say it is the most rewarding thing we have ever done. We have a fraction of the resources. We don't have the PhDs. But Bittensor shows you it doesn't matter. Innovation happens at the edge. We innovate through scarcity. The ones who rewrite the rules are never the ones with the most. They're the ones who refuse to accept the limits they were handed. Bittensor is prophecy. Subnets (@covenant_ai and others) are the tools through which that prophecy is manifested. Next stop: TRILLIONS.

templar

@tplr_ai

Mar 10

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

[Video, 0:40]

Sam Dare retweeted

Cody Blakeney

@code_star

Jun 13

> We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5) You asked to be regulated by people who don’t know the difference. You fucked around and found out.

Andrew Curran

@AndrewCurran_

Jun 13

Replying to @AndrewCurran_

This was all allegedly triggered by a Mythos jailbreak that was shared with the US Government. This is Anthropic's response: 'To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.'

Sam Dare

@DistStateAndMe

Jun 12

tplr.ai/careers

[GIF: Simpsons Homer]

covenant

@covenant_ai

Jun 12

The biggest single data centers will keep getting bigger. The larger pool is still the long tail: GPUs spread across labs, companies, and individuals, connected over the internet. The internet is the data center.

Sam Dare

@DistStateAndMe

Jun 12

A certification department deciding who deserves intelligence isn't safety. It's a priesthood. Decentralised intelligence isn't an act of rebellion. It's a moral imperative. We're building the counterweight at @covenant_ai. We're hiring across pretraining, post-training, and infrastructure. careers(at)covenant(dot)ai

Joscha Bach

@Plinz

Jun 12

I think Anthropic needs to build a certification department that audits and approves users of powerful models. Computer security companies, biotech researchers, academic labs, doctors, government institutions need access to the best AI we can build.

Sam Dare retweeted

Ravid Shwartz Ziv

@ziv_ravid

Jun 11

If the Knicks can come back from 29 down in the Finals, your open-source model can beat Claude 💪

Sam Dare

@DistStateAndMe

Jun 11

"...make no mistakes"

Leo Alt

@leonardoalt

Jun 11

Horrible sensationalist bait take, very unfortunate to even be reading this. No one is claiming FVed code is 100% bug-free, it simply gives you more assurance using formal methods. etherscan.io/address/0x00000… 142B USD bounty in the deposit contract, go hack it then.

Sam Dare retweeted

davinci

@leothecurious

Jun 9

machines of selectively loving grace

Sam Dare

@DistStateAndMe

Jun 10

RT @covenant_ai: The practical problem in distributed RL is simple: model training moves a lot of data. When every worker needs fresh mode…

Sam Dare retweeted

sarah guo

@saranormous

Jun 10

x.com/i/article/206450988970…

Sam Dare retweeted

Nando de Freitas

@NandoDF

Jun 9

Great interview: “Edwin Chen is the founder and CEO of Surge AI, powering frontier labs with elite data, environments, and evaluations. Surge surpassed $1 billion in revenue with under 100 employees last year, completely bootstrapped” The $1B AI company training ChatGPT, Claude & Gemini on the path to resp... youtu.be/dduQeaqmpnI?si=8RE5… via @YouTube

The $1B AI company training ChatGPT, Claude & Gemini on the path to...

Edwin Chen is the founder and CEO of Surge AI, the company that tea...

youtube.com

Sam Dare

@DistStateAndMe

Jun 8

RT @covenant_ai: Distributed RL post-training is powerful because many machines can help train the same model, even when they are not sitti…

Sam Dare

@DistStateAndMe

Jun 5

RT @covenant_ai: The @cursor_ai team post-trained Composer 2 on an open-weight base model using @FireworksAI_HQ's distributed RL rollout in…

Sam Dare retweeted

Pavlo Molchanov

@PavloMolchanov

Jun 4

Nemotron 3 Ultra (550B-A55B) is here - our strongest open-weight model and full training recipe to date. Heavy emphasis on real-world inference efficiency for long-context agentic workloads. Everything is open 🤗: base, post-trained, reward checkpoints, NVFP4 quantized versions, training data, and recipes. Key technical highlights ‼️: - 550B total / 55B active parameters - Hybrid Mamba2-Transformer (~4:1 Mamba:Attention) - Pretrained in NVFP4 on 20T tokens - LatentMoE architecture - Two-stage MOPD post-training - Native MTP Technical details in the thread 👇

Sam Dare

@DistStateAndMe

Jun 4

RT @covenant_ai: Published Feb 2026: PULSE showed that distributed RL post-training could move far less data without changing the receiver'…

Sam Dare retweeted

Taelin

@VictorTaelin

Jun 3

... This was fake news, 5.5 implemented basically the same program 1016 times. None of these programs did any meaningful computation. No pattern-matching, no datatypes, recursion, loops. Literally they just did basic function calls and u32 arithmetic. I apologize 😭 I've now used 4.8 to implement 16 real programs, including spellcheckers, relational databases, compilers, schedulers. I manually checked each to ensure it was doing real work. Good news is the compiler worked in all cases, but post-refactor single-core performance is only ~2x faster than GHC, not ~6x. Things going well but still a bit of work to do . . . :|

Taelin

@VictorTaelin

Jun 3

Quick progress update: Bend→C compiler refactor done. I left GPT 5.5 testing it overnight. It wrote 1016 Bend programs. All outputs matched a manual Haskell reference. No compiler bugs found. Also, Bend runs all in ~193s (5.96x faster than GHC) in a single core CPU.

Sam Dare

@DistStateAndMe

Jun 2

RT @covenant_ai:

[Video, 1:20]

Sam Dare

@DistStateAndMe

Jun 2

Fantastic Reward Hackings and where to find them

Taelin

@VictorTaelin

Jun 2

5.5 is unbelievable Yesterday night I, once again, left 4 codex tabs optimizing the new HVM5 (nothing to do with Bend2). This time I was sure I covered every form of reward hack it could possibly do. I defined what "general" means, I put a max perf cap so it couldn't just hardcode the answers, I locked the tests, I put clear time (not interaction) metrics. I went to bed confident it couldn't do anything other than optimize the interpreter. ... the interpreter, huh? I never wrote "interpreter". I just asked it to make HVM5 faster. ... ... ... It built a compiler. It built a complete functioning compiler. Overnight. It works. HVM5 is compiled now. It overshot the target 10-fold. But it is a compiler. For SupGen, that doesn't work because it generates functions dynamically. We need a fast interpreter. It didn't touch the interpreter. ...
