catid

Pinned Tweet

catid

@MrCatid

Jun 15

Implemented teamcodex, a port of teamclaude to codex. Just saw it seamlessly switch between two Pro plans when the first one ran out of tokens for the week: github.com/catid/teamcodex

GitHub - catid/teamcodex: Port of teamclaude to codex

Port of teamclaude to codex. Contribute to catid/teamcodex development by creating an account on GitHub.

catid

@MrCatid

The really interesting thing to me about this new SPS idea is that top@k can be done per token with a shared KV cache by injecting random noise at the input of the prediction latent stream. All the matrix operations can be batched, with no additional KV cache or bandwidth needed.

catid

@MrCatid

You get rid of the temperature parameter at inference and instead just choose how much noise to inject and the batch size. It allows models to scale better with FLOPs instead of being limited by memory bandwidth. Exciting to me.
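
A toy sketch of the idea of swapping temperature for injected noise, under loud assumptions: this is not the SPS method itself, and the names `sample_with_latent_noise`, `noise_scale`, and the tiny `unembed` matrix are hypothetical. It perturbs one shared latent with Gaussian noise several times and takes the argmax token each time, so diversity is controlled only by the noise scale and the batch size.

```python
import random


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sample_with_latent_noise(latent, unembed, noise_scale, batch_size, seed=0):
    """Draw batch_size token candidates from ONE shared latent.

    Each candidate adds Gaussian noise to the same latent and takes the
    argmax token, so no temperature parameter is involved. (Hypothetical
    helper, not the method from the thread.)
    """
    rng = random.Random(seed)
    tokens = []
    for _ in range(batch_size):
        noisy = [x + rng.gauss(0.0, noise_scale) for x in latent]
        logits = matvec(unembed, noisy)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens


# Toy setup: 3-token vocabulary, 2-dimensional latent (all values made up).
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
latent = [0.6, 0.5]

# Zero noise reduces to greedy decoding; more noise gives more varied picks.
greedy = sample_with_latent_noise(latent, unembed, noise_scale=0.0, batch_size=4)
diverse = sample_with_latent_noise(latent, unembed, noise_scale=1.0, batch_size=8)
```

In a real model the loop would be a single batched matrix multiply over all noisy copies, which is where the FLOPs-versus-bandwidth argument comes from.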

catid

@MrCatid

Normal top@k requires a different KV cache for each branch the LLM takes, so memory explodes.
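
A rough back-of-the-envelope for the memory claim, assuming the worst case where every branch keeps a full private copy of the cache (no prefix sharing). The model shape below is made up for illustration, and `kv_cache_bytes` is a hypothetical helper.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes for one sequence's KV cache.

    The leading factor of 2 covers keys and values; bytes_per_value=2
    assumes 16-bit storage.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative shape only; not taken from any specific model.
cfg = dict(layers=32, kv_heads=8, head_dim=128, seq_len=8192)

shared = kv_cache_bytes(**cfg)        # one cache shared by all candidates
branched = 8 * kv_cache_bytes(**cfg)  # 8 branches, each with its own cache

print(f"shared:   {shared / 2**30:.1f} GiB")
print(f"branched: {branched / 2**30:.1f} GiB")
```

Under these assumptions the cache grows linearly with the number of branches, while the shared-cache approach stays at the cost of a single sequence.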

catid

@MrCatid

Enjoyed this video featuring @yoavartzi youtube.com/watch?v=hRENteFR…

World Modeling: Evaluation and State Computation

Yoav Artzi (Cornell University) https://simons.berkeley.edu/talks/y...

catid

@MrCatid

Hm, does this unify AR and diffusion models?

catid

@MrCatid

The answer appears to be yes, and there are also a lot of different options for how to do it.

catid

@MrCatid

Adds a 3-layer GELU MLP to predict the residual latents between each token in each SGD batch. The loss function is L1 between latents, plus KL on the predicted token distribution.

Jayden Teoh

@jayden_teoh_

Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀

ALT illustration of next-latent prediction vs. other predictive mechanisms
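
A minimal sketch of the summary above (a 3-layer GELU MLP predicting the residual to the next latent, trained with L1 on latents plus KL on token distributions). This is a reading of the tweet, not the paper's implementation: the KL direction, the `kl_weight` term, the bias-free `unembed` projection, and all function names are assumptions.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(weights, bias, vec):
    """Affine map: weights @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def mlp3(params, latent):
    """Three linear layers with GELU after the first two; returns a residual."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h = [gelu(v) for v in linear(w1, b1, latent)]
    h = [gelu(v) for v in linear(w2, b2, h)]
    return linear(w3, b3, h)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l1(a, b):
    """Mean absolute difference between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def kl(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def next_latent_loss(params, unembed, latent_t, latent_next, kl_weight=1.0):
    """L1 on latents plus KL on the token distributions they induce."""
    residual = mlp3(params, latent_t)
    predicted = [h + r for h, r in zip(latent_t, residual)]
    no_bias = [0.0] * len(unembed)
    target_probs = softmax(linear(unembed, no_bias, latent_next))
    pred_probs = softmax(linear(unembed, no_bias, predicted))
    return l1(predicted, latent_next) + kl_weight * kl(target_probs, pred_probs)


# Toy check with an all-zero MLP, which predicts a zero residual.
dim = 2
zeros = ([[0.0] * dim for _ in range(dim)], [0.0] * dim)
params = [zeros, zeros, zeros]
unembed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

same = next_latent_loss(params, unembed, [0.3, -0.2], [0.3, -0.2])
shifted = next_latent_loss(params, unembed, [0.3, -0.2], [1.0, 0.5])
```

With a zero residual the loss vanishes when the next latent equals the current one and is positive otherwise, which is the signal the MLP would be trained to reduce.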

catid retweeted

Jayden Teoh

@jayden_teoh_

ALT illustration of next-latent prediction vs. other predictive mechanisms

catid

@MrCatid

16h

This came out of nowhere, super smart: forbes.com/sites/karlfreund/…

Tensordyne Revives Logarithmic Math In A Bid To Cut AI Power Use

Tensordyne says logarithmic computing could reduce AI inference costs and power demands, offering an alternative to conventional chip designs.

catid retweeted

Egor Cherepanov @hirasava_ui

Jun 15

5/ On MIKASA-Robo, success rate jumps 0.42→0.84, and held-out tasks with shared memory structure go 0.07→0.23. On LIBERO it holds at 96.2% - recurrence doesn't hurt when memory isn't needed.

catid

@MrCatid

Jun 15

GitHub - catid/teamcodex: Port of teamclaude to codex

Port of teamclaude to codex. Contribute to catid/teamcodex development by creating an account on GitHub.

catid

@MrCatid

Jun 15

Seems to be about 10x cheaper to do it this way than with the API.

catid

@MrCatid

Jun 15

I miss Fable. For a few days I was getting so much more done :(

catid retweeted

Z.ai

@Zai_org

Jun 13

Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans. docs.z.ai/devpack/latest-mod… As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks. API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License. The future of AI is open, and it belongs to the people.

How to Switch Models - Overview - Z.AI DEVELOPER DOCUMENT

catid

@MrCatid

Jun 13

Who could have seen this coming oh no

jinjingliang

@JinjingLiang

Jun 13

Replying to @AnthropicAI

The state of things:

catid

@MrCatid

Jun 13

All models can be jailbroken

catid retweeted

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

The US government has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States.
