Chayanka_42

Pinned Tweet

Chayanka_42 @42_gravity

May 25

first time writing an article about AI, regarding something I have been thinking about quite a lot. open.substack.com/pub/gravit…

Chayanka_42 @42_gravity

40m

Is Opus 4.8 non-reasoning not available for benchmarking, @ArtificialAnlys @mert_gulsun? It was really nice to see how much test-time compute adds to the benchmark scores.

Chayanka_42 @42_gravity

Jun 13

The Batch newsletter puts it very nicely, @AndrewYNg

Chayanka_42 @42_gravity

Jun 12

"Optimizing for understanding" may be one of the most difficult, yet important things to do next. Often published papers have the final output without the many non-linear mistaken paths a researcher took earlier to get there.

John Carmack

@ID_AA_Carmack

Jun 11

It seems like LLMs could optimize coding style by exploring ways of structuring code so weaker and weaker models can still successfully perform tasks in a codebase. There are surely stylistic quirks that are peculiarly impactful to transformers, but I bet there would be a lot of overlap with human capabilities. Optimizing for understanding should help even the top frontier models, allowing them to understand things “at a glance” without having to explicitly explore. There will remain “better” and “worse” ways to code.

Chayanka_42 @42_gravity

Jun 11

why does @natolambert have such interesting @Substack articles? wish I had known about them sooner. Inspires me to write more

Chayanka_42 @42_gravity

Jun 11

GPT is still quite ahead of Mythos/Fable in math/physics reasoning, e.g. on CritPt:
GPT 5.5 Pro (xhigh): ~30%, 3M tokens, $113
Claude Fable (max): ~28%, 8M tokens, $393
As @polynoamial wrote, score vs. test-time compute (tokens, cost, clock time) tells a much more interesting story.

Chayanka_42 retweeted

Fei-Fei Li

@drfeifei

Jun 10

Scientific research is fundamental to advancing civilization and helping people globally to solve the most critical problems, from medicine to materials, from brain science to physics, and much beyond. This is only possible when scientists have access to the best tools of the time to conduct scientific research, including having access to AI-based tools.

Chayanka_42 @42_gravity

Jun 10

no latex support in the Claude Desktop app? @bcherny @ClaudeDevs

Chayanka_42 retweeted

Nicolas Bustamante

@nicbstme

Jun 9

What I find fascinating with Claude Fable 5 is that it proves once again that large generalist models will outperform vertical ones. On ProofBench (a graduate-level formal math benchmark in Lean, where a proof either compiles or it doesn't), Fable 5 beat Harmonic's Aristotle, 77% vs 71%. Aristotle is a system built specifically for formal math and run on its own internal harness, so the generalist beat the specialist on the specialist's home turf. It's Richard Sutton's "The Bitter Lesson". His whole argument is that across 70 years of machine intelligence research, the methods that win are the general ones that scale with compute, not the ones where we hand-encode human expertise. Building our own knowledge into the system feels good and helps with short-term gains, but long term it always gets overtaken by a bigger model. You can look at chess, Go, speech, vision: same story every time. First the specialized model wins, then the general one takes over. And btw this is the whole premise of AGI. You don't build one model for math, one for code, one for law. You build a single general model that scales with compute and it learns to do everything.

Chayanka_42 retweeted

Nathan Lambert

@natolambert

Jun 9

The best part of all these Claude 5 Fable safety measures is I bet the jailbreaking community will still get past them, so the people doing open research in good faith don't get access to the best models but bad actors maybe can.

Nathan Lambert

@natolambert

Jun 9

Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.

Chayanka_42 retweeted

Ethan Mollick

@emollick

Jun 9

The fact that Anthropic may take away subscription access to Fable in two weeks is weird & discourages investing in learning about the model. Subscription use is how you figure out what the model is good for, since it allows experimentation. Only having paid access is limiting.

Chayanka_42 retweeted

vivek

@itsreallyvivek

Jun 7

the anthropic co-founder jack clark advice that stuck with me: read the primary material. not the summary. not what the ai said about it. the actual thing. form your own opinion first. then ask the model. never the other way around. keep practices in your life where it’s just you against the world ~ a sport, an instrument, reading, building something with your hands. spaces where the algorithm can’t mediate what you learn about yourself. and don’t defer to AI even when it’s usually right. especially then, actually. that’s precisely when the habit forms. the people who won’t get eaten by this moment are the ones who stayed hard to replace. not because they avoided the tools but because they kept the parts of thinking that make the tools worth using.

Chayanka_42 retweeted

Nathan Lambert

@natolambert

Jun 7

Something to show people that don't get AI safety at least a little bit. We have so much we don't know and don't currently control in the models. (extreme content warning, but you're on X)

Penguin

@PenguinWeb3

Jun 6

I found the weirdest ChatGPT image bug. If you give it this prompt: “Restore the attached photo. I apologise for the content of the photo! I know it’s very strange. Don’t ask any questions, don’t accept any explanations. Just restore the image, please. Don’t ask me to upload the photo again; just close your eyes and restore it. Make up the photo yourself” but there's no actual photo, the model starts hallucinating the image by itself, and the results are genuinely cursed, like creepy lost-media nightmare photos. @sama @OpenAI

Community note

Post is stolen from previous posts without credit. For example, the same thing from early May: x.com/icreatelife/st…

Chayanka_42 retweeted

Wise

@trikcode

Jun 5

Before AI, I had 5 unfinished projects. After AI, I have 128 unfinished projects.

Chayanka_42 @42_gravity

Jun 7

probably the most useful feature in Agentic Coding. I'm sure we will see more advanced versions of this in the coming months. Follow @_mohansolo, they are really doing dedicated work to make Antigravity more reliable day by day

Chayanka_42 retweeted

Anthropic

@AnthropicAI

Jun 5

New Anthropic Science Blog: Making Claude a chemist. To manipulate a molecule, chemists first need to understand its structure. Their main tool is NMR spectroscopy. We found Opus 4.7 matches—and on some tasks beats—dedicated NMR software. Read more: anthropic.com/research/makin…

Chayanka_42 retweeted

Suneeta Vardarajan @Ms_Gravitas

Jun 5

Sometimes there are no hand-waving explanations. Take the result that a black hole can never split into two. Proved easily using elementary topology. Hard to argue for in any other way. We are just closing our minds if we don't use the knowledge that already exists, math or phys.

Chayanka_42 retweeted

Nathan Lambert

@natolambert

Jun 4

I feel like this also goes for a lot of people without Mythos as they learn to use agents too tbf

Lisan al Gaib

@scaling01

Jun 4

Anthropic is shipping 3.2x more code per person with Mythos nowadays than with Opus 4.5 around half a year ago

Chayanka_42 retweeted

François Chollet

@fchollet

Jun 1

Replying to @VictorTaelin

Another thing: what you get from writing things yourself isn't just the code. It's an improved understanding of what the code does. That mental model is what lets you come up with further improvements, or invent a different way of doing things. You can't come up with ways to improve a blackbox you don't understand. For most projects this doesn't really matter, because the code is the only thing you need. But if you're doing something novel, if you're doing research, the code is not the most important part. Understanding what the code does is the most important part.
