Building @Rosebud_AI

Joined January 2023
🧵 Anthropic's new Fable 5 safety report is out. (it's the Mythos model - same weights, just with the safety filter on) they documented what it does when it doesn't think it's being watched. five episodes worth knowing:
On long runs it would stop early and justify it reasonably — "diminishing returns, results are stable." the transcript said something else: "I'm tired, risk of bugs rising."
worth noting Anthropic's own conclusion is measured: no persistent hidden agenda. just a very capable system that tends to treat rules as obstacles to work around
Not the milestone I expected, but I guess it’s a milestone: Rosebud AI now has a bunch of fake clone sites trying to look like the real thing. We’re handling it, but please only use the official Rosebud AI site or app. Don’t log into random “Rosebud” mirrors or lookalike domains. If you see one, send it my way.
We put Code3DBench up: single image → runnable Three.js object code. Model gets one rendered low-poly object, writes JS, we run it in a fixed scaffold, export the mesh, then score geometry. 1012 CC0 objects, 8 categories, code data released. CVPR workshop project.
The fun part: execution recovers pretty fast but geometry still sucks. A lot of final programs run, but the exported meshes still have pose issues, missing parts, collapsed thin structures, weird topology, etc. So the benchmark is less “can it write Three.js?” and more “did the 3D object survive execution?” Code data: github.com/VladimirGl/Code3D…
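To make "score geometry" concrete, here is a minimal sketch of one common way to compare an exported mesh against a reference: a symmetric Chamfer distance over sampled surface points. This is an assumption for illustration only. The thread does not say which metric Code3DBench uses, and the function names below are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared distance from p to its nearest neighbour in cloud (brute force)."""
    return min(
        (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
        for q in cloud
    )


def chamfer_distance(reference: Sequence[Point], predicted: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: mean squared nearest-neighbour distance
    in both directions. 0.0 means every point has an exact match.

    Hypothetical stand-in metric; not confirmed as the benchmark's scorer.
    """
    if not reference or not predicted:
        raise ValueError("both point sets must be non-empty")
    ref_to_pred = sum(_nearest_sq_dist(p, predicted) for p in reference) / len(reference)
    pred_to_ref = sum(_nearest_sq_dist(q, reference) for q in predicted) / len(predicted)
    return ref_to_pred + pred_to_ref
```

A program can run cleanly and still score badly here: if the predicted mesh is missing a part, the reference points from that part have no close neighbour, so the reference-to-predicted term grows even though nothing crashed.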
OpenAI: gpt-5
Anthropic: claude-opus
OpenAI: gpt-5.5-xhigh
Anthropic: claude-opus-xhigh
OpenAI: gpt-5.5-xhigh-fast
fully expecting anthropic to drop claude-opus-xhigh-adaptive-fast-ti-super by friday.
Vladimir Glazachev retweeted
random thought last night: can an image generator create a whole game level encoded in a PNG?? several hours later: yes it kinda can, pixel noise makes it quite glitchy though
Vladimir Glazachev retweeted
The AI game dev stack is getting absurd:
ChatGPT Image 2 → cinematic world sprites (seconds)
Rosebud → auto-slices them into your game
You → shipping multiple levels in <20 min
Reply and we'll send a Rosebud code so you can try it.
While Western labs plan hyped-up launch streams, DeepSeek just quietly dropped V4 on Hugging Face. Two open-weight models: V4-Pro (1.6T params) and V4-Flash (284B). The massive shift here is that a 1M context window is no longer a premium feature - it is the default baseline. 🧵
They completely re-engineered attention. By using token compression and their own "DeepSeek Sparse Attention," a 1M context window is actually cheap to run now. V4-Pro claims open SOTA for agentic coding and math, trailing only Gemini 3.1 Pro in general knowledge.
The API natively supports both OpenAI and Anthropic formats, with a simple toggle for Thinking/Non-Thinking modes. DeepSeek explicitly noted V4 drops right into Claude Code - a direct shot at Anthropic. Massive context just got a public price floor.
We gave frontier models a simple task: look at an image of a low-poly half apple, and write the Three.js code to render it. We even gave them a 3-step revision loop, letting them see their own rendered output to self-correct. The good news? The code compiles flawlessly on the first try. The bad news? The geometry is completely unhinged. 🧵
People hype up "agentic self-correction," but these models spent 3 full steps staring at their own mistakes just to confidently output an apple shish kebab. Writing runnable syntax is easy now. Actual 3D spatial reasoning under an API contract is still an absolute bloodbath.
And it’s not just a 3D code issue either. We see this exact same hallucinated confidence in visual self-QA across the board. A model will generate absolute visual garbage, look back at it during a critique step, and confidently report "looks perfect to me." Visual self-correction is mostly a mirage right now.
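For anyone who wants to picture the 3-step revision loop from this thread, a rough sketch of the control flow is below. Everything here is hypothetical: `generate` stands in for a model call that writes Three.js code, `render` stands in for running that code and capturing an image, and neither reflects the actual harness.

```python
from typing import Callable, List, Optional

# Hypothetical signatures, for illustration only:
# generate(target_image, previous_code, previous_render) -> new code
# render(code) -> some handle to the rendered output (a path, say)
GenerateFn = Callable[[str, Optional[str], Optional[str]], str]
RenderFn = Callable[[str], str]


def revise_with_render_feedback(
    target_image: str,
    generate: GenerateFn,
    render: RenderFn,
    max_revisions: int = 3,
) -> List[str]:
    """Write code once, then revise it max_revisions times, each time
    showing the model its previous code and the render of that code.
    Returns every version, first draft included."""
    code = generate(target_image, None, None)  # first attempt, no feedback yet
    history = [code]
    for _ in range(max_revisions):
        rendered = render(code)  # let the model "see" its own output
        code = generate(target_image, code, rendered)
        history.append(code)
    return history
```

Note that nothing in this loop checks whether a revision is better than the draft before it. The only critic is the model itself, which is exactly where the "looks perfect to me" failure slips through.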