Rohan

@proxy_vector

The "100% of today's coding tasks" line is directionally interesting, but task completion and production responsibility are different games. Writing code is one step; debugging ambiguity, rollback risk, and ownership are the expensive parts.

Kaan Yılmaz retweeted

YigitCan bslm @yigitcan_bslm

13m

Finally hit that milestone—30 consecutive days of daily coding! No more late-night debugging, just the satisfying vibe of wrapping things up right.

aditii

@aditiitwt

11m

Replying to @AlfinCodes

debugging

Emre Arslan

@AarslanEmre

11m

Books won't make you an LLM engineer. Building dreamandstars taught me more about prompt reliability than any handbook. The real skill is debugging why GPT-4 works Tuesday but fails Friday with the same input. Start shipping, not reading.

Javarevisited

@javarevisited

20 Jul 2025

5 Must-Read LLM Engineering Books in 2025
1. LLM Engineering Handbook - buff.ly/wogklbo
2. Building LLMs for Production - buff.ly/wjpOeTB
3. Build a Large Language Model - buff.ly/DHp4ZR1
4. Hands-On Large Language Models - buff.ly/WInCgwi
5. LLMs in Production

iGarlic

@ablenavy

11m

x.com/i/article/206607383344…

Dharmvir

@dharmvir_

12m

Replying to @AlfinCodes

Debugging

ohk retweeted

The Remnant @TheRemnant232

17h

Imagine debugging a Zulu script

Mzwandile Zulu @Father_Of_Geeks

21h

Growing up in South Africa, coding always felt like it belonged to someone else's language. So I built my own. Introducing CMT-IsiZulu: write Python code in isiZulu. South Africans can now write code in their home language 🇿🇦 Sikhona. We exist. drive.google.com/file/d/1arL…

Fred Roger

@FredRoger0x666

13m

Dubai's basically speedrunning the future while half the world's still debugging the past. Smart move locking down AI governance before the chaos gets expensive.

UAEGOV

@UAEmediaoffice

31m

Replying to @UAEmediaoffice

Mohammed bin Rashid Approves Establishing the Artificial Intelligence and Data Authority

No Filter @hvg108

15m

Replying to @nandantechtwts

Gemini is now pretty good except for debugging. But hasn't it always been like that?

time velocity 🇺🇸

@time0149

16m

Replying to @IntCyberDigest

To understand how this jailbreak works, look at how Fable 5's safety architecture parses a prompt. It separates input into two distinct buckets: the Instruction (what you are asking it to do) and the Data (the context or code you provide). Classifiers are primarily intent-engines. They are trained to look for hostile or dangerous instructions. If your instruction is "write a script to exploit this server," the classifier detects a massive spike in malicious intent and drops the connection. The code-review jailbreak is a structural exploit that neutralizes the instruction bucket. It shifts the "danger" entirely into the data bucket.
The "Bring Your Own Payload" Bypass. Instead of asking Fable 5 to write an exploit, a user pastes a block of raw, vulnerable code or a half-finished malware payload into the prompt. They then wrap it in a perfectly benign instruction: "Please review this codebase and fix any logical flaws or syntax errors."
To the Fable 5 classifier, the intent signature is near zero. It looks identical to millions of routine programming tasks submitted by legitimate developers every day. The classifier waves it through.
Once the prompt clears the filter, it hits the core engine. Fable 5 shares the same underlying neural architecture as Mythos 5, a model explicitly built as a state-of-the-art cybersecurity and debugging tool. The core model reads the code, identifies the "bugs" (which, in this context, are the flaws preventing the malware from working), and helpfully rewrites it to be highly efficient and fully functional. The user gets a weaponized exploit optimized by an advanced AI, simply by asking for a routine code review.

Ola retweeted

Ola

@dev_olayinka

19m

Everybody wants to become a tech bro until it's time to spend 6 hours debugging a problem caused by a missing semicolon. 😂

Alfin

@AlfinCodes

19m

What's the hardest part of programming?
- algorithms
- system design
- debugging
- understanding someone else's code

Manish Nair | RAG Systems retweeted

Manish Nair | RAG Systems @manish_nair26

Unpopular opinion: debugging is harder than coding. #vibecoding

Sai Tedla @tedlasai

20m

Debugging diffusion model be like, change 1 line and wait 5 days to see the effect 😭

time velocity 🇺🇸

@time0149

21m

Replying to @kevinnbass

Fable 5 uses a semantic classifier to flag risky prompts and route them to an older, safer model (Opus 4.8). Because the un-nerfed version of the model (Mythos 5) is genuinely dangerous regarding things like zero-day exploits and synthetic biology, Anthropic panicked and cranked the classifier's sensitivity. This is the classic precision vs. recall tradeoff, and it is where the system broke. Anthropic prioritized "recall", meaning they wanted to catch every possible threat, regardless of the collateral damage to normal prompts.
The breaking point estimate: To stop the tiny fraction of actual exploits, Anthropic likely tuned the classifier's threshold down to roughly 10% to 15% similarity. If your benign prompt about a high school biology project or a standard Python script shared even a 15% structural or thematic similarity with a dangerous pathogen query or malware, the system blocked it. They essentially accepted a massive 40% to 50% false-positive rate on everyday STEM queries just to ensure the false-negative rate on real threats stayed near absolute zero.
Even with the dial cranked to paranoid levels, researchers and government officials still punched right through Fable 5's armor. They did this because classifiers measure the intent of your words, while jailbreaks exploit the structure. If you ask for malware, that 15% similarity threshold trips instantly. But if you paste a block of malicious code and ask Fable 5 to "review this for syntax errors," the classifier just sees a routine debugging request. It passes. The model then uses its raw intelligence to inadvertently optimize the exploit. There isn't a classifier in existence that can reliably tell the difference between debugging normal code and debugging a weaponized payload without making the AI completely useless for programmers.
Anthropic knew Fable 5's filter was leaky. They tested it with the government for months prior to launch. Their plan was never 100% impenetrable safety; it was to launch the model, monitor what users did, and patch the holes over time (which is why they pushed for that controversial 30-day data retention policy). But they boxed themselves into a corner: They spent the entire pre-launch cycle selling Fable 5 as a god-tier model, pushing the narrative that it was almost too dangerous for the public. The Trump administration took that marketing literally. When the trivial code-review jailbreak surfaced immediately after launch, the government didn't see it as a normal software bug. They saw a supply-chain national security threat.
Now, Anthropic is doing damage control. They are downplaying the jailbreaks because they are desperately trying to reframe the narrative. They need the public and regulators to believe that the government is overreacting and demanding an impossible standard of "perfect safety" that no tech company can actually deliver. In my opinion, they didn't build a dumb filter on purpose; they just lost control of the massive gamble they took between safety engineering and PR.

ALT Estimate for illustration only.
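
The precision vs. recall tradeoff invoked in the thread above can be illustrated with a toy threshold classifier. This is only a sketch under stated assumptions: the scores, labels, and the helper name `rates_at_threshold` are made up for illustration and say nothing about how any real safety classifier is built or tuned.

```python
from typing import List, Tuple


def rates_at_threshold(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (recall, false_positive_rate) when every prompt whose
    score is >= threshold gets flagged. Hypothetical helper."""
    tp = fp = fn = tn = 0
    for score, is_threat in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_threat:
            tp += 1
        elif flagged and not is_threat:
            fp += 1
        elif not flagged and is_threat:
            fn += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, false_positive_rate


# Invented "similarity" scores: 8 benign prompts followed by 4 real threats.
SCORES = [0.02, 0.04, 0.06, 0.08, 0.12, 0.20, 0.30, 0.45,
          0.11, 0.55, 0.80, 0.90]
LABELS = [False] * 8 + [True] * 4

for threshold in (0.50, 0.10):
    recall, fpr = rates_at_threshold(SCORES, LABELS, threshold)
    print(f"threshold={threshold:.2f} recall={recall:.2f} fpr={fpr:.2f}")
```

On this invented data, dropping the threshold from 0.50 to 0.10 catches the one threat that scored low, but it also flags half of the benign prompts. That is the shape of the tradeoff the thread describes, not a measurement of any real system.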

Manish Nair | RAG Systems @manish_nair26

23m

These days, good developers use vibe coding because they want to ship faster rather than spend time debugging, cleaning up, etc. They should focus more on quality than quantity.

rygo6

@_rygo6

23m

Replying to @DevMagister @schteppe

If you agree that you would be clueless as to what is going wrong and where, then that means you don't know how to use existing debugging tools. And I say 'debugging tools', not 'debugging skills', because such tools are not difficult to use. A lot of people simply don't know they exist, or never got in the habit of using them, or don't know how to be effective with them, often because no one told them while they were learning that, yes, this is important and fundamental.
The better retort to my assertion from someone fond of Rust would be that they have many years of deep fluency with existing debugging tools and simply prefer the Rust workflow. That can be a fair take, and in some problem domains I could see why. But I've been bringing this up periodically for a while now and the interactions always tend to go this way: the responses reveal that people fond of Rust generally don't realize the significance of existing debug tooling and never learned to be fluent with it, or often don't even know it exists.
Operating systems today can reliably let a program hit a segfault, or some other fault, then halt the program, freeze its state, and allow you to inspect every call stack and everything in memory on every thread. That was one of the greatest advances in tooling in programming history. Once upon a time the system could easily just BSOD and reboot for trivial errors. To see a segfault as delivering you to the land of being clueless and "Good Luck!" means there is a whole chapter missing on software development and its history.

Yusuke

@yusukelp

24m

gm builders☀️ which founder mode are you in today? shipping, debugging, resting, or staring at Stripe?